# Vector Spaces and Matrices

We consider only real matrices. Although our treatment is self-contained, the reader is assumed to be familiar with basic operations on matrices. We also assume knowledge of elementary properties of the determinant.

An $m \times n$ matrix consists of $m n$ real numbers arranged in $m$ rows and $n$ columns. We denote matrices by bold letters. The entry in row $i$ and column $j$ of the matrix $\mathbf{A}$ is denoted by $a_{i j}$. An $m \times 1$ matrix is called a column vector of order $m$; similarly, a $1 \times n$ matrix is a row vector of order $n$. An $m \times n$ matrix is called a square matrix if $m=n$.

If $\mathbf{A}, \mathbf{B}$ are $m \times n$ matrices, then $\mathbf{A}+\mathbf{B}$ is defined as the $m \times n$ matrix with $(i, j)$-entry $a_{i j}+b_{i j}$. If $\mathbf{A}$ is a matrix and $c$ is a real number, then $c \mathbf{A}$ is obtained by multiplying each element of $\mathbf{A}$ by $c$.

If $\mathbf{A}$ is $m \times p$ and $\mathbf{B}$ is $p \times n$, then their product $\mathbf{C}=\mathbf{A} \mathbf{B}$ is an $m \times n$ matrix with $(i, j)$-entry given by
$$c_{i j}=\sum_{k=1}^{p} a_{i k} b_{k j}$$
The following properties hold:
$$\begin{aligned} (\mathbf{A} \mathbf{B}) \mathbf{C} &=\mathbf{A}(\mathbf{B C}) \\ \mathbf{A}(\mathbf{B}+\mathbf{C}) &=\mathbf{A} \mathbf{B}+\mathbf{A C} \\ (\mathbf{A}+\mathbf{B}) \mathbf{C} &=\mathbf{A} \mathbf{C}+\mathbf{B C} \end{aligned}$$
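As a quick numerical check of the product formula and the three properties above, the following Python sketch may help. The helper names `matmul` and `add` and the sample matrices are our own illustrative choices, not part of the text; matrices are represented as lists of rows with integer entries so that the equality tests are exact.

```python
def matmul(A, B):
    """Product of an m x p matrix A and a p x n matrix B (lists of rows)."""
    m, p, n = len(A), len(B), len(B[0])
    if any(len(row) != p for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    # c_ij = sum over k of a_ik * b_kj
    return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(n)]
            for i in range(m)]


def add(A, B):
    """Entrywise sum of two matrices of the same order."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


A = [[1, 2], [3, 4], [5, 6]]     # 3 x 2
B = [[1, 0, 2], [-1, 3, 1]]      # 2 x 3
C = [[2, 1], [0, 1], [4, -2]]    # 3 x 2
D = [[0, 1, 1], [2, -1, 3]]      # 2 x 3, same order as B

# (AB)C = A(BC)
assoc_ok = matmul(matmul(A, B), C) == matmul(A, matmul(B, C))
# A(B + D) = AB + AD
left_dist_ok = matmul(A, add(B, D)) == add(matmul(A, B), matmul(A, D))
# (B + D)C = BC + DC
right_dist_ok = matmul(add(B, D), C) == add(matmul(B, C), matmul(D, C))

print(matmul(A, B))   # [[-1, 6, 4], [-1, 12, 10], [-1, 18, 16]]
print(assoc_ok, left_dist_ok, right_dist_ok)
```

A check on one set of matrices is of course no proof; the properties follow from the defining sum by rearranging finite sums.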

The transpose of the $m \times n$ matrix $\mathbf{A}$, denoted by $\mathbf{A}^{\prime}$, is the $n \times m$ matrix whose $(i, j)$-entry is $a_{j i}$. It can be verified that $\left(\mathbf{A}^{\prime}\right)^{\prime}=\mathbf{A}$, $(\mathbf{A}+\mathbf{B})^{\prime}=\mathbf{A}^{\prime}+\mathbf{B}^{\prime}$, and $(\mathbf{A} \mathbf{B})^{\prime}=\mathbf{B}^{\prime} \mathbf{A}^{\prime}$.
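The transpose identities can be checked on a small example. The sketch below assumes the same list-of-rows representation; `transpose` and `matmul` are hypothetical helper names introduced only for illustration.

```python
def transpose(A):
    """Return the n x m transpose of an m x n matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Product of an m x p matrix A and a p x n matrix B."""
    p = len(B)
    return [[sum(row[k] * B[k][j] for k in range(p)) for j in range(len(B[0]))]
            for row in A]


A = [[1, 2, 3], [4, 5, 6]]       # 2 x 3
B = [[1, 0], [2, 1], [0, -1]]    # 3 x 2

print(transpose(A))   # [[1, 4], [2, 5], [3, 6]]

# (A')' = A
double_ok = transpose(transpose(A)) == A
# (AB)' = B'A'; note the reversed order of the factors
product_ok = transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))
print(double_ok, product_ok)
```

The reversal of order in $(\mathbf{A B})^{\prime}=\mathbf{B}^{\prime} \mathbf{A}^{\prime}$ is forced by the orders alone: here $\mathbf{A}^{\prime} \mathbf{B}^{\prime}$ would be $3 \times 3$, while $(\mathbf{A B})^{\prime}$ is $2 \times 2$.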

A good understanding of the definition of matrix multiplication is quite useful. We note some simple facts that are often required. We assume that all products occurring here are defined in the sense that the orders of the matrices make them compatible for multiplication.
(i) The $j$ th column of $\mathbf{A B}$ is the same as $\mathbf{A}$ multiplied by the $j$ th column of $\mathbf{B}$.
(ii) The $i$ th row of $\mathbf{A B}$ is the same as the $i$ th row of $\mathbf{A}$ multiplied by $\mathbf{B}$.
(iii) The $(i, j)$-entry of $\mathbf{A B C}$ is obtained as
$$\left(x_{1}, \ldots, x_{p}\right) \mathbf{B}\left[\begin{array}{c} y_{1} \\ \vdots \\ y_{q} \end{array}\right]$$
where $\left(x_{1}, \ldots, x_{p}\right)$ is the $i$ th row of $\mathbf{A}$ and $\left(y_{1}, \ldots, y_{q}\right)^{\prime}$ is the $j$ th column of $\mathbf{C}$.
(iv) If $\mathbf{A}=\left[\mathbf{a}_{1}, \ldots, \mathbf{a}_{n}\right]$ and
$$\mathbf{B}=\left[\begin{array}{c} \mathbf{b}_{1}^{\prime} \\ \vdots \\ \mathbf{b}_{n}^{\prime} \end{array}\right]$$
where $\mathbf{a}_{i}$ denote columns of $\mathbf{A}$ and $\mathbf{b}_{j}^{\prime}$ denote rows of $\mathbf{B}$, then
$$\mathbf{A} \mathbf{B}=\mathbf{a}_{1} \mathbf{b}_{1}^{\prime}+\cdots+\mathbf{a}_{n} \mathbf{b}_{n}^{\prime}$$
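Facts (i) to (iv) can each be verified numerically. The following is a minimal sketch under our own assumptions: the helpers `matmul` and `add`, the sample matrices, and the chosen indices `i` and `j` (0-based, as is usual in Python) are illustrative only.

```python
def matmul(A, B):
    """Product of an m x p matrix A and a p x n matrix B (lists of rows)."""
    p = len(B)
    return [[sum(row[k] * B[k][j] for k in range(p)) for j in range(len(B[0]))]
            for row in A]


def add(A, B):
    """Entrywise sum of two matrices of the same order."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


A = [[1, 2, 0], [3, -1, 4]]      # 2 x 3
B = [[2, 1], [0, 1], [5, -2]]    # 3 x 2
C = [[1, 3], [2, 0]]             # 2 x 2
AB = matmul(A, B)
i, j = 0, 1                      # 0-based row and column indices

# (i) column j of AB equals A times column j of B
col_j_of_B = [[row[j]] for row in B]
fact_i = [[row[j]] for row in AB] == matmul(A, col_j_of_B)

# (ii) row i of AB equals (row i of A) times B
fact_ii = [AB[i]] == matmul([A[i]], B)

# (iii) the (i, j)-entry of ABC equals (row i of A) B (column j of C)
col_j_of_C = [[row[j]] for row in C]
fact_iii = matmul(AB, C)[i][j] == matmul(matmul([A[i]], B), col_j_of_C)[0][0]

# (iv) AB equals the sum over k of (column k of A)(row k of B)
outer_sum = [[0] * len(B[0]) for _ in A]
for k in range(len(B)):
    col_k_of_A = [[row[k]] for row in A]   # m x 1 column vector
    row_k_of_B = [B[k]]                    # 1 x q row vector
    outer_sum = add(outer_sum, matmul(col_k_of_A, row_k_of_B))
fact_iv = outer_sum == AB

print(AB)   # [[2, 3], [26, -6]]
print(fact_i, fact_ii, fact_iii, fact_iv)
```

In (iv) each term $\mathbf{a}_{k} \mathbf{b}_{k}^{\prime}$ is a full matrix of the same order as $\mathbf{A B}$, so the product is written as a sum of $n$ such matrices.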

A diagonal matrix is a square matrix $\mathbf{A}$ such that $a_{i j}=0, i \neq j$. We denote the diagonal matrix
$$\left[\begin{array}{cccc} \lambda_{1} & 0 & \cdots & 0 \\ 0 & \lambda_{2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_{n} \end{array}\right]$$
by $\operatorname{diag}\left(\lambda_{1}, \ldots, \lambda_{n}\right)$. When $\lambda_{i}=1$ for all $i$, this matrix reduces to the identity matrix of order $n$, which we denote by $\mathbf{I}_{n}$, or often simply by $\mathbf{I}$ if the order is clear from the context. Observe that for any square matrix $\mathbf{A}$, we have $\mathbf{A I}=\mathbf{I A}=\mathbf{A}$. The entries $a_{11}, \ldots, a_{n n}$ are said to constitute the (main) diagonal entries of $\mathbf{A}$. The trace of $\mathbf{A}$ is defined as
$$\operatorname{trace} \mathbf{A}=a_{11}+\cdots+a_{n n}$$
It follows from this definition that if $\mathbf{A}, \mathbf{B}$ are matrices such that both $\mathbf{A B}$ and $\mathbf{B A}$ are defined, then
$$\operatorname{trace} \mathbf{A} \mathbf{B}=\operatorname{trace} \mathbf{B} \mathbf{A}$$
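The identity holds because $\operatorname{trace} \mathbf{A B}=\sum_{i} \sum_{k} a_{i k} b_{k i}$, and interchanging the order of summation gives $\operatorname{trace} \mathbf{B A}$. Note that $\mathbf{A B}$ and $\mathbf{B A}$ need not have the same order. The sketch below illustrates this with a $2 \times 3$ and a $3 \times 2$ matrix; the helper names `trace` and `matmul` and the sample entries are our own assumptions.

```python
def matmul(A, B):
    """Product of an m x p matrix A and a p x n matrix B (lists of rows)."""
    p = len(B)
    return [[sum(row[k] * B[k][j] for k in range(p)) for j in range(len(B[0]))]
            for row in A]


def trace(A):
    """Sum of the main diagonal entries of a square matrix."""
    if any(len(row) != len(A) for row in A):
        raise ValueError("trace is defined only for square matrices")
    return sum(A[i][i] for i in range(len(A)))


A = [[1, 2, 0], [3, -1, 4]]      # 2 x 3
B = [[2, 1], [0, 1], [5, -2]]    # 3 x 2

AB = matmul(A, B)   # 2 x 2
BA = matmul(B, A)   # 3 x 3

print(trace(AB), trace(BA))   # -4 -4
```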

The determinant of an $n \times n$ matrix $\mathbf{A}$, denoted by $|\mathbf{A}|$, is defined as
$$|\mathbf{A}|=\sum_{\sigma} \epsilon(\sigma) a_{1 \sigma(1)} \cdots a_{n \sigma(n)}$$
where the summation is over all permutations $\{\sigma(1), \ldots, \sigma(n)\}$ of $\{1, \ldots, n\}$ and $\epsilon(\sigma)$ is 1 or $-1$ according as $\sigma$ is even or odd.
We state some basic properties of the determinant without proof:
(i) The determinant can be evaluated by expansion along a row or a column. Thus, expanding along the first row,
$$|\mathbf{A}|=\sum_{j=1}^{n}(-1)^{1+j} a_{1 j}\left|\mathbf{A}_{1 j}\right|$$
where $\mathbf{A}_{1 j}$ is the submatrix obtained by deleting the first row and the $j$ th column of $\mathbf{A}$. We also note that
$$\sum_{j=1}^{n}(-1)^{1+j} a_{i j}\left|\mathbf{A}_{1 j}\right|=0, \quad i=2, \ldots, n$$

(ii) The determinant changes sign if two rows (or columns) are interchanged.
(iii) The determinant is unchanged if a constant multiple of one row is added to another row. A similar property is true for columns.
(iv) The determinant is a linear function of any column (row) when all the other columns (rows) are held fixed.
(v) $|\mathbf{A} \mathbf{B}|=|\mathbf{A}||\mathbf{B}|$.

The matrix $\mathbf{A}$ is upper triangular if $a_{i j}=0, i>j$. The transpose of an upper triangular matrix is lower triangular.
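To make the determinant definition and property (i) concrete, the following sketch computes $|\mathbf{A}|$ in two ways: directly from the permutation sum, and by expansion along the first row. The function names (`sign`, `det_permutation`, `minor`, `det_cofactor`) and the sample matrix are hypothetical, and the parity of a permutation is obtained here by counting inversions, which is one standard way to determine whether it is even or odd. Indices in the code are 0-based, so $(-1)^{1+j}$ becomes `(-1) ** j`.

```python
from itertools import permutations


def sign(perm):
    """Return 1 for an even permutation and -1 for an odd one (inversion count)."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def det_permutation(A):
    """Determinant as the signed sum over all permutations sigma."""
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        term = sign(sigma)
        for i in range(n):
            term *= A[i][sigma[i]]   # a_{i, sigma(i)}
        total += term
    return total


def minor(A, i, j):
    """Submatrix obtained by deleting row i and column j."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]


def det_cofactor(A):
    """Determinant by expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det_cofactor(minor(A, 0, j))
               for j in range(n))


A = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]

print(det_permutation(A), det_cofactor(A))   # 39 39

# Entries of another row paired with first-row cofactors sum to zero.
alien = [sum((-1) ** j * A[i][j] * det_cofactor(minor(A, 0, j)) for j in range(3))
         for i in (1, 2)]
print(alien)   # [0, 0]

# Property (ii): interchanging two rows changes the sign.
swapped = [A[1], A[0], A[2]]
print(det_cofactor(swapped))   # -39
```

The permutation sum has $n!$ terms, so it is useful for checking small cases rather than for computation with large matrices.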

It will often be necessary to work with matrices in partitioned form. For example, let
$$\mathbf{A}=\left[\begin{array}{ll} \mathbf{A}_{11} & \mathbf{A}_{12} \\ \mathbf{A}_{21} & \mathbf{A}_{22} \end{array}\right], \quad \mathbf{B}=\left[\begin{array}{ll} \mathbf{B}_{11} & \mathbf{B}_{12} \\ \mathbf{B}_{21} & \mathbf{B}_{22} \end{array}\right]$$
be two matrices where each $\mathbf{A}_{i j}, \mathbf{B}_{i j}$ is itself a matrix. If compatibility for matrix multiplication is assumed throughout (in which case, we say that the matrices are partitioned conformally), then we can write
$$\mathbf{A} \mathbf{B}=\left[\begin{array}{ll} \mathbf{A}_{11} \mathbf{B}_{11}+\mathbf{A}_{12} \mathbf{B}_{21} & \mathbf{A}_{11} \mathbf{B}_{12}+\mathbf{A}_{12} \mathbf{B}_{22} \\ \mathbf{A}_{21} \mathbf{B}_{11}+\mathbf{A}_{22} \mathbf{B}_{21} & \mathbf{A}_{21} \mathbf{B}_{12}+\mathbf{A}_{22} \mathbf{B}_{22} \end{array}\right]$$
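The block formula can be checked against the ordinary product. In the sketch below, the helpers `block` and `assemble`, the sample matrices, and the split points are all our own illustrative assumptions. The essential requirement is that the column split of $\mathbf{A}$ (here after column 2) matches the row split of $\mathbf{B}$ (after row 2), which is what conformal partitioning means; the row split of $\mathbf{A}$ and the column split of $\mathbf{B}$ may be chosen freely.

```python
def matmul(A, B):
    """Product of an m x p matrix A and a p x n matrix B (lists of rows)."""
    p = len(B)
    return [[sum(row[k] * B[k][j] for k in range(p)) for j in range(len(B[0]))]
            for row in A]


def add(A, B):
    """Entrywise sum of two matrices of the same order."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def block(A, rows, cols):
    """Submatrix with row indices rows[0]..rows[1]-1 and column indices cols[0]..cols[1]-1."""
    return [row[cols[0]:cols[1]] for row in A[rows[0]:rows[1]]]


def assemble(M11, M12, M21, M22):
    """Glue four blocks back into a single matrix."""
    top = [r1 + r2 for r1, r2 in zip(M11, M12)]
    bottom = [r1 + r2 for r1, r2 in zip(M21, M22)]
    return top + bottom


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[1, 0, 2], [0, 1, -1], [3, 1, 0]]

# A: rows split 1 + 2, columns split 2 + 1
A11, A12 = block(A, (0, 1), (0, 2)), block(A, (0, 1), (2, 3))
A21, A22 = block(A, (1, 3), (0, 2)), block(A, (1, 3), (2, 3))
# B: rows split 2 + 1 (matching the columns of A), columns split 1 + 2
B11, B12 = block(B, (0, 2), (0, 1)), block(B, (0, 2), (1, 3))
B21, B22 = block(B, (2, 3), (0, 1)), block(B, (2, 3), (1, 3))

blockwise = assemble(
    add(matmul(A11, B11), matmul(A12, B21)),
    add(matmul(A11, B12), matmul(A12, B22)),
    add(matmul(A21, B11), matmul(A22, B21)),
    add(matmul(A21, B12), matmul(A22, B22)),
)

print(blockwise)                   # [[10, 5, 0], [22, 11, 3], [34, 17, 6]]
print(blockwise == matmul(A, B))   # True
```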