## Generalized Inverses

Let $\mathbf{A}$ be an $m \times n$ matrix. A matrix $\mathbf{G}$ of order $n \times m$ is said to be a generalized inverse (or a g-inverse) of $\mathbf{A}$ if $\mathbf{A G A}=\mathbf{A}$.

If $\mathbf{A}$ is square and nonsingular, then $\mathbf{A}^{-1}$ is the unique g-inverse of $\mathbf{A}$. Otherwise, $\mathbf{A}$ has infinitely many g-inverses, as we will see shortly.

1.1. Let $\mathbf{A}, \mathbf{G}$ be matrices of order $m \times n$ and $n \times m$ respectively. Then the following conditions are equivalent:
(i) $\mathbf{G}$ is a g-inverse of $\mathbf{A}$.
(ii) For any $\mathbf{y} \in \mathcal{C}(\mathbf{A}), \mathbf{x}=\mathbf{G} \mathbf{y}$ is a solution of $\mathbf{A} \mathbf{x}=\mathbf{y}$.

Proof. (i) $\Rightarrow$ (ii). Any $\mathbf{y} \in \mathcal{C}(\mathbf{A})$ is of the form $\mathbf{y}=\mathbf{A z}$ for some $\mathbf{z}$. Then $\mathbf{A}(\mathbf{G y})=\mathbf{A G A z}=\mathbf{A z}=\mathbf{y}$.
(ii) $\Rightarrow$ (i). Since $\mathbf{A G y}=\mathbf{y}$ for any $\mathbf{y} \in \mathcal{C}(\mathbf{A})$, we have $\mathbf{A G A z}=\mathbf{A z}$ for all $\mathbf{z}$. In particular, if we let $\mathbf{z}$ be the $i$th column of the identity matrix, then we see that the $i$th columns of $\mathbf{A G A}$ and $\mathbf{A}$ are identical. Therefore, $\mathbf{A G A}=\mathbf{A}$.
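The equivalence above is easy to check numerically. A minimal sketch (numpy assumed; the text gives no code, and `np.linalg.pinv` returns the Moore-Penrose inverse, which is in particular one g-inverse):

```python
import numpy as np

# A rank-deficient 3x2 matrix (rank 1), so it has no ordinary inverse.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])

# The Moore-Penrose pseudoinverse is one particular g-inverse of A.
G = np.linalg.pinv(A)

# Defining property of a g-inverse: A G A = A.
assert np.allclose(A @ G @ A, A)

# For any y in the column space of A, x = G y solves A x = y.
y = A @ np.array([1.0, -1.0])   # y = A z lies in C(A)
x = G @ y
assert np.allclose(A @ x, y)
```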

Let $\mathbf{A}=\mathbf{B C}$ be a rank factorization. We have seen that $\mathbf{B}$ admits a left inverse $\mathbf{B}_{\ell}^{-}$ and $\mathbf{C}$ admits a right inverse $\mathbf{C}_{r}^{-}$. Then $\mathbf{G}=\mathbf{C}_{r}^{-} \mathbf{B}_{\ell}^{-}$ is a g-inverse of $\mathbf{A}$, since
$$\mathbf{A G A}=\mathbf{B C}\left(\mathbf{C}_{r}^{-} \mathbf{B}_{\ell}^{-}\right) \mathbf{B C}=\mathbf{B C}=\mathbf{A}.$$
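This construction can be carried out concretely. A numpy sketch, using the standard formulas $(\mathbf{B}^{\prime}\mathbf{B})^{-1}\mathbf{B}^{\prime}$ for a left inverse and $\mathbf{C}^{\prime}(\mathbf{C}\mathbf{C}^{\prime})^{-1}$ for a right inverse (the particular matrices are illustrative choices, not from the text):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])            # rank 1

# A hand-picked rank factorization A = B C with B of full column rank
# and C of full row rank.
B = np.array([[1.0], [2.0], [3.0]])
C = np.array([[1.0, 2.0]])
assert np.allclose(B @ C, A)

B_left = np.linalg.inv(B.T @ B) @ B.T   # a left inverse: B_left @ B = I
C_right = C.T @ np.linalg.inv(C @ C.T)  # a right inverse: C @ C_right = I

G = C_right @ B_left
assert np.allclose(A @ G @ A, A)        # G is a g-inverse of A
```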

Alternatively, if $\mathbf{A}$ has rank $r$, then by 7.3 of Chapter 1 there exist nonsingular matrices $\mathbf{P}, \mathbf{Q}$ such that
$$\mathbf{A}=\mathbf{P}\left[\begin{array}{cc} \mathbf{I}_{r} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{array}\right] \mathbf{Q}.$$
It can be verified that for *any* $\mathbf{U}, \mathbf{V}, \mathbf{W}$ of appropriate dimensions,
$$\left[\begin{array}{cc} \mathbf{I}_{r} & \mathbf{U} \\ \mathbf{V} & \mathbf{W} \end{array}\right]$$
is a g-inverse of
$$\left[\begin{array}{cc} \mathbf{I}_{r} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{array}\right].$$
Then
$$\mathbf{G}=\mathbf{Q}^{-1}\left[\begin{array}{cc} \mathbf{I}_{r} & \mathbf{U} \\ \mathbf{V} & \mathbf{W} \end{array}\right] \mathbf{P}^{-1}$$
is a g-inverse of $\mathbf{A}$. This also shows that any matrix that is not square and nonsingular admits infinitely many g-inverses.

Another method that is particularly suitable for computing a g-inverse is as follows. Let $\mathbf{A}$ be of rank $r$. Choose any $r \times r$ nonsingular submatrix of $\mathbf{A}$. For convenience let us assume
$$\mathbf{A}=\left[\begin{array}{ll} \mathbf{A}_{11} & \mathbf{A}_{12} \\ \mathbf{A}_{21} & \mathbf{A}_{22} \end{array}\right],$$
where $\mathbf{A}_{11}$ is an $r \times r$ nonsingular matrix.

## Eigenvalues and the Spectral Theorem

Let $\mathbf{A}$ be an $n \times n$ matrix. The determinant $|\mathbf{A}-\lambda \mathbf{I}|$ is a polynomial in the (complex) variable $\lambda$ of degree $n$ and is called the characteristic polynomial of $\mathbf{A}$. The equation
$$|\mathbf{A}-\lambda \mathbf{I}|=0$$
is called the characteristic equation of $\mathbf{A}$. By the fundamental theorem of algebra, the equation has $n$ roots, and these roots are called the eigenvalues of $\mathbf{A}$.

The eigenvalues may not all be distinct. The number of times an eigenvalue occurs as a root of the characteristic equation is called the algebraic multiplicity of the eigenvalue.
We factor the characteristic polynomial as
$$|\mathbf{A}-\lambda \mathbf{I}|=\left(\lambda_{1}-\lambda\right) \cdots\left(\lambda_{n}-\lambda\right),$$
where $\lambda_{1}, \ldots, \lambda_{n}$ are the eigenvalues of $\mathbf{A}$.
Setting $\lambda=0$ in this factorization, we see that $|\mathbf{A}|$ is just the product of the eigenvalues of $\mathbf{A}$. Similarly, by equating the coefficients of $\lambda^{n-1}$ on both sides of the factorization, we see that the trace of $\mathbf{A}$ equals the sum of the eigenvalues.
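Both identities are easy to verify numerically. A minimal numpy check (the random matrix is an arbitrary illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

eig = np.linalg.eigvals(A)     # the n (possibly complex) eigenvalues

# |A| is the product of the eigenvalues; trace(A) is their sum.
# Both are real for a real matrix, so we compare real parts.
assert np.isclose(np.linalg.det(A), np.prod(eig).real)
assert np.isclose(np.trace(A), np.sum(eig).real)
```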
A principal submatrix of a square matrix is a submatrix formed by a set of rows and the corresponding set of columns. A principal minor of $\mathbf{A}$ is the determinant of a principal submatrix.
A square matrix $\mathbf{A}$ is called symmetric if $\mathbf{A}=\mathbf{A}^{\prime}$. An $n \times n$ matrix $\mathbf{A}$ is said to be positive definite if it is symmetric and if for any nonzero vector $\mathbf{x}$, $\mathbf{x}^{\prime} \mathbf{A} \mathbf{x}>0$.
The identity matrix is clearly positive definite and so is a diagonal matrix with only positive entries along the diagonal.

8.1. If $\mathbf{A}$ is positive definite, then it is nonsingular.

Proof. If $\mathbf{A} \mathbf{x}=\mathbf{0}$, then $\mathbf{x}^{\prime} \mathbf{A} \mathbf{x}=0$, and since $\mathbf{A}$ is positive definite, $\mathbf{x}=\mathbf{0}$. Therefore, $\mathbf{A}$ must be nonsingular.

The next result is obvious from the definition: if $\mathbf{A}, \mathbf{B}$ are positive definite and $\alpha, \beta$ are nonnegative numbers, not both zero, then $\alpha \mathbf{A}+\beta \mathbf{B}$ is positive definite.

If $\mathbf{A}$ is positive definite, then $|\mathbf{A}|>0$.

Proof. For $0 \leq \alpha \leq 1$, define
$$f(\alpha)=|\alpha \mathbf{A}+(1-\alpha) \mathbf{I}|.$$
By the preceding observation, $\alpha \mathbf{A}+(1-\alpha) \mathbf{I}$ is positive definite for $0 \leq \alpha \leq 1$, and therefore by 8.1, $f(\alpha) \neq 0$ for $0 \leq \alpha \leq 1$. Clearly, $f(0)=1$, and since $f$ is continuous and never vanishes on $[0,1]$, it cannot change sign there; hence $f(1)=|\mathbf{A}|>0$.
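A numerical sketch of this argument (numpy assumed; the positive definite matrix is built as $\mathbf{M}^{\prime}\mathbf{M}+\mathbf{I}$, an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
A = M.T @ M + np.eye(4)   # M'M is positive semidefinite, so A is positive definite

# f(alpha) = |alpha*A + (1-alpha)*I| never vanishes on [0, 1] ...
alphas = np.linspace(0.0, 1.0, 101)
f = [np.linalg.det(a * A + (1 - a) * np.eye(4)) for a in alphas]
assert all(v > 0 for v in f)

# ... and f(0) = 1 with f continuous forces f(1) = |A| > 0.
assert np.linalg.det(A) > 0
```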

If $\mathbf{A}$ is positive definite, then any principal submatrix of $\mathbf{A}$ is positive definite.

Proof. Since $\mathbf{A}$ is positive definite, $\mathbf{x}^{\prime} \mathbf{A} \mathbf{x}>0$ for all $\mathbf{x} \neq \mathbf{0}$. Apply this condition to the set of vectors that have zeros in coordinates $j_{1}, \ldots, j_{s}$. For such a vector $\mathbf{x}$, $\mathbf{x}^{\prime} \mathbf{A} \mathbf{x}$ reduces to an expression of the type $\mathbf{y}^{\prime} \mathbf{B y}$, where $\mathbf{B}$ is the principal submatrix formed by deleting rows and columns $j_{1}, \ldots, j_{s}$ from $\mathbf{A}$. It follows that $\mathbf{B}$, and similarly any principal submatrix of $\mathbf{A}$, is positive definite.

A symmetric $n \times n$ matrix $\mathbf{A}$ is said to be positive semidefinite if $\mathbf{x}^{\prime} \mathbf{A} \mathbf{x} \geq 0$ for all $\mathbf{x} \in R^{n}$.

If $\mathbf{A}$ is a symmetric matrix, then the eigenvalues of $\mathbf{A}$ are all real.

Proof. Suppose $\mu$ is an eigenvalue of $\mathbf{A}$ and let $\mu=\alpha+i \beta$, where $\alpha, \beta$ are real and $i=\sqrt{-1}$. Since $|\mathbf{A}-\mu \mathbf{I}|=0$, we have
$$|(\mathbf{A}-\alpha \mathbf{I})-i \beta \mathbf{I}|=0.$$
Taking the complex conjugate of the above determinant and multiplying the two, we get
$$|(\mathbf{A}-\alpha \mathbf{I})-i \beta \mathbf{I}|\,|(\mathbf{A}-\alpha \mathbf{I})+i \beta \mathbf{I}|=0.$$
Thus
$$\left|(\mathbf{A}-\alpha \mathbf{I})^{2}+\beta^{2} \mathbf{I}\right|=0.$$
Since $\mathbf{A}$ is symmetric, $(\mathbf{A}-\alpha \mathbf{I})^{2}$ is positive semidefinite (it follows from the definition that $\mathbf{B} \mathbf{B}^{\prime}$ is positive semidefinite for any matrix $\mathbf{B}$). Thus if $\beta \neq 0$, then $(\mathbf{A}-\alpha \mathbf{I})^{2}+\beta^{2} \mathbf{I}$ is positive definite, and then by 8.1 its determinant cannot be zero, contradicting the equation above. Thus $\beta=0$, and $\mu$ must be real.
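A quick numerical illustration (numpy assumed): even the general-purpose eigenvalue routine, which allows complex output, returns eigenvalues with vanishing imaginary parts for a symmetric matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((5, 5))
S = (M + M.T) / 2                  # a symmetric matrix

# eigvals makes no symmetry assumption and may return complex values,
# yet for a symmetric matrix the imaginary parts vanish (up to roundoff).
eig = np.linalg.eigvals(S)
assert np.allclose(np.imag(eig), 0.0)
```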

## Frobenius Inequality

Let $\mathbf{B}$ be an $m \times r$ matrix of rank $r$. Then there exists a matrix $\mathbf{X}$ (called a left inverse of $\mathbf{B}$) such that $\mathbf{X B}=\mathbf{I}$.

Proof. If $m=r$, then $\mathbf{B}$ is nonsingular and admits an inverse. So suppose $r<m$. The columns of $\mathbf{B}$ are linearly independent. Thus we can find a set of $m-r$ columns that together with the columns of $\mathbf{B}$ form a basis for $R^{m}$. In other words, we can find a matrix $\mathbf{U}$ of order $m \times(m-r)$ such that $[\mathbf{B}, \mathbf{U}]$ is nonsingular. Let the inverse of $[\mathbf{B}, \mathbf{U}]$ be partitioned as $\left[\begin{array}{l}\mathbf{X} \\ \mathbf{V}\end{array}\right]$, where $\mathbf{X}$ is $r \times m$. Since
$$\left[\begin{array}{l} \mathbf{X} \\ \mathbf{V} \end{array}\right][\mathbf{B}, \mathbf{U}]=\mathbf{I},$$
we have $\mathbf{X B}=\mathbf{I}$.
We can similarly show that an $r \times n$ matrix $\mathbf{C}$ of rank $r$ has a right inverse, i.e., a matrix $\mathbf{Y}$ such that $\mathbf{C Y}=\mathbf{I}$. Note that a left inverse or a right inverse is not unique, unless the matrix is square and nonsingular.
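The construction in the proof can be carried out concretely. A numpy sketch with an illustrative $\mathbf{B}$ and a hand-picked extension $\mathbf{U}$:

```python
import numpy as np

B = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])            # 3x2, rank 2

# Extend the columns of B to a basis of R^3: here U = e1 happens to work.
U = np.array([[1.0], [0.0], [0.0]])
M = np.hstack([B, U])
assert np.linalg.det(M) != 0          # [B, U] is nonsingular

Minv = np.linalg.inv(M)
X = Minv[:2, :]                       # the first r rows of the inverse
assert np.allclose(X @ B, np.eye(2))  # X is a left inverse of B
```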


Let $\mathbf{B}$ be an $m \times r$ matrix of rank $r$. Then there exists a nonsingular matrix $\mathbf{P}$ such that
$$\mathbf{P B}=\left[\begin{array}{l} \mathbf{I} \\ \mathbf{0} \end{array}\right].$$

Similarly, if $\mathbf{C}$ is $r \times n$ of rank $r$, then there exists a nonsingular matrix $\mathbf{Q}$ such that $\mathbf{C Q}=[\mathbf{I}, \mathbf{0}]$. These two results and the rank factorization (see 4.3) immediately lead to the following.

## Nonsingularity

Suppose we have $m$ linear equations in the $n$ unknowns $x_{1}, \ldots, x_{n}$. The equations can conveniently be expressed as a single matrix equation $\mathbf{A} \mathbf{x}=\mathbf{b}$, where $\mathbf{A}$ is the $m \times n$ matrix of coefficients. The equation $\mathbf{A} \mathbf{x}=\mathbf{b}$ is said to be consistent if it has at least one solution; otherwise, it is inconsistent. The equation is homogeneous if $\mathbf{b}=\mathbf{0}$. The set of solutions of the homogeneous equation $\mathbf{A x}=\mathbf{0}$ is clearly the null space of $\mathbf{A}$.
If the equation $\mathbf{A x}=\mathbf{b}$ is consistent, then we can write
$$\mathbf{b}=x_{1}^{0} \mathbf{a}_{1}+\cdots+x_{n}^{0} \mathbf{a}_{n}$$
for some $x_{1}^{0}, \ldots, x_{n}^{0}$, where $\mathbf{a}_{1}, \ldots, \mathbf{a}_{n}$ are the columns of $\mathbf{A}$. Thus $\mathbf{b} \in \mathcal{C}(\mathbf{A})$. Conversely, if $\mathbf{b} \in \mathcal{C}(\mathbf{A})$, then $\mathbf{A x}=\mathbf{b}$ must be consistent. If the equation is consistent and if $\mathbf{x}^{0}$ is a solution of the equation, then the set of all solutions of the equation is given by
$$\left\{\mathbf{x}^{0}+\mathbf{x}: \mathbf{x} \in \mathcal{N}(\mathbf{A})\right\}.$$
Clearly, the equation $\mathbf{A x}=\mathbf{b}$ has either no solution, a unique solution, or infinitely many solutions.
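A numpy sketch of the "infinitely many solutions" case, for a consistent system with a nontrivial null space (the matrices are illustrative; the null-space basis is read off the SVD):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])       # rank 1, so N(A) has dimension 2
b = np.array([6.0, 12.0])             # b = A @ [1, 1, 1] lies in C(A)

# One particular solution x0 (least squares is exact here since b is in C(A)).
x0, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(A @ x0, b)

# A basis of N(A): right singular vectors whose singular value is zero.
_, s, Vt = np.linalg.svd(A)
null_basis = Vt[1:, :]                # rank(A) = 1, so rows 1, 2 span N(A)

# Every x0 + (combination of null vectors) is again a solution.
x = x0 + 2.0 * null_basis[0] - 3.0 * null_basis[1]
assert np.allclose(A @ x, b)
```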

A matrix $\mathbf{A}$ of order $n \times n$ is said to be nonsingular if $R(\mathbf{A})=n$; otherwise, the matrix is singular.

## Orthogonality

Let $S$ be a vector space. A function that assigns a real number $\langle\mathbf{x}, \mathbf{y}\rangle$ to every pair of vectors $\mathbf{x}, \mathbf{y}$ in $S$ is said to be an inner product if it satisfies the following conditions:

(i) $\langle\mathbf{x}, \mathbf{y}\rangle=\langle\mathbf{y}, \mathbf{x}\rangle$.
(ii) $\langle\mathbf{x}, \mathbf{x}\rangle \geq 0$, and equality holds if and only if $\mathbf{x}=\mathbf{0}$.
(iii) $\langle c \mathbf{x}, \mathbf{y}\rangle=c\langle\mathbf{x}, \mathbf{y}\rangle$.
(iv) $\langle\mathbf{x}+\mathbf{y}, \mathbf{z}\rangle=\langle\mathbf{x}, \mathbf{z}\rangle+\langle\mathbf{y}, \mathbf{z}\rangle$.
In $R^{n},\langle\mathbf{x}, \mathbf{y}\rangle=\mathbf{x}^{\prime} \mathbf{y}=x_{1} y_{1}+\cdots+x_{n} y_{n}$ is easily seen to be an inner product. We will work with this inner product while dealing with $R^{n}$ and its subspaces, unless indicated otherwise.

For a vector $\mathbf{x}$, the positive square root of the inner product $\langle\mathbf{x}, \mathbf{x}\rangle$ is called the norm of $\mathbf{x}$, denoted by $\|\mathbf{x}\|$. Vectors $\mathbf{x}, \mathbf{y}$ are said to be orthogonal or perpendicular if $\langle\mathbf{x}, \mathbf{y}\rangle=0$, in which case we write $\mathbf{x} \perp \mathbf{y}$.

5.1. If $\mathbf{x}_{1}, \ldots, \mathbf{x}_{m}$ are pairwise orthogonal nonzero vectors, then they are linearly independent.

Proof. Suppose $c_{1} \mathbf{x}_{1}+\cdots+c_{m} \mathbf{x}_{m}=\mathbf{0}$. Then
$$\left\langle c_{1} \mathbf{x}_{1}+\cdots+c_{m} \mathbf{x}_{m}, \mathbf{x}_{1}\right\rangle=0,$$
and hence,
$$\sum_{i=1}^{m} c_{i}\left\langle\mathbf{x}_{i}, \mathbf{x}_{1}\right\rangle=0.$$
Since the vectors $\mathbf{x}_{1}, \ldots, \mathbf{x}_{m}$ are pairwise orthogonal, it follows that $c_{1}\left\langle\mathbf{x}_{1}, \mathbf{x}_{1}\right\rangle=0$, and since $\mathbf{x}_{1}$ is nonzero, $c_{1}=0$. Similarly, we can show that each $c_{i}$ is zero. Therefore, the vectors are linearly independent.

A set of vectors $\mathbf{x}_{1}, \ldots, \mathbf{x}_{m}$ is said to form an orthonormal basis for the vector space $S$ if the set is a basis for $S$ and furthermore, $\left\langle\mathbf{x}_{i}, \mathbf{x}_{j}\right\rangle$ is 0 if $i \neq j$ and 1 if $i=j$.

We now describe the Gram-Schmidt procedure, which produces an orthonormal basis starting with a given basis $\mathbf{x}_{1}, \ldots, \mathbf{x}_{n}$.

Set $\mathbf{y}_{1}=\mathbf{x}_{1}$. Having defined $\mathbf{y}_{1}, \ldots, \mathbf{y}_{i-1}$, we define
$$\mathbf{y}_{i}=\mathbf{x}_{i}-a_{i, i-1} \mathbf{y}_{i-1}-\cdots-a_{i 1} \mathbf{y}_{1},$$
where $a_{i, i-1}, \ldots, a_{i 1}$ are chosen so that $\mathbf{y}_{i}$ is orthogonal to $\mathbf{y}_{1}, \ldots, \mathbf{y}_{i-1}$. Thus we must solve $\left\langle\mathbf{y}_{i}, \mathbf{y}_{j}\right\rangle=0$, $j=1, \ldots, i-1$. This leads to
$$\left\langle\mathbf{x}_{i}-a_{i, i-1} \mathbf{y}_{i-1}-\cdots-a_{i 1} \mathbf{y}_{1}, \mathbf{y}_{j}\right\rangle=0, \quad j=1, \ldots, i-1,$$
which gives
$$\left\langle\mathbf{x}_{i}, \mathbf{y}_{j}\right\rangle-\sum_{k=1}^{i-1} a_{i k}\left\langle\mathbf{y}_{k}, \mathbf{y}_{j}\right\rangle=0, \quad j=1, \ldots, i-1.$$
Now, since $\mathbf{y}_{1}, \ldots, \mathbf{y}_{i-1}$ is an orthogonal set, we get
$$\left\langle\mathbf{x}_{i}, \mathbf{y}_{j}\right\rangle-a_{i j}\left\langle\mathbf{y}_{j}, \mathbf{y}_{j}\right\rangle=0,$$
and hence,
$$a_{i j}=\frac{\left\langle\mathbf{x}_{i}, \mathbf{y}_{j}\right\rangle}{\left\langle\mathbf{y}_{j}, \mathbf{y}_{j}\right\rangle}, \quad j=1, \ldots, i-1.$$
The process is continued to obtain the basis $\mathbf{y}_{1}, \ldots, \mathbf{y}_{n}$ of pairwise orthogonal vectors. Since $\mathbf{x}_{1}, \ldots, \mathbf{x}_{n}$ are linearly independent, each $\mathbf{y}_{i}$ is nonzero. Now if we set $\mathbf{z}_{i}=\frac{\mathbf{y}_{i}}{\left\|\mathbf{y}_{i}\right\|}$, then $\mathbf{z}_{1}, \ldots, \mathbf{z}_{n}$ is an orthonormal basis. Note that the linear span of $\mathbf{z}_{1}, \ldots, \mathbf{z}_{i}$ equals the linear span of $\mathbf{x}_{1}, \ldots, \mathbf{x}_{i}$ for each $i$.
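The procedure translates directly into code. A minimal numpy sketch (it subtracts projections onto the already-normalized vectors as it goes, which is algebraically equivalent to the coefficient formulas above):

```python
import numpy as np

def gram_schmidt(X):
    """Orthonormalize the columns of X, assumed linearly independent."""
    Z = []
    for x in X.T:                      # walk the columns x_1, ..., x_n
        y = x.astype(float).copy()
        for z in Z:
            y -= np.dot(y, z) * z      # remove the component along each z_j
        Z.append(y / np.linalg.norm(y))
    return np.column_stack(Z)

X = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])        # columns form a basis of R^3
Z = gram_schmidt(X)

# The columns of Z are orthonormal.
assert np.allclose(Z.T @ Z, np.eye(3))
```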

We remark that given a set of linearly independent vectors $\mathbf{x}_{1}, \ldots, \mathbf{x}_{m}$, the Gram-Schmidt procedure described above can be used to produce a pairwise orthogonal set $\mathbf{y}_{1}, \ldots, \mathbf{y}_{m}$ such that $\mathbf{y}_{i}$ is a linear combination of $\mathbf{x}_{1}, \ldots, \mathbf{x}_{i}$, $i=1, \ldots, m$. This fact is used in the proof of the next result.

Let $W$ be a set (not necessarily a subspace) of vectors in a vector space $S$. We define
$$W^{\perp}=\{\mathbf{x}: \mathbf{x} \in S,\langle\mathbf{x}, \mathbf{y}\rangle=0 \text { for all } \mathbf{y} \in W\}.$$
It follows from the definitions that $W^{\perp}$ is a subspace of $S$.

## Rank

Let $\mathbf{A}$ be an $m \times n$ matrix. The subspace of $R^{m}$ spanned by the column vectors of $\mathbf{A}$ is called the column space or the column span of $\mathbf{A}$ and is denoted by $\mathcal{C}(\mathbf{A})$. Similarly, the subspace of $R^{n}$ spanned by the row vectors of $\mathbf{A}$ is called the row space of $\mathbf{A}$, denoted by $\mathcal{R}(\mathbf{A})$. Clearly, $\mathcal{R}(\mathbf{A})$ is isomorphic to $\mathcal{C}\left(\mathbf{A}^{\prime}\right)$. The dimension of the column space is called the column rank, whereas the dimension of the row space is called the row rank of the matrix. These two definitions turn out to be very short-lived in any linear algebra book, since the two ranks are always equal, as we show in the next result.

The following operations performed on a matrix $\mathbf{A}$ are called elementary column operations:
(i) Interchange two columns of $\mathbf{A}$.
(ii) Multiply a column of $\mathbf{A}$ by a nonzero scalar.
(iii) Add a scalar multiple of one column to another column.
These operations clearly leave $\mathcal{C}(\mathbf{A})$ unaffected, and therefore they do not change the rank of the matrix. We can define elementary row operations similarly. The elementary row and column operations are particularly useful in computations. Thus to find the rank of a matrix we first reduce it to a matrix with several zeros by these operations and then compute the rank of the resulting matrix.
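A numerical check that the three elementary column operations preserve rank (numpy assumed; `np.linalg.matrix_rank` computes the rank via the singular value decomposition):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [5.0, 7.0, 9.0]])       # row 3 = row 1 + row 2, so rank 2
assert np.linalg.matrix_rank(A) == 2

B = A.copy()
B[:, [0, 1]] = B[:, [1, 0]]           # (i) interchange two columns
B[:, 2] *= 5.0                        # (ii) multiply a column by a nonzero scalar
B[:, 0] += 2.0 * B[:, 1]              # (iii) add a multiple of one column to another
assert np.linalg.matrix_rank(B) == 2  # the rank is unchanged
```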

## Basis and Dimension

The linear span of (or the space spanned by) the vectors $\mathbf{x}_{1}, \ldots, \mathbf{x}_{m}$ is defined to be the set of all linear combinations $c_{1} \mathbf{x}_{1}+\cdots+c_{m} \mathbf{x}_{m}$, where $c_{1}, \ldots, c_{m}$ are real numbers. The linear span is a subspace; this follows from the definition.

A set of vectors $\mathbf{x}_{1}, \ldots, \mathbf{x}_{m}$ is said to be linearly dependent if there exist real numbers $c_{1}, \ldots, c_{m}$, at least one of which is nonzero, such that $c_{1} \mathbf{x}_{1}+\cdots+c_{m} \mathbf{x}_{m}=\mathbf{0}$. A set is linearly independent if it is not linearly dependent. Strictly speaking, we should refer to a collection (or a multiset) of vectors rather than a set of vectors in the two preceding definitions. Thus when we talk of vectors $\mathbf{x}_{1}, \ldots, \mathbf{x}_{m}$ being linearly dependent or independent, we allow for the possibility that the vectors are not necessarily distinct.
The following statements are easily proved:
(i) The set consisting of the zero vector alone is linearly dependent.
(ii) If $X \subset Y$ and if $X$ is linearly dependent, then so is $Y$.
(iii) If $X \subset Y$ and if $Y$ is linearly independent, then so is $X$.
A set of vectors is said to form a basis for the vector space $S$ if it is linearly independent and its linear span equals $S$.

Let $\mathbf{e}_{i}$ be the $i$th column of the $n \times n$ identity matrix. The set $\mathbf{e}_{1}, \ldots, \mathbf{e}_{n}$ forms a basis for $R^{n}$, called the standard basis.

If $\mathbf{x}_{1}, \ldots, \mathbf{x}_{m}$ is a basis for $S$, then any vector $\mathbf{x}$ in $S$ admits a unique representation as a linear combination $c_{1} \mathbf{x}_{1}+\cdots+c_{m} \mathbf{x}_{m}$. For if
$$\mathbf{x}=c_{1} \mathbf{x}_{1}+\cdots+c_{m} \mathbf{x}_{m}=d_{1} \mathbf{x}_{1}+\cdots+d_{m} \mathbf{x}_{m},$$
then
$$\left(c_{1}-d_{1}\right) \mathbf{x}_{1}+\cdots+\left(c_{m}-d_{m}\right) \mathbf{x}_{m}=\mathbf{0},$$
and since $\mathbf{x}_{1}, \ldots, \mathbf{x}_{m}$ are linearly independent, $c_{i}=d_{i}$ for each $i$.
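In $R^{n}$, finding the coordinates of a vector with respect to a basis amounts to solving a nonsingular linear system. A numpy sketch with an illustrative basis of $R^{3}$:

```python
import numpy as np

# Columns of X form a basis of R^3 (the matrix is nonsingular).
X = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
assert np.linalg.det(X) != 0

v = np.array([2.0, 3.0, 4.0])
c = np.linalg.solve(X, v)             # the unique coordinate vector
assert np.allclose(X @ c, v)          # v = c_1 x_1 + c_2 x_2 + c_3 x_3
```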
A vector space is said to be finite-dimensional if it has a basis consisting of finitely many vectors. The vector space containing only the zero vector is also finite-dimensional. We will consider only finite-dimensional vector spaces. Very often it will be implicitly assumed that the vector spaces under consideration are nontrivial, i.e., contain vectors other than the zero vector.

## Vector Spaces and Subspaces

A nonempty set $S$ is called a vector space if it satisfies the following conditions:
(i) For any $\mathbf{x}, \mathbf{y}$ in $S, \mathbf{x}+\mathbf{y}$ is defined and is in $S$. Furthermore,
$$\begin{aligned} \mathbf{x}+\mathbf{y} &=\mathbf{y}+\mathbf{x}, \quad \text{(commutativity)} \\ \mathbf{x}+(\mathbf{y}+\mathbf{z}) &=(\mathbf{x}+\mathbf{y})+\mathbf{z}. \quad \text{(associativity)} \end{aligned}$$
(ii) There exists an element in $S$, denoted by $\mathbf{0}$, such that $\mathbf{x}+\mathbf{0}=\mathbf{x}$ for all $\mathbf{x}$.
(iii) For any $\mathbf{x}$ in $S$, there exists an element $\mathbf{y}$ in $S$ such that $\mathbf{x}+\mathbf{y}=\mathbf{0}$.
(iv) For any $\mathbf{x}$ in $S$ and any real number $c$, $c \mathbf{x}$ is defined and is in $S$; moreover, $1 \mathbf{x}=\mathbf{x}$ for any $\mathbf{x}$.
(v) For any $\mathbf{x}_{1}, \mathbf{x}_{2}$ in $S$ and real numbers $c_{1}, c_{2}$: $c_{1}\left(\mathbf{x}_{1}+\mathbf{x}_{2}\right)=c_{1} \mathbf{x}_{1}+c_{1} \mathbf{x}_{2}$, $\left(c_{1}+c_{2}\right) \mathbf{x}_{1}=c_{1} \mathbf{x}_{1}+c_{2} \mathbf{x}_{1}$, and $c_{1}\left(c_{2} \mathbf{x}_{1}\right)=\left(c_{1} c_{2}\right) \mathbf{x}_{1}$.

Elements in $S$ are called vectors. If $\mathbf{x}, \mathbf{y}$ are vectors, then the operation of taking their sum $\mathbf{x}+\mathbf{y}$ is referred to as vector addition. The vector in (ii) is called the zero vector. The operation in (iv) is called scalar multiplication. A vector space may be defined over any field. We have taken the field to be the field of real numbers, as this will be sufficient for our purpose.

The set of column vectors of order $n$ (or $n \times 1$ matrices) is a vector space. So is the set of row vectors of order $n$. These two vector spaces are the ones we consider most of the time.
Let $R^{n}$ denote the set $R \times R \times \cdots \times R$, taken $n$ times, where $R$ is the set of real numbers. We will write elements of $R^{n}$ either as column vectors or as row vectors depending upon whichever is convenient in a given situation.
If $S, T$ are vector spaces and $S \subset T$, then $S$ is called a subspace of $T$. Let us describe all possible subspaces of $R^{3}$. Clearly, $R^{3}$ is a vector space, and so is the space consisting of only the zero vector, i.e., the vector of all zeros. Let $c_{1}, c_{2}, c_{3}$ be real numbers. The set of all vectors $\mathbf{x} \in R^{3}$ that satisfy
$$c_{1} x_{1}+c_{2} x_{2}+c_{3} x_{3}=0$$
is a subspace of $R^{3}$ (here $x_{1}, x_{2}, x_{3}$ are the coordinates of $\mathbf{x}$). Geometrically, this set represents a plane passing through the origin. The intersection of two distinct planes through the origin is a straight line through the origin and is also a subspace. These are the only possible subspaces of $R^{3}$.

## Vector Spaces and Matrices

We consider only real matrices. Although our treatment is self-contained, the reader is assumed to be familiar with basic operations on matrices. We also assume knowledge of elementary properties of the determinant.

An $m \times n$ matrix consists of $m n$ real numbers arranged in $m$ rows and $n$ columns. We denote matrices by bold letters. The entry in row $i$ and column $j$ of the matrix $\mathbf{A}$ is denoted by $a_{i j}$. An $m \times 1$ matrix is called a column vector of order $m$; similarly, a $1 \times n$ matrix is a row vector of order $n$. An $m \times n$ matrix is called a square matrix if $m=n$.

If $\mathbf{A}, \mathbf{B}$ are $m \times n$ matrices, then $\mathbf{A}+\mathbf{B}$ is defined as the $m \times n$ matrix with $(i, j)$-entry $a_{i j}+b_{i j}$. If $\mathbf{A}$ is a matrix and $c$ is a real number, then $c \mathbf{A}$ is obtained by multiplying each element of $\mathbf{A}$ by $c$.

If $\mathbf{A}$ is $m \times p$ and $\mathbf{B}$ is $p \times n$, then their product $\mathbf{C}=\mathbf{A} \mathbf{B}$ is an $m \times n$ matrix with $(i, j)$-entry given by
$$c_{i j}=\sum_{k=1}^{p} a_{i k} b_{k j}$$
The following properties hold:
$$\begin{aligned} (\mathbf{A} \mathbf{B}) \mathbf{C} &=\mathbf{A}(\mathbf{B C}), \\ \mathbf{A}(\mathbf{B}+\mathbf{C}) &=\mathbf{A} \mathbf{B}+\mathbf{A C}, \\ (\mathbf{A}+\mathbf{B}) \mathbf{C} &=\mathbf{A} \mathbf{C}+\mathbf{B C}. \end{aligned}$$

The transpose of the $m \times n$ matrix $\mathbf{A}$, denoted by $\mathbf{A}^{\prime}$, is the $n \times m$ matrix whose $(i, j)$-entry is $a_{j i}$. It can be verified that $\left(\mathbf{A}^{\prime}\right)^{\prime}=\mathbf{A}$, $(\mathbf{A}+\mathbf{B})^{\prime}=\mathbf{A}^{\prime}+\mathbf{B}^{\prime}$, and $(\mathbf{A} \mathbf{B})^{\prime}=\mathbf{B}^{\prime} \mathbf{A}^{\prime}$.

A good understanding of the definition of matrix multiplication is quite useful. We note some simple facts that are often required. We assume that all products occurring here are defined in the sense that the orders of the matrices make them compatible for multiplication.
(i) The $j$ th column of $\mathbf{A B}$ is the same as $\mathbf{A}$ multiplied by the $j$ th column of $\mathbf{B}$.
(ii) The $i$ th row of $\mathbf{A B}$ is the same as the $i$ th row of $\mathbf{A}$ multiplied by $\mathbf{B}$.
(iii) The $(i, j)$-entry of $\mathbf{A B C}$ is obtained as
$$\left(x_{1}, \ldots, x_{p}\right) \mathbf{B}\left[\begin{array}{c} y_{1} \\ \vdots \\ y_{q} \end{array}\right],$$
where $\left(x_{1}, \ldots, x_{p}\right)$ is the $i$th row of $\mathbf{A}$ and $\left(y_{1}, \ldots, y_{q}\right)^{\prime}$ is the $j$th column of $\mathbf{C}$.
(iv) If $\mathbf{A}=\left[\mathbf{a}_{1}, \ldots, \mathbf{a}_{n}\right]$ and
$$\mathbf{B}=\left[\begin{array}{c} \mathbf{b}_{1}^{\prime} \\ \vdots \\ \mathbf{b}_{n}^{\prime} \end{array}\right],$$
where the $\mathbf{a}_{i}$ denote columns of $\mathbf{A}$ and the $\mathbf{b}_{j}^{\prime}$ denote rows of $\mathbf{B}$, then
$$\mathbf{A} \mathbf{B}=\mathbf{a}_{1} \mathbf{b}_{1}^{\prime}+\cdots+\mathbf{a}_{n} \mathbf{b}_{n}^{\prime}.$$
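Fact (iv) is easy to confirm numerically (numpy assumed; random matrices as an illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))

# AB as a sum of outer products: column k of A times row k of B.
S = sum(np.outer(A[:, k], B[k, :]) for k in range(4))
assert np.allclose(S, A @ B)
```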

A diagonal matrix is a square matrix $\mathbf{A}$ such that $a_{i j}=0, i \neq j$. We denote the diagonal matrix
$$\left[\begin{array}{cccc} \lambda_{1} & 0 & \cdots & 0 \\ 0 & \lambda_{2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_{n} \end{array}\right]$$
by $\operatorname{diag}\left(\lambda_{1}, \ldots, \lambda_{n}\right)$. When $\lambda_{i}=1$ for all $i$, this matrix reduces to the identity matrix of order $n$, which we denote by $\mathbf{I}_{n}$, or often simply by $\mathbf{I}$ if the order is clear from the context. Observe that for any square matrix $\mathbf{A}$, we have $\mathbf{A I}=\mathbf{I A}=\mathbf{A}$. The entries $a_{11}, \ldots, a_{n n}$ are said to constitute the (main) diagonal entries of $\mathbf{A}$. The trace of $\mathbf{A}$ is defined as
$$\operatorname{trace} \mathbf{A}=a_{11}+\cdots+a_{n n}$$
It follows from this definition that if $\mathbf{A}, \mathbf{B}$ are matrices such that both $\mathbf{A B}$ and $\mathbf{B A}$ are defined, then
$$\operatorname{trace} \mathbf{A} \mathbf{B}=\operatorname{trace} \mathbf{B} \mathbf{A}$$
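A numerical check of this trace identity (note that $\mathbf{A B}$ and $\mathbf{B A}$ may even have different orders):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 5))
B = rng.standard_normal((5, 3))

# AB is 3x3 and BA is 5x5, yet the traces agree.
assert np.isclose(np.trace(A @ B), np.trace(B @ A))
```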

The determinant of an $n \times n$ matrix $\mathbf{A}$, denoted by $|\mathbf{A}|$, is defined as
$$|\mathbf{A}|=\sum_{\sigma} \epsilon(\sigma) a_{1 \sigma(1)} \cdots a_{n \sigma(n)},$$
where the summation is over all permutations $\sigma$ of $\{1, \ldots, n\}$ and $\epsilon(\sigma)$ is $1$ or $-1$ according as $\sigma$ is even or odd.
We state some basic properties of the determinant without proof:
(i) The determinant can be evaluated by expansion along a row or a column. Thus, expanding along the first row,
$$|\mathbf{A}|=\sum_{j=1}^{n}(-1)^{1+j} a_{1 j}\left|\mathbf{A}_{1 j}\right|,$$
where $\mathbf{A}_{1 j}$ is the submatrix obtained by deleting the first row and the $j$th column of $\mathbf{A}$. We also note that
$$\sum_{j=1}^{n}(-1)^{1+j} a_{i j}\left|\mathbf{A}_{1 j}\right|=0, \quad i=2, \ldots, n.$$

(ii) The determinant changes sign if two rows (or columns) are interchanged.
(iii) The determinant is unchanged if a constant multiple of one row is added to another row. A similar property is true for columns.
(iv) The determinant is a linear function of any column (row) when all the other columns (rows) are held fixed.
(v) $|\mathbf{A} \mathbf{B}|=|\mathbf{A}||\mathbf{B}|$.
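Property (v), the multiplicativity of the determinant, can be spot-checked numerically (numpy assumed; random matrices as an illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

# The determinant is multiplicative: |AB| = |A| |B|.
lhs = np.linalg.det(A @ B)
rhs = np.linalg.det(A) * np.linalg.det(B)
assert np.isclose(lhs, rhs)
```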
The matrix $\mathbf{A}$ is upper triangular if $a_{i j}=0, i>j$. The transpose of an upper triangular matrix is lower triangular.

It will often be necessary to work with matrices in partitioned form. For example, let
$$\mathbf{A}=\left[\begin{array}{ll} \mathbf{A}_{11} & \mathbf{A}_{12} \\ \mathbf{A}_{21} & \mathbf{A}_{22} \end{array}\right], \quad \mathbf{B}=\left[\begin{array}{ll} \mathbf{B}_{11} & \mathbf{B}_{12} \\ \mathbf{B}_{21} & \mathbf{B}_{22} \end{array}\right]$$
be two matrices where each $\mathbf{A}_{i j}, \mathbf{B}_{i j}$ is itself a matrix. If compatibility for matrix multiplication is assumed throughout (in which case we say that the matrices are partitioned conformally), then we can write
$$\mathbf{A} \mathbf{B}=\left[\begin{array}{ll} \mathbf{A}_{11} \mathbf{B}_{11}+\mathbf{A}_{12} \mathbf{B}_{21} & \mathbf{A}_{11} \mathbf{B}_{12}+\mathbf{A}_{12} \mathbf{B}_{22} \\ \mathbf{A}_{21} \mathbf{B}_{11}+\mathbf{A}_{22} \mathbf{B}_{21} & \mathbf{A}_{21} \mathbf{B}_{12}+\mathbf{A}_{22} \mathbf{B}_{22} \end{array}\right].$$
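A numpy sketch confirming the blockwise product formula on an illustrative $4 \times 4$ example partitioned into $2 \times 2$ blocks:

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

# Partition each matrix into four 2x2 blocks.
A11, A12 = A[:2, :2], A[:2, 2:]
A21, A22 = A[2:, :2], A[2:, 2:]
B11, B12 = B[:2, :2], B[:2, 2:]
B21, B22 = B[2:, :2], B[2:, 2:]

# Assemble AB block by block and compare with the direct product.
blockwise = np.block([[A11 @ B11 + A12 @ B21, A11 @ B12 + A12 @ B22],
                      [A21 @ B11 + A22 @ B21, A21 @ B12 + A22 @ B22]])
assert np.allclose(blockwise, A @ B)
```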
