Sep 18

General properties of symmetric matrices

Here we consider properties of symmetric matrices that will be used to prove their diagonalizability.

What are the differences between R^n and C^n?

Vectors in both spaces have n coordinates. In R^n we can multiply vectors by real numbers, and in C^n by complex numbers. This affects the notions of linear combination, linear independence, dimension and scalar product. We indicate only the differences to watch for.

If in C^n we multiply vectors only by real numbers, it becomes a space of dimension 2n. Let's take n=1 to see why.

Example 1. If we take e=1, then any complex number c=a+ib is a multiple of e: c=c\cdot 1=ce with the scaling coefficient c. Thus, C is a one-dimensional space in this sense. On the other hand, if only multiplication by real numbers is allowed, then we can take e_1=1, e_2=i as a basis and then c=ae_1+be_2, so C is two-dimensional. To avoid confusion, always be clear about which scalars (real or complex) are allowed.

The scalar product in R^n is given by x\cdot y=\sum x_iy_i and in C^n by x\cdot y=\sum x_i\bar{y}_i. As a result, for the second scalar product we have x\cdot (c_1y+c_2z)=\bar{c}_1x\cdot y+\bar{c}_2x\cdot z for complex c_1,c_2 (some people call this antilinearity, to distinguish it from linearity x\cdot (ay+bz)=ax\cdot y+bx\cdot z for real a,b).
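The conjugate-linearity in the second argument is easy to check numerically. Below is a minimal sketch in plain Python; the helper name `dot` and the sample vectors and coefficients are chosen only for illustration.

```python
def dot(x, y):
    """Scalar product in C^n: sum of x_i times the conjugate of y_i."""
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))

x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]
z = [-1 + 1j, 4 + 0j]
c1, c2 = 2 + 3j, -1 + 1j

# Left side: x . (c1*y + c2*z)
lhs = dot(x, [c1 * yi + c2 * zi for yi, zi in zip(y, z)])
# Right side: conj(c1) * (x . y) + conj(c2) * (x . z)
rhs = c1.conjugate() * dot(x, y) + c2.conjugate() * dot(x, z)

assert abs(lhs - rhs) < 1e-12
```

Note that the coefficients come out conjugated only when they sit in the second argument; in the first argument the product is linear.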

Definition 1. For a matrix A with possibly complex entries we denote A'=\overline{A^T}. The matrix A' is called the adjoint or the conjugate of A.

Exercise 1. Prove that (Ax)\cdot y=x\cdot(A'y), for any x,y\in C^n.

Proof. For complex numbers we have \overline{\bar{c}}=c, \overline{c_1c_2}=\bar{c}_1\bar{c}_2. Therefore

(Ax)\cdot y=(Ax)^T\bar{y}=x^TA^T\bar{y}=x^T\overline{\overline{(A^T)}}\bar{y}=x^T\overline{(\overline{A^T}y)}=x\cdot(A'y).
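The identity can also be verified on a concrete example. The following sketch assumes matrices are stored as lists of rows; the helpers `dot`, `matvec` and `adjoint` are illustrative names, not part of any library.

```python
def dot(x, y):
    """Scalar product in C^n with conjugation in the second argument."""
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))

def matvec(A, x):
    """Matrix-vector product Ax for A given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def adjoint(A):
    """Conjugate transpose A' of a matrix stored as a list of rows."""
    n_rows, n_cols = len(A), len(A[0])
    return [[A[i][j].conjugate() for i in range(n_rows)] for j in range(n_cols)]

A = [[1 + 1j, 2 - 1j], [3j, 4 + 0j]]
x = [1 - 1j, 2 + 1j]
y = [3 + 0j, -1 + 2j]

lhs = dot(matvec(A, x), y)           # (Ax) . y
rhs = dot(x, matvec(adjoint(A), y))  # x . (A'y)
assert abs(lhs - rhs) < 1e-12
```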

Thus, when considering matrices in C^n, conjugation should be used instead of transposition. In particular, instead of symmetry A=A^T the equation A=A' should be used. Matrices satisfying the last equation are called self-adjoint. The theory of self-adjoint matrices in C^n is very similar to that of symmetric matrices in R^n. Keeping in mind two applications (regression analysis and optimization), we consider only square matrices with real entries. Even in this case one is forced to work with C^n from time to time because, in general, eigenvalues can be complex numbers.

General properties of symmetric matrices

A is assumed to be a square matrix with real entries. When we extend A from R^n to C^n, Ax is defined by the same expression as before but x is allowed to be from C^n and the scalar product in R^n is replaced by the scalar product in C^n. The extension is denoted A_C.

Exercise 2. If A is symmetric, then all eigenvalues of A_C are real.

Proof. Suppose \lambda is an eigenvalue of A_C and x\in C^n, x\neq 0, is a corresponding eigenvector. Since A is real and symmetric, A'=A, so Exercise 1 gives

\lambda x\cdot x=(Ax)\cdot x=x\cdot(Ax)=x\cdot(\lambda x)=\bar{\lambda}x\cdot x.

Since x\cdot x=\|x\|^2>0, we have \lambda =\bar{\lambda}. This shows that \lambda is real.
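For a 2x2 matrix the statement can be seen directly from the characteristic polynomial. The sketch below uses a hypothetical helper `eigenvalues_2x2` that solves the quadratic with the standard-library `cmath` module, and compares a symmetric matrix with a non-symmetric one.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Roots of the characteristic polynomial of [[a, b], [c, d]]."""
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

sym = eigenvalues_2x2(2, 1, 1, 2)   # symmetric: off-diagonal entries are equal
rot = eigenvalues_2x2(0, -1, 1, 0)  # not symmetric: rotation by 90 degrees

# The symmetric matrix has real eigenvalues, the rotation does not.
assert all(abs(lam.imag) < 1e-12 for lam in sym)
assert all(abs(lam.imag) > 0.5 for lam in rot)
```

In the symmetric case b=c the discriminant equals (a-d)^2+4b^2, which is never negative, in agreement with Exercise 2 for n=2.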

Exercise 3. If A is symmetric, then it has at least one real eigenvector.

Proof. We know that A_C has at least one complex eigenvalue \lambda, because its characteristic polynomial has a root in C. By Exercise 2, this eigenvalue must be real. Thus, we have Ax=\lambda x with some nonzero x\in C^n. Write x=u+iv with u,v\in R^n. Since A and \lambda are real, comparing the real and imaginary parts of Au+iAv=\lambda u+i\lambda v gives Au=\lambda u, Av=\lambda v. At least one of u,v is not zero because x\neq 0. Thus a real eigenvector exists.
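The step of separating real and imaginary parts can be illustrated on a small example. This is only a sketch: the matrix, the eigenvalue 3 and the complex eigenvector are chosen by hand, and `matvec` is an illustrative helper.

```python
def matvec(A, x):
    """Matrix-vector product Ax for A given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[2, 1], [1, 2]]   # real symmetric matrix
lam = 3                # a real eigenvalue of A
x = [1 + 2j, 1 + 2j]   # a complex eigenvector: Ax = 3x

assert matvec(A, x) == [lam * xi for xi in x]

# Split x = u + iv into its real and imaginary parts.
u = [xi.real for xi in x]
v = [xi.imag for xi in x]

# Both parts satisfy the eigenvector equation with the same eigenvalue.
assert matvec(A, u) == [lam * ui for ui in u]
assert matvec(A, v) == [lam * vi for vi in v]
```

Here both u and v happen to be nonzero; in general only one of them is guaranteed to be.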

We need to generalize Exercise 3 to the case when A acts in a subspace. This is done in the next two exercises.

Definition 2. A subspace L is called an invariant subspace of A if AL\subseteq L.

Example 2. If x is an eigenvector of A, then the subspace L spanned by x is an invariant subspace of A. This is because x\in L implies Ax=\lambda x\in L.

Exercise 4. If A is symmetric and L is a non-trivial invariant subspace of A, then A has an eigenvector in L.

Proof. By the definition of an invariant subspace, the restriction of A to L defined by A_Lx=Ax, x\in L, acts from L to L. It inherits symmetry from A: (A_Lx)\cdot y=x\cdot(A_Ly) for x,y\in L. Hence the argument of Exercise 3 applies to A_L and gives an eigenvector of A_L in L, which is also an eigenvector of A.
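A small example may help to visualize Definition 2 and Exercise 4. In the sketch below the matrix and the subspace L (vectors with zero third coordinate) are assumptions made for illustration; `matvec` is an illustrative helper.

```python
def matvec(A, x):
    """Matrix-vector product Ax for A given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[2, 1, 0],
     [1, 2, 0],
     [0, 0, 5]]   # real symmetric matrix

# L = span{e1, e2}: all vectors whose third coordinate is zero.
basis_L = [[1, 0, 0], [0, 1, 0]]

# Invariance: A maps each basis vector of L back into L.
images = [matvec(A, b) for b in basis_L]
in_L = all(img[2] == 0 for img in images)
assert in_L

# An eigenvector of A that lies in L, as Exercise 4 predicts.
x = [1, 1, 0]
Ax = matvec(A, x)
assert Ax == [3 * xi for xi in x]
```

Checking invariance on a basis of L is enough, since A is linear.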

Exercise 5. a) If \lambda is a (real) eigenvalue of A_R, then it is an eigenvalue of A_C. b) If \lambda is a real eigenvalue of A_C, then it is an eigenvalue of A_R. This is summarized as \sigma(A_R)=\sigma (A_C)\cap R, see the spectrum notation.

See if you can prove this yourself following the ideas used above.