Sep 18

Diagonalization of symmetric matrices

Diagonalization of symmetric matrices

We are looking for conditions under which a matrix in some orthonormal basis takes a diagonal form.

Definition 1. We say that A is diagonalizable if there exists an orthonormal basis U=(u_1,...,u_n) such that A_U=U^{-1}AU is diagonal: A_U=diag[\lambda_1,...,\lambda_n] (all matrices are real).

The usual way of solving a mathematical problem is this. Look for necessary conditions. Then try to find weak sufficient conditions. If they coincide with the necessary ones, you have an exact solution to the problem.

Exercise 1 (necessary condition) Suppose A is diagonalizable by an orthogonal matrix. Then it is symmetric.

Proof. A diagonal matrix is obviously symmetric. Hence, from A=UA_UU^{-1} and using Exercise 1 on the orthogonal matrix inverse A^T=(U^{-1})^TA_U^TU^T =(U^T)^TA_UU^{-1}=A.

Exercise 2. If A is symmetric and a subspace L\subset R^n is invariant with respect to A, then L^\perp is also an invariant subspace of A.

Proof. Let y\in L^\perp. We need to show that Ay\in L^\perp. Take any x\in L. Since L is invariant, we have Ax\in L. Hence, by Exercise 1 0=(Ax)\cdot y=x\cdot(Ay). Since x is arbitrary, it follows that Ay\in L^\perp.

Exercise 3 (sufficient condition) If A is symmetric, then it is diagonalizable by an orthogonal matrix.

Proof. Let A be symmetric. By Exercise 3 A has at least one eigenvector u_1\in R^n. Denote L_1 the set of vectors orthogonal to u_1. It has dimension n-1 and it is invariant by Exercise 2. By Exercise 4 A has an eigenvector u_2 in L_1.

Let L_2 be the set of vectors from L_1 which are orthogonal to u_2. As above, \dim L_2=n-2, L_2 is invariant and in L_2 there is an eigenvector u_3 of A.

Continuing in this way, eventually we obtain a system u_1,...,u_n. Since u_1 is orthogonal to L_1 and all of u_2,u_3,... belong to L_1u_1 is orthogonal to all of u_2,u_3,.... Similarly, at each step u_i is orthogonal to all of u_{i+1},u_{i+2},... Thus, the system u_1,...,u_n is orthogonal. Its elements can be scaled to satisfy \|u_i\|=1 for all i.

The resulting system U=(u_1,...,u_n) will be an orthonormal basis. Its completeness follows from the fact that there are n elements in this system and they are linearly independent.

Denoting \lambda_i the eigenvalue corresponding to u_i, we have Au_i=\lambda_iu_i. This means that A is of diagonal form in the basis U. The transition matrix from the basis consisting of unit vectors e_i to u_i is orthogonal. If x\in R^n, then the same x is the vector of coefficients of x in the basis E=(e_1,...,e_n). By Exercise 3, A_U=U^{-1}AU.