Aug 18

Orthogonal matrices

Definition 1. A square matrix A is called orthogonal if A^TA=I.

Exercise 1. Let A be orthogonal. Then a) A^{-1}=A^T, b) the transpose A^T is orthogonal, c) the inverse A^{-1} is orthogonal, d) |\det A|=1.

Proof. a) A^T is a left inverse of A. Since A is square, A is invertible and its inverse is A^T. b) By a), A^T is also a right inverse: AA^T=I. Since A=(A^T)^T, this reads (A^T)^TA^T=I, which is orthogonality of A^T. Part c) follows from parts a) and b). d) Apply \det to the definition and use \det A^T=\det A to get (\det A)^{2}=1.
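Exercise 1 can be checked numerically on a small example. The sketch below assumes a 2\times 2 rotation matrix with an arbitrarily chosen angle; the helper names (transpose, matmul, is_identity, det2) are introduced here for illustration only.

```python
import math

def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*M)]

def matmul(M, N):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*N)] for row in M]

def is_identity(M, tol=1e-12):
    """Check that M equals the identity matrix up to rounding error."""
    n = len(M)
    return all(abs(M[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

t = 0.7  # arbitrary angle, chosen only for illustration
A = [[math.cos(t), -math.sin(t)],
     [math.sin(t),  math.cos(t)]]
At = transpose(A)

left_ok = is_identity(matmul(At, A))   # definition: A^T A = I
right_ok = is_identity(matmul(A, At))  # part b): A A^T = I, so A^T is orthogonal
det_abs = abs(det2(A))                 # part d): |det A| = 1
```

Both products come out as the identity up to rounding, and det_abs is 1 up to rounding.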

Exercise 2. An orthogonal matrix preserves scalar products, norms and angles.

Proof. For any vectors x,y scalar products are preserved: (Ax)\cdot(Ay)=(A^TAx)\cdot(y)=x\cdot y. Therefore vector lengths are preserved: \|Ax\| =\|x\|. Cosines of angles are preserved too, because \frac{(Ax)\cdot(Ay)}{\|Ax\|\|Ay\|}=\frac{x\cdot y}{\|x\|\| y\|}. Thus angles are preserved.
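A small numerical check of Exercise 2, again assuming a 2\times 2 rotation matrix as the orthogonal matrix. The vectors x, y and the angle are arbitrary choices, not taken from the text.

```python
import math

def dot(x, y):
    """Scalar product of two vectors."""
    return sum(a * b for a, b in zip(x, y))

def apply(M, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [dot(row, x) for row in M]

def norm(x):
    return math.sqrt(dot(x, x))

t = 1.1  # arbitrary angle
A = [[math.cos(t), -math.sin(t)],
     [math.sin(t),  math.cos(t)]]
x = [3.0, -1.0]
y = [0.5, 2.0]
Ax, Ay = apply(A, x), apply(A, y)

dot_gap = abs(dot(Ax, Ay) - dot(x, y))   # scalar products agree
norm_gap = abs(norm(Ax) - norm(x))       # lengths agree
cos_before = dot(x, y) / (norm(x) * norm(y))
cos_after = dot(Ax, Ay) / (norm(Ax) * norm(Ay))
cos_gap = abs(cos_after - cos_before)    # cosines of angles agree
```

All three gaps are zero up to floating-point rounding.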

Since the origin is unchanged under any linear mapping, A0=0, Exercise 2 gives the following geometric interpretation of an orthogonal matrix: it is a rotation around the origin (angles and vector lengths are preserved, while the origin stays in place). Strictly speaking, if \det A=1 we have a rotation, and if \det A=-1 we have a rotation combined with a reflection.
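The two determinant cases can be seen on integer matrices, where the arithmetic is exact. The matrices below (a rotation by 90 degrees and a reflection across the first axis) are illustrative choices, not taken from the text.

```python
def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(M, N):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*N)] for row in M]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

identity = [[1, 0], [0, 1]]
rotation = [[0, -1], [1, 0]]     # rotation by 90 degrees
reflection = [[1, 0], [0, -1]]   # reflection across the first coordinate axis
combined = matmul(rotation, reflection)  # rotation combined with reflection

# Each matrix M satisfies M^T M = I, so all three are orthogonal.
all_orthogonal = all(matmul(transpose(M), M) == identity
                     for M in (rotation, reflection, combined))

dets = [det2(rotation), det2(reflection), det2(combined)]  # [1, -1, -1]
```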

Another interpretation is suggested by the next exercise.

Exercise 3. If u_1,...,u_n is an orthonormal basis, then the matrix U=(u_1,...,u_n) is orthogonal. Conversely, rows or columns of an orthogonal matrix form an orthonormal basis.

Proof. Orthonormality means that u_i^Tu_j=1 if i=j and u_i^Tu_j=0, if i\neq j. These equations are equivalent to orthogonality of U:

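(1) U^TU=\left(\begin{array}{c}u_1^T \\... \\u_n^T\end{array}\right)\left(u_1,...,u_n\right)=I.

For the rows, note that U^T is orthogonal by Exercise 1b) and that the columns of U^T are the rows of U.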

Exercise 4. Let u_1,...,u_n and v_1,...,v_n be two orthonormal bases. Let A=V^{-1}U be the transition matrix from coordinates \xi in the basis u_1,...,u_n to coordinates \xi^\prime in the basis v_1,...,v_n. Then A is orthogonal.

Proof. By Exercise 3, both U and V are orthogonal. Hence, by Exercise 1 V^{-1} is orthogonal. It suffices to show that a product of two orthogonal matrices M,N is orthogonal: (MN)^TMN=N^TM^TMN=N^TN=I.
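Exercises 3 and 4 can be illustrated with exact rational arithmetic. The two orthonormal bases below are assumptions made for the example; V^{-1} is computed as V^T, which is justified by Exercise 1a).

```python
from fractions import Fraction

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(M, N):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*N)] for row in M]

def apply(M, x):
    return [sum(a * b for a, b in zip(row, x)) for row in M]

identity = [[1, 0], [0, 1]]

# Columns of U are u_1=(3/5,4/5), u_2=(-4/5,3/5); columns of V are v_1=(0,1), v_2=(-1,0).
U = [[Fraction(3, 5), Fraction(-4, 5)],
     [Fraction(4, 5), Fraction(3, 5)]]
V = [[Fraction(0), Fraction(-1)],
     [Fraction(1), Fraction(0)]]

U_orthogonal = matmul(transpose(U), U) == identity   # Exercise 3
V_orthogonal = matmul(transpose(V), V) == identity

A = matmul(transpose(V), U)                           # A = V^{-1} U with V^{-1} = V^T
A_orthogonal = matmul(transpose(A), A) == identity    # Exercise 4

# The same vector x has coordinates xi in the basis u and A*xi in the basis v.
xi = [Fraction(5), Fraction(10)]
x_from_u = apply(U, xi)
x_from_v = apply(V, apply(A, xi))
```

Here x_from_u and x_from_v coincide, which is what it means for A to translate coordinates from one basis to the other.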

Aug 18

Basis and dimension

Definition 1. We say that vectors x^{(1)},...,x^{(m)} form a basis in a subspace L if 1) L is spanned by x^{(1)},...,x^{(m)} and 2) these vectors are linearly independent. The number of vectors in the basis is called the dimension of L and is denoted \dim L.

An orthonormal basis is a special type of basis: in addition to conditions 1)-2) above, the basis vectors are orthonormal. For the definition of dimension to be correct, the number of vectors in any basis should be the same. We prove correctness in a separate post.

Exercise 1. In R^n the unit vectors are linearly independent. Prove this fact 1) directly and 2) using the properties of an orthonormal system.

Direct proof. Any x\in R^n can be represented as

(1) x=x_1e_1+...+x_ne_n.

If the right side is zero, then x=0 and all x_i are zero.

Proof using orthonormality. If the right side in (1) is zero, then x_i=x\cdot e_i=0 for all i.

Exercise 2. \dim R^{n}=n.

Proof. (1) shows that R^n is spanned by e_1,...,e_n. Besides, they are linearly independent by Exercise 1.

Definition 2. Let L_1,\ L_2 be two subspaces such that any element of one is orthogonal to any element of the other. Then the set \{x_1+x_2:x_1\in L_1,\ x_2\in L_2\} is called an orthogonal sum of L_1,\ L_2 and denoted L=L_1\oplus L_2.

Exercise 3. If a vector x belongs to both terms in the orthogonal sum of two subspaces L=L_1\oplus L_2, then it is zero. This means that L_1\cap L_2=\{0\}.

Proof. This is because any element of L_1 is orthogonal to any element of L_2, so x is orthogonal to itself, 0=x\cdot x=\|x\|^2 and x=0.

Exercise 4 (dimension additivity). Let L=L_1\oplus L_2 be an orthogonal sum of two subspaces. Then \dim L=\dim L_1+\dim L_2.

Proof. Let l_i=\dim L_i, i=1,2. By definition, L_1 is spanned by some linearly independent vectors y^{(1)},...,y^{(l_{1})} and L_2 is spanned by some linearly independent vectors z^{(1)},...,z^{(l_{2})}. Any x\in L can be decomposed as x=y+z, y\in L_1, z\in L_2. Since y,z can be further decomposed as y=\sum_{i=1}^{l_1}a_iy^{(i)}, z=\sum_{i=1}^{l_2}b_iz^{(i)}, the system y^{(1)},...,y^{(l_1)}, z^{(1)},...,z^{(l_2)} spans L.

Moreover, this system is linearly independent. If

\sum_{i=1}^{l_1}a_iy^{(i)}+\sum_{i=1}^{l_2}b_iz^{(i)}=0,

then

L_1\ni\sum_{i=1}^{l_1}a_iy^{(i)}=-\sum_{i=1}^{l_2}b_iz^{(i)}\in L_2.

By Exercise 3 then \sum_{i=1}^{l_1}a_iy^{(i)}=0, \sum_{i=1}^{l_2}b_iz^{(i)}=0. By linear independence of the vectors in the two systems all coefficients a_i,b_i must be zero.

The conclusion is that \dim L=l_1+l_2.
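Dimension additivity can be checked on a concrete orthogonal sum in R^3. The sketch below assumes that the dimension of a span can be computed as the rank of the spanning vectors, found by Gaussian elimination over exact fractions; the function rank and the two subspaces are illustrative choices.

```python
from fractions import Fraction

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def rank(vectors):
    """Rank of a list of vectors via Gaussian elimination with exact fractions."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    r = 0  # number of pivots found so far
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

L1_basis = [[1, 1, 0]]                 # spans L_1
L2_basis = [[1, -1, 0], [0, 0, 1]]     # spans L_2

# Every spanning vector of L_1 is orthogonal to every spanning vector of L_2.
subspaces_orthogonal = all(dot(y, z) == 0 for y in L1_basis for z in L2_basis)

dim_L1 = rank(L1_basis)
dim_L2 = rank(L2_basis)
dim_L = rank(L1_basis + L2_basis)      # the joint system spans L = L_1 + L_2
```

In this example dim_L equals dim_L1 + dim_L2, as Exercise 4 predicts.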

Jul 18

Is the inverse of a linear mapping linear?

Orthonormal basis

Exercise 1. I) Let e_j denote unit vectors. They have the properties

(1) e_i\cdot e_j=0 for all i\neq j, \left\Vert e_{i}\right\Vert =1 for all i.

II) For any x\in R^n we have the representation

(2) x=\sum_jx_je_j.

III) In (2) the coefficients can be found as x_{j}=x\cdot e_{j}:

(3) x=\sum_j(x\cdot e_j)e_j.

Proof. I) (1) is proved by direct calculation. II) To prove (2) we write

x=(x_1,...,x_n)=x_1(1,0,...,0)+...+x_n(0,...,0,1)=\sum_jx_je_j.

III) If we have (2), then it's easy to see that by (1) x\cdot e_i=\sum_jx_j(e_j\cdot e_i)=x_i.

Definition 1. Any system of vectors that satisfies (1) is called an orthonormal system. An orthonormal system is called complete if for any x\in R^n we have the decomposition (3). Exercise 1 shows that our system of unit vectors is complete orthonormal. A complete orthonormal system is called an orthonormal basis.
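Decomposition (3) can be tried on an orthonormal system other than the unit vectors. The system u1, u2 below is an assumption made for the example; exact fractions are used so that the equalities hold without rounding.

```python
from fractions import Fraction

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

u1 = [Fraction(3, 5), Fraction(4, 5)]
u2 = [Fraction(-4, 5), Fraction(3, 5)]
basis = [u1, u2]

# Property (1): distinct vectors are orthogonal, each vector has norm 1.
is_orthonormal = all(dot(basis[i], basis[j]) == (1 if i == j else 0)
                     for i in range(2) for j in range(2))

x = [Fraction(7), Fraction(-2)]          # arbitrary test vector
coeffs = [dot(x, u) for u in basis]      # coefficients x . u_j as in (3)
rebuilt = [sum(c * u[k] for c, u in zip(coeffs, basis)) for k in range(2)]
```

The vector rebuilt from the coefficients x\cdot u_j coincides with x, so (3) holds for this x.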

Analyzing a linear mapping

Exercise 2. Let A be a matrix of size n\times k. Suppose you don't know the elements of A but you know the products (Ax)\cdot y for all x,y. How can you reveal the elements of A from (Ax)\cdot y? How do you express Ax using the elements you define?

Solution. Let us partition A into rows A_1,...,A_n and suppose for the moment that the elements a_{ij} are known. Let us try unit vectors as x,y:

(4) (Ae_j)\cdot e_i=\left(\begin{array}{c}A_1e_j \\... \\A_ne_j\end{array}\right)\cdot e_i=A_ie_j=a_{ij}.

Using (2) and (4) one can check that (Ax)\cdot e_i=A_ix=\sum_ja_{ij}x_j. Hence, from (3) we have the answer to the second question:

(5) Ax=\sum_i[\left(Ax\right)\cdot e_i]e_i=\sum_i\left(\sum_ja_{ij}x_j\right)e_i=\sum_{i,j}x_ja_{ij}e_i.

The above calculation means that when a_{ij} are unknown, we can define them by a_{ij}=(Ae_j)\cdot e_i and then the action of A on x will be described by the last expression in (5).
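The recipe a_{ij}=(Ae_j)\cdot e_i can be sketched in code. The function make_black_box below is a hypothetical stand-in for "knowing (Ax)\cdot y without knowing A": it hides a matrix and exposes only the products.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def unit(j, size):
    """Unit vector e_j in R^size (indices start at 0)."""
    return [1 if i == j else 0 for i in range(size)]

def make_black_box(A):
    """Hide A and expose only the map (x, y) -> (Ax) . y."""
    def form(x, y):
        Ax = [dot(row, x) for row in A]
        return dot(Ax, y)
    return form

hidden = [[1, 2, 3],
          [4, 5, 6]]          # size n x k with n=2, k=3
form = make_black_box(hidden)
n, k = 2, 3

# Formula (4): a_ij = (A e_j) . e_i, with e_j in R^k and e_i in R^n.
recovered = [[form(unit(j, k), unit(i, n)) for j in range(k)] for i in range(n)]

# Formula (5): Ax has i-th coordinate sum_j a_ij x_j.
x = [1, -1, 2]
Ax = [sum(recovered[i][j] * x[j] for j in range(k)) for i in range(n)]
```

The recovered matrix equals the hidden one, using only n\cdot k evaluations of the product.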

We know that a mapping generated by a matrix is linear. The converse is also true: every linear mapping is given by a matrix.

Exercise 3. Suppose a mapping f:R^k\rightarrow R^n is linear: f(ax+by)=af(x)+bf(y) for any numbers a,b and vectors x,y. Then there exists a matrix A of size n\times k such that f(x)=Ax for all x.

Proof. Based on (4) in our case put a_{ij}=f(e_j)\cdot e_i. Applying (3) to f(x) we get

(6) f(x)=\sum_i[f(x)\cdot e_i]e_i

(the summation index j is replaced by i on purpose). Plugging (2) into (6) gives

f(x)=\sum_i\left[f\left(\sum_jx_je_j\right)\cdot e_i\right]e_i= (both f and scalar product are linear)

=\sum_i\left[\left(\sum_jx_jf\left(e_j\right)\right)\cdot e_i\right]e_i=\sum_{i,j}x_j(f\left(e_j\right)\cdot e_i)e_i=\sum_{i,j}x_ja_{ij}e_i=Ax.

The last equality holds by (5), where A is the matrix with the elements a_{ij} defined above.
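The construction in the proof of Exercise 3 can be followed step by step. The mapping f below is a hypothetical linear map from R^2 to R^3 written without any matrix; its matrix is then built from the values f(e_j).

```python
def unit(j, size):
    """Unit vector e_j in R^size (indices start at 0)."""
    return [1 if i == j else 0 for i in range(size)]

def f(x):
    """A linear mapping R^2 -> R^3 given by formulas, not by a matrix."""
    return [x[0] + 2 * x[1], 3 * x[0], x[1] - x[0]]

k, n = 2, 3

# a_ij = f(e_j) . e_i, which is the i-th coordinate of f(e_j).
images = [f(unit(j, k)) for j in range(k)]
A = [[images[j][i] for j in range(k)] for i in range(n)]   # size n x k

def apply(M, x):
    return [sum(a * b for a, b in zip(row, x)) for row in M]

x = [5, -7]                      # arbitrary test vector
same_value = apply(A, x) == f(x)
```

The columns of A are the images f(e_j), and Ax reproduces f(x) for the test vector.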

Exercise 4. The inverse of a linear mapping is linear (and is given by a matrix by Exercise 3).

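Proof. Let f(x)=Ax be a linear mapping and suppose its inverse f^{-1} in the general sense exists. Then f^{-1}(Ax)=x for all x. Let us put x=ay+bz for arbitrary numbers a,b and vectors y,z. Then we have f^{-1}(A(ay+bz))=ay+bz or, using linearity of A,

f^{-1}(aAy+bAz)=ay+bz.

Putting Ay=u, Az=v, so that y=f^{-1}(u) and z=f^{-1}(v), we get

f^{-1}(au+bv)=af^{-1}(u)+bf^{-1}(v).

Since f is invertible, every pair u,v in the domain of f^{-1} can be written in this way. Thus, f^{-1} is linear.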
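Exercise 4 can be checked on a concrete invertible mapping. In the sketch below the mapping f and the numbers a, b, u, v are illustrative assumptions; the inverse is obtained by solving the system f(x)=u by substitution rather than by writing down a matrix.

```python
def f(x):
    """f(x) = Ax with A = [[2, 1], [1, 1]]."""
    return [2 * x[0] + x[1], x[0] + x[1]]

def f_inverse(u):
    """Solve 2*x0 + x1 = u0, x0 + x1 = u1 by subtracting the equations."""
    x0 = u[0] - u[1]
    x1 = u[1] - x0
    return [x0, x1]

def combine(a, u, b, v):
    """Linear combination a*u + b*v of two vectors."""
    return [a * p + b * q for p, q in zip(u, v)]

x = [4, -3]
round_trip_ok = f_inverse(f(x)) == x          # f^{-1}(Ax) = x

a, b = 3, -2
u, v = [1, 4], [0, 5]
lhs = f_inverse(combine(a, u, b, v))           # f^{-1}(au + bv)
rhs = combine(a, f_inverse(u), b, f_inverse(v))  # a f^{-1}(u) + b f^{-1}(v)
```

For these values lhs and rhs agree, in line with the linearity of f^{-1}.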

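Remark. In all of the above it is important that e_j are unit vectors. For a basis that is not orthonormal, the coefficients in the decomposition of x are in general no longer x\cdot e_j, so (3) and the formulas based on it fail.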
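The Remark can be illustrated with a basis of R^2 that is not orthonormal. The basis b1, b2 and the vector x are assumptions made for the example; the true coefficients were found by solving the 2\times 2 system by hand.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def combine(coeffs, basis):
    """Linear combination sum_j coeffs[j] * basis[j]."""
    return [sum(c * b[k] for c, b in zip(coeffs, basis)) for k in range(len(basis[0]))]

b1 = [1, 0]
b2 = [1, 1]          # not orthogonal to b1 and not of unit length
basis = [b1, b2]
x = [2, 3]

true_coeffs = [-1, 3]                       # x = -1*b1 + 3*b2
dot_coeffs = [dot(x, b) for b in basis]     # what formula (3) would use

from_true = combine(true_coeffs, basis)     # reproduces x
from_dots = combine(dot_coeffs, basis)      # does not reproduce x
```

Here the scalar products x\cdot b_j are 2 and 5, while the actual coefficients are -1 and 3, so the sum built from scalar products gives (7,5) instead of x.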