19 Jul 18

Geometry of linear equations: matrix as a mapping


General idea. Let f(x) be a function with domain D(f). We would like to know for which y the equation f(x)=y has solutions and, if it does, how many (one or more). For the existence part, let the argument x run over D(f) and see which set the values f(x) fill. This set is the image \text{Img}(f)=\{ f(x):x\in D(f)\}. The definition directly implies

Basic observation 1. The equation f(x)=y has solutions if and only if y\in\text{Img}(f).

For the uniqueness part, fix y\in\text{Img}(f) and see for which arguments x\in D(f) the value f(x) equals that y. The set of such arguments is the counter-image (preimage) f^{-1}(y)=\{ x\in D(f):f(x)=y\} of y.

Basic observation 2. If f^{-1}(y) consists of a single point, the solution of f(x)=y is unique.

See how this works for the piecewise function f(x)=0 for x\leq 0, f(x)=x for x>0. This function is not linear. A function generated by a matrix is linear, and much more can be said beyond the above observations.
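A quick numerical check of this example (the function name f is from the text; the sample points are illustrative):

```python
# The running example: f(x) = 0 for x <= 0, f(x) = x for x > 0.
def f(x):
    return 0.0 if x <= 0 else x

# Every value f(x) is nonnegative, so Img(f) = [0, infinity):
# f(x) = y has solutions exactly when y >= 0 (Basic observation 1).
assert all(f(x) >= 0 for x in [-3.0, -1.0, 0.0, 0.5, 2.0])

# The counter-image of 0 contains every x <= 0: uniqueness fails at y = 0.
assert f(-2.0) == f(-1.0) == f(0.0) == 0.0

# For y > 0 the counter-image is the single point x = y: uniqueness holds.
assert f(2.0) == 2.0
```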

Matrix as a mapping

Definition 1. Let A be a matrix of size n\times k. It generates the mapping f:R^k\rightarrow R^n defined by f(x)=Ax (x is written as a k\times 1 column). Following common practice, we identify the mapping f with the matrix A.
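Definition 1 can be sketched in code as follows; the helper name matvec and the sample matrix are illustrative, with A stored as a list of rows:

```python
# The mapping x -> Ax from Definition 1, for an n x k matrix A.
def matvec(A, x):
    """Return Ax, where A is n x k (list of n rows) and x has k entries."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in A]

A = [[1, 2],
     [3, 4],
     [5, 6]]          # a 3 x 2 matrix: maps R^2 into R^3
x = [1, -1]           # a point of the domain R^2
print(matvec(A, x))   # the image of x in R^3: [-1, -1, -1]
```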

Exercise 1 (first characterization of the matrix image). Show that the image \text{Img}(A) consists of all linear combinations of the columns of A.

Solution. Partitioning A into columns, for any x we have

(1) Ax=(A^1,...,A^k)\left(\begin{array}{c}x_1 \\ ... \\ x_k\end{array}\right)=x_1A^1+...+x_kA^k.

This means that as x runs over D(A)=R^k, the images Ax are exactly the linear combinations of the column-vectors A^1,...,A^k.
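Equation (1) can be verified numerically; the helper names matvec and column_combination are illustrative:

```python
# Check of (1): Ax equals the combination x_1 A^1 + ... + x_k A^k of columns.
def matvec(A, x):
    """Row-by-row computation of Ax."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in A]

def column_combination(A, x):
    """Compute x_1 A^1 + ... + x_k A^k directly from the columns of A."""
    n, k = len(A), len(A[0])
    cols = [[A[i][j] for i in range(n)] for j in range(k)]  # A^1, ..., A^k
    result = [0] * n
    for j in range(k):
        for i in range(n):
            result[i] += x[j] * cols[j][i]
    return result

A = [[1, 2], [3, 4], [5, 6]]
x = [2, 3]
assert matvec(A, x) == column_combination(A, x)  # both give [8, 18, 28]
```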

Exercise 2. The mapping from Definition 1 is linear: for any vectors x,y and numbers a,b one has

(2) A(ax+by)=aAx+bAy.

Proof. By (1)

A(ax+by)=(ax+by)_1A^1+...+(ax+by)_kA^k

=(ax_1A^1+...+ax_kA^k)+(by_1A^1+...+by_kA^k)

=a(x_1A^1+...+x_kA^k)+b(y_1A^1+...+y_kA^k)=aAx+bAy.
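Property (2) is easy to confirm on sample data; the matrix and the numbers a, b below are illustrative:

```python
# Numerical verification of (2): A(ax + by) = aAx + bAy.
def matvec(A, x):
    return [sum(row[j] * x[j] for j in range(len(x))) for row in A]

A = [[1, 2], [3, 4]]
x, y = [1, 0], [2, -1]
a, b = 3, -2

lhs = matvec(A, [a * xi + b * yi for xi, yi in zip(x, y)])           # A(ax+by)
rhs = [a * u + b * v for u, v in zip(matvec(A, x), matvec(A, y))]    # aAx+bAy
assert lhs == rhs
```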

Remark. In (1) we silently used the multiplication rule for partitioned matrices. Here is the statement of the rule in a simple situation. Let A,B be two matrices compatible for multiplication. Let us partition them into smaller matrices

A=\left(\begin{array}{cc}A_{11}&A_{12}\\A_{21}&A_{22}\end{array}\right),  B=\left(\begin{array}{cc}B_{11}&B_{12}\\B_{21}&B_{22}\end{array}\right).

Then the product AB can be found as if those blocks were numbers:

AB=\left(\begin{array}{cc}A_{11}&A_{12}\\A_{21}&A_{22}\end{array}\right)\left(\begin{array}{cc}B_{11}&B_{12}\\B_{21}&B_{22}\end{array}\right)=\left(\begin{array}{cc}A_{11}B_{11}+A_{12}B_{21}&A_{11}B_{12}+A_{12}B_{22}\\A_{21}B_{11}+A_{22}B_{21}&A_{21}B_{12}+A_{22}B_{22}\end{array}\right).

The only requirement for this to hold is that the blocks A_{ij}, B_{ij} be compatible for multiplication. I will not bore you with the proof. In (1) the multiplication is performed as if A^1,...,A^k were numbers.
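The block rule can be checked directly on random 4 x 4 matrices split into four 2 x 2 blocks; all helper names below are illustrative:

```python
import random

def matmul(A, B):
    """Ordinary matrix product (matrices stored as lists of rows)."""
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][t] * B[t][j] for t in range(m)) for j in range(p)]
            for i in range(n)]

def madd(X, Y):
    """Entrywise sum of two matrices of the same size."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def block(M, i0, j0, h, w):
    """Extract the h x w block of M with top-left corner (i0, j0)."""
    return [row[j0:j0 + w] for row in M[i0:i0 + h]]

# Two random integer 4 x 4 matrices, each split into four 2 x 2 blocks.
random.seed(0)
A = [[random.randint(-5, 5) for _ in range(4)] for _ in range(4)]
B = [[random.randint(-5, 5) for _ in range(4)] for _ in range(4)]
A11, A12 = block(A, 0, 0, 2, 2), block(A, 0, 2, 2, 2)
A21, A22 = block(A, 2, 0, 2, 2), block(A, 2, 2, 2, 2)
B11, B12 = block(B, 0, 0, 2, 2), block(B, 0, 2, 2, 2)
B21, B22 = block(B, 2, 0, 2, 2), block(B, 2, 2, 2, 2)

# Assemble the product block by block, as if the blocks were numbers.
top = [r1 + r2 for r1, r2 in zip(madd(matmul(A11, B11), matmul(A12, B21)),
                                 madd(matmul(A11, B12), matmul(A12, B22)))]
bottom = [r1 + r2 for r1, r2 in zip(madd(matmul(A21, B11), matmul(A22, B21)),
                                    madd(matmul(A21, B12), matmul(A22, B22)))]
assert top + bottom == matmul(A, B)
```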
