7
May 18

## Variance of a vector: motivation and visualization

I always show my students the definition of the variance of a vector, and they usually don't pay attention. You need to know what it is already at the level of simple regression (to understand the derivation of the variance of the slope estimator), and even more so when you deal with time series. Since I know exactly where students usually stumble, this post is structured as a series of questions and answers.

## Think about ideas: how would you define variance of a vector?

Question 1. We know that for a random variable $X$, its variance is defined by

(1) $V(X)=E(X-EX)^{2}.$

Now let $X=\left(\begin{array}{c}X_{1} \\... \\X_{n}\end{array}\right)$

be a vector with $n$ components, each of which is a random variable. How would you define its variance?

The answer is not straightforward because we don't know how to square a vector. Let $X^T=(\begin{array}{ccc}X_1& ...&X_n\end{array})$ denote the transposed vector. There are two ways to multiply a vector by itself: $X^TX$ and $XX^T.$

Question 2. Find the dimensions of $X^TX$ and $XX^T$ and their expressions in terms of coordinates of $X.$

Answer 2. For a product of matrices there is a compatibility rule that I write in the form

(2) $A_{n\times m}B_{m\times k}=C_{n\times k}.$

Recall that $n\times m$ in the notation $A_{n\times m}$ means that the matrix $A$ has $n$ rows and $m$ columns. For example, $X$ is of size $n\times 1.$ Verbally, the above rule says that the number of columns of $A$ should be equal to the number of rows of $B.$ In the product that common number $m$ disappears and the unique numbers ($n$ and $k$) give, respectively, the number of rows and columns of $C.$ Isn't the formula easier to remember than the verbal statement? From (2) we see that $X_{1\times n}^TX_{n\times 1}$ is of dimension 1 (it is a scalar) and $X_{n\times 1}X_{1\times n}^T$ is an $n\times n$ matrix.
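If you prefer to check the compatibility rule (2) mechanically, here is a minimal Python sketch. It assumes a matrix is stored as a list of rows; the helper names `shape` and `product_shape` are mine, introduced only for illustration.

```python
def shape(M):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return len(M), len(M[0])


def product_shape(A, B):
    """Apply rule (2): (n x m)(m x k) gives (n x k); fail if the inner sizes differ."""
    n, m = shape(A)
    m2, k = shape(B)
    if m != m2:
        raise ValueError(f"incompatible sizes: {n}x{m} times {m2}x{k}")
    return n, k


X = [[1.0], [2.0], [3.0]]   # column vector, 3 x 1
XT = [[1.0, 2.0, 3.0]]      # its transpose, 1 x 3

print(product_shape(XT, X))  # (1, 1): a scalar
print(product_shape(X, XT))  # (3, 3): a square matrix
```

Calling `product_shape(X, X)` raises an error, because a $3\times 1$ matrix cannot be multiplied by a $3\times 1$ matrix.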

For actual multiplication of matrices I use the visualization

(3) $\left(\begin{array}{ccccc}&&&&\\&&&&\\a_{i1}&a_{i2}&...&a_{i,m-1}&a_{im}\\&&&&\\&&&&\end{array}\right) \left(\begin{array}{ccccc}&&b_{1j}&&\\&&b_{2j}&&\\&&...&&\\&&b_{m-1,j}&&\\&&b_{mj}&&\end{array}\right) =\left( \begin{array}{ccccc}&&&&\\&&&&\\&&c_{ij}&&\\&&&&\\&&&&\end{array}\right)$

Short formulation. Multiply rows from the first matrix by columns from the second one.

Long formulation. To find the element $c_{ij}$ of $C,$ we take the scalar product of the $i$th row of $A$ and the $j$th column of $B:$ $c_{ij}=a_{i1}b_{1j}+a_{i2}b_{2j}+...+a_{im}b_{mj}.$ To find all elements in the $i$th row of $C,$ we fix the $i$th row in $A$ and move right through the columns in $B.$ Alternatively, to find all elements in the $j$th column of $C,$ we fix the $j$th column in $B$ and move down through the rows in $A.$ Using this rule, we have
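The row-by-column rule can be spelled out as a short Python sketch. This is a plain triple-loop illustration under the assumption that matrices are lists of rows; the function name `matmul` and the example matrices are mine, and in practice one would normally use a numerical library instead.

```python
def matmul(A, B):
    """Multiply A (n x m) by B (m x k) using the row-by-column rule."""
    n, m = len(A), len(A[0])
    if len(B) != m:
        raise ValueError("columns of A must equal rows of B")
    k = len(B[0])
    C = [[0] * k for _ in range(n)]
    for i in range(n):          # fix the i-th row of A
        for j in range(k):      # move right through the columns of B
            # scalar product of row i of A and column j of B
            C[i][j] = sum(A[i][s] * B[s][j] for s in range(m))
    return C


A = [[1, 2, 3],
     [4, 5, 6]]      # 2 x 3
B = [[1, 0],
     [0, 1],
     [1, 1]]         # 3 x 2

C = matmul(A, B)
print(C)  # [[4, 5], [10, 11]]
```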

(4) $X^TX=X_1^2+...+X_n^2,$ $XX^T=\left(\begin{array}{ccc}X_1^2&...&X_1X_n \\...&...&... \\X_nX_1&...&X_n^2 \end{array}\right).$
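To see the two products in (4) with actual numbers, here is a small Python sketch for one fixed (non-random) vector. The values $1, 2, 3$ are an arbitrary choice of mine.

```python
x = [1, 2, 3]  # coordinates X_1, X_2, X_3 of a column vector X

# X^T X: sum of squared coordinates, a scalar
inner = sum(xi * xi for xi in x)

# X X^T: the element in row i, column j is X_i * X_j
outer = [[xi * xj for xj in x] for xi in x]

print(inner)  # 14
for row in outer:
    print(row)
# [1, 2, 3]
# [2, 4, 6]
# [3, 6, 9]
```

Note that the matrix is symmetric and that the squares $X_i^2$ sit on its main diagonal.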

Usually students have problems with the second equation.

Based on (1) and (4), we have two candidates to define variance:

(5) $V(X)=E(X-EX)^T(X-EX)$

and

(6) $V(X)=E(X-EX)(X-EX)^T.$

Answer 1. The second definition contains more information, in the sense to be explained below, so we define variance of a vector by (6).

Question 3. Find the elements of this matrix.

Answer 3. The variance of a vector has the variances of its components on the main diagonal and the covariances off the diagonal:

(7) $V(X)=\left(\begin{array}{cccc}V(X_1)&Cov(X_1,X_2)&...&Cov(X_1,X_n) \\Cov(X_2,X_1)&V(X_2)&...&Cov(X_2,X_n) \\...&...&...&... \\Cov(X_n,X_1)&Cov(X_n,X_2)&...&V(X_n) \end{array}\right).$

If you can't get this on your own, go back to Answer 2.
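A sample analogue of (6) and (7) may help. In the sketch below I replace the expectation by an average over $N$ observations (dividing by $N$, which is my assumption; some software divides by $N-1$ instead). The data and the function name `sample_variance_matrix` are made up for illustration.

```python
def sample_variance_matrix(observations):
    """Average of (x - mean)(x - mean)^T over the observations (divides by N)."""
    N = len(observations)
    n = len(observations[0])
    mean = [sum(obs[i] for obs in observations) / N for i in range(n)]
    V = [[0.0] * n for _ in range(n)]
    for obs in observations:
        d = [obs[i] - mean[i] for i in range(n)]   # deviation from the mean
        for i in range(n):
            for j in range(n):
                V[i][j] += d[i] * d[j] / N         # element (i, j) of d d^T
    return V


# Three observations of a vector with two components (made-up numbers)
data = [(1.0, 2.0), (3.0, 6.0), (5.0, 7.0)]
V = sample_variance_matrix(data)

for row in V:
    print(row)
```

The diagonal elements are the sample variances of the two components, and the two off-diagonal elements coincide because $Cov(X_1,X_2)=Cov(X_2,X_1).$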

There is a matrix operation called trace and denoted $tr$. It is defined only for square matrices and gives the sum of diagonal elements of a matrix.

Exercise 1. Show that $tr(V(X))=E(X-EX)^T(X-EX).$ In this sense definition (6) is more informative than (5).
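Before proving Exercise 1, you can convince yourself numerically. The sketch below is only a sanity check on made-up data, not a proof; as an assumption, the expectation is again replaced by an average over the observations.

```python
import math

# Three observations of a vector with two components (made-up numbers)
data = [(1.0, 2.0), (3.0, 6.0), (5.0, 7.0)]
N, n = len(data), len(data[0])

mean = [sum(obs[i] for obs in data) / N for i in range(n)]
devs = [[obs[i] - mean[i] for i in range(n)] for obs in data]

# Sample analogue of (5): average of the scalar (X-EX)^T (X-EX)
scalar_version = sum(sum(di * di for di in d) for d in devs) / N

# Sample analogue of (6): average of the matrix (X-EX)(X-EX)^T, then its trace
V = [[sum(d[i] * d[j] for d in devs) / N for j in range(n)] for i in range(n)]
trace_version = sum(V[i][i] for i in range(n))

print(math.isclose(scalar_version, trace_version))  # True
```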

Exercise 2. Show that if $EX_1=...=EX_n=0$, then (7) becomes $V(X)=\left(\begin{array}{cccc}EX^2_1&EX_1X_2&...&EX_1X_n \\EX_2X_1&EX^2_2&...&EX_2X_n \\...&...&...&... \\EX_nX_1&EX_nX_2&...&EX^2_n \end{array}\right).$