## Variance of a vector: motivation and visualization

I always show my students the definition of the variance of a vector, and they usually don't pay attention. You need it already at the level of simple regression (to understand the derivation of the variance of the slope estimator), and even more so when you deal with time series. Since I know exactly where students usually stumble, this post is structured as a series of questions and answers.

## Think about ideas: how would you define variance of a vector?

**Question 1**. We know that for a random variable $X$, its variance is defined by

$$V(X) = E(X - EX)^2. \tag{1}$$

Now let

$$X = \begin{pmatrix} X_1 \\ \vdots \\ X_n \end{pmatrix}$$

be a vector with $n$ components, each of which is a random variable. How would you define its variance?

The answer is not straightforward because we don't know how to square a vector. Let $X^T = (X_1, \dots, X_n)$ denote the **transposed vector**. There are two ways to multiply a vector by itself: $X^TX$ and $XX^T$.

**Question 2**. Find the dimensions of $X^TX$ and $XX^T$ and their expressions in terms of the coordinates of $X$.

**Answer 2**. For a product of matrices there is a **compatibility rule** that I write in the form

$$A_{n\times m}B_{m\times k} = C_{n\times k}. \tag{2}$$

Recall that $n\times m$ in the notation $A_{n\times m}$ means that the matrix $A$ has $n$ rows and $m$ columns. For example, $X$ is of size $n\times 1$. Verbally, the above rule says that the number of columns of $A$ should be equal to the number of rows of $B$. In the product that common number ($m$) disappears and the unique numbers ($n$ and $k$) give, respectively, the number of rows and columns of $C$. Isn't the formula

$$A_{n\times m}B_{m\times k} = C_{n\times k}$$

easier to remember than the verbal statement? From (2) we see that $X^TX = X^T_{1\times n}X_{n\times 1}$ is of dimension $1\times 1$ (it is a scalar) and $XX^T = X_{n\times 1}X^T_{1\times n}$ is an $n\times n$ matrix.
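The compatibility rule can be checked on a computer. The following is a minimal sketch, assuming NumPy and an arbitrary illustrative vector with $n = 4$ components (the numbers are not from the post); it only inspects shapes.

```python
import numpy as np

n = 4
# A column vector of size n x 1 (values chosen only for illustration).
X = np.arange(1.0, n + 1).reshape(n, 1)

inner = X.T @ X   # (1 x n)(n x 1): the common number n disappears -> 1 x 1
outer = X @ X.T   # (n x 1)(1 x n): the common number 1 disappears -> n x n

print(inner.shape)  # (1, 1)
print(outer.shape)  # (4, 4)

# An incompatible product: (n x 1)(n x 1) violates the rule when n != 1.
try:
    X @ X
    compatible = True
except ValueError as err:
    compatible = False
    print("incompatible shapes:", err)
```

NumPy refuses the last product for the same reason rule (2) does: the number of columns of the first factor differs from the number of rows of the second.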

For actual **multiplication of matrices** I use the visualization

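$$\begin{pmatrix} \cdots & \cdots & \cdots \\ a_{i1} & \cdots & a_{im} \\ \cdots & \cdots & \cdots \end{pmatrix} \begin{pmatrix} \vdots & b_{1j} & \vdots \\ \vdots & \vdots & \vdots \\ \vdots & b_{mj} & \vdots \end{pmatrix} = \begin{pmatrix} \cdots & \cdots & \cdots \\ \cdots & c_{ij} & \cdots \\ \cdots & \cdots & \cdots \end{pmatrix}, \qquad c_{ij} = a_{i1}b_{1j} + \dots + a_{im}b_{mj}. \tag{3}$$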

**Short formulation**. Multiply rows from the first matrix by columns from the second one.

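**Long formulation**. To find the element $c_{ij}$ of $C$, we find a **scalar product** of the $i$th row of $A$ and the $j$th column of $B$. To find all elements in the $i$th row of $C$, we fix the $i$th row in $A$ and move right along the columns in $B$. Alternatively, to find all elements in the $j$th column of $C$, we fix the $j$th column in $B$ and move down the rows in $A$. Using this rule, we have

$$X^TX = X_1^2 + \dots + X_n^2, \qquad XX^T = \begin{pmatrix} X_1^2 & X_1X_2 & \cdots & X_1X_n \\ X_2X_1 & X_2^2 & \cdots & X_2X_n \\ \vdots & \vdots & \ddots & \vdots \\ X_nX_1 & X_nX_2 & \cdots & X_n^2 \end{pmatrix}. \tag{4}$$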

Usually students have problems with the second equation.
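If the second product gives you trouble, it may help to run the rows-by-columns rule by hand. Below is a minimal sketch, assuming NumPy; the function name `matmul_rows_by_columns` and the vector with components 1, 2, 3 are hypothetical, chosen only for illustration.

```python
import numpy as np

def matmul_rows_by_columns(A, B):
    """Multiply A (n x m) by B (m x k) with the rows-by-columns rule."""
    n, m = A.shape
    m2, k = B.shape
    if m != m2:
        raise ValueError("number of columns of A must equal number of rows of B")
    C = np.zeros((n, k))
    for i in range(n):        # fix the i-th row of A
        for j in range(k):    # move right along the columns of B
            # scalar product of the i-th row of A and the j-th column of B
            C[i, j] = np.dot(A[i, :], B[:, j])
    return C

X = np.array([[1.0], [2.0], [3.0]])      # column vector, n = 3

inner = matmul_rows_by_columns(X.T, X)   # 1 x 1: sum of squares
outer = matmul_rows_by_columns(X, X.T)   # 3 x 3: all products X_i * X_j

print(inner)  # [[14.]]
print(outer)
```

Here `inner` holds the single number $1 + 4 + 9 = 14$, while `outer` has the squares 1, 4, 9 on the main diagonal and the cross products $X_iX_j$ elsewhere.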

Based on (1) and (4), we have two candidates to define variance:

$$V(X) = E(X - EX)^T(X - EX) \tag{5}$$

and

$$V(X) = E(X - EX)(X - EX)^T. \tag{6}$$

**Answer 1**. The second definition contains more information, in the sense to be explained below, so we define **variance of a vector** by (6).

**Question 3**. Find the elements of this matrix.

**Answer 3**. Variance of a vector has variances of its components on the main diagonal and covariances outside it:

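$$V(X) = \begin{pmatrix} V(X_1) & \operatorname{Cov}(X_1,X_2) & \cdots & \operatorname{Cov}(X_1,X_n) \\ \operatorname{Cov}(X_2,X_1) & V(X_2) & \cdots & \operatorname{Cov}(X_2,X_n) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{Cov}(X_n,X_1) & \operatorname{Cov}(X_n,X_2) & \cdots & V(X_n) \end{pmatrix}. \tag{7}$$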

If you can't get this on your own, go back to Answer 2.
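A simulation can make the structure of this matrix tangible. The sketch below, assuming NumPy, replaces the expectation by an average over many draws; the mixing matrix, the seed, and the number of draws are hypothetical choices made only to produce correlated components.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: n = 3 components, each column of `samples` is one draw of X.
n, draws = 3, 100_000
mixing = np.array([[1.0, 0.0, 0.0],
                   [0.5, 1.0, 0.0],
                   [0.0, -0.3, 2.0]])
samples = mixing @ rng.standard_normal((n, draws))

# Sample analogue of E(X - EX)(X - EX)^T: average the outer products over draws.
centered = samples - samples.mean(axis=1, keepdims=True)
V = centered @ centered.T / draws

variances = samples.var(axis=1)  # componentwise variances (divide by `draws`)

diag_ok = np.allclose(np.diag(V), variances)          # variances on the diagonal
symmetric = np.allclose(V, V.T)                       # Cov(X_i, X_j) = Cov(X_j, X_i)
matches_cov = np.allclose(V, np.cov(samples, bias=True))

print(diag_ok, symmetric, matches_cov)
```

The comparison with `np.cov(samples, bias=True)` is only a cross-check: with `bias=True` NumPy divides by the number of draws, which matches the averaging used here.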

There is a matrix operation called **trace** and denoted $\operatorname{tr}$. It is defined only for square matrices and gives the sum of the diagonal elements of a matrix.

**Exercise 1**. Show that $\operatorname{tr} V(X) = E(X - EX)^T(X - EX)$. In this sense definition (6) is more informative than (5).
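Before proving the claim of Exercise 1, you can check it numerically. This is a sanity check rather than a proof: a minimal sketch, assuming NumPy and hypothetical simulated data, with expectations replaced by averages over draws.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 3 components with standard deviations 1, 2, 3; columns are draws.
scales = np.array([[1.0], [2.0], [3.0]])
samples = scales * rng.standard_normal((3, 50_000))
draws = samples.shape[1]

centered = samples - samples.mean(axis=1, keepdims=True)   # X - EX

# Matrix candidate: average of (X - EX)(X - EX)^T over draws.
V = centered @ centered.T / draws

# Scalar candidate: average of (X - EX)^T (X - EX) over draws.
scalar_version = np.mean(np.sum(centered**2, axis=0))

trace_matches = np.isclose(np.trace(V), scalar_version)
print(trace_matches)
```

The scalar candidate is recovered from the matrix one by summing the diagonal, but the off-diagonal covariances cannot be recovered from the scalar.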

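**Exercise 2**. Show that if $EX_1 = \dots = EX_n = 0$, then (7) becomes

$$V(X) = \begin{pmatrix} EX_1^2 & EX_1X_2 & \cdots & EX_1X_n \\ EX_2X_1 & EX_2^2 & \cdots & EX_2X_n \\ \vdots & \vdots & \ddots & \vdots \\ EX_nX_1 & EX_nX_2 & \cdots & EX_n^2 \end{pmatrix}.$$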