Proof. The proof is similar to the derivation of the Leibniz formula. Using the notation from that derivation, we decompose the rows of $A^T$ into linear combinations of unit row-vectors: the $i$th row of $A^T$ is $\sum_j a_{ji}e_j$. Hence $\det A^T=\sum a_{j_1 1}a_{j_2 2}\cdots a_{j_n n}\det P_{j_1,\dots,j_n}$. Each product here picks one factor from every row and every column of $A$, so by the different-rows-different-columns rule the same products appear in the Leibniz formula for $\det A$, and by (3) they carry the same signs. Therefore $\det A^T=\det A$.

Apart from being interesting in its own right, Exercise 1 allows one to translate properties in terms of rows to properties in terms of columns, as in the next corollary.

Corollary 1. $\det A=0$ if two columns of $A$ are linearly dependent.

Indeed, the columns of $A$ are the rows of its transpose, so Exercise 1 and Property III yield the result.
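Both facts are easy to check numerically. A minimal sketch in Python with numpy (the matrix is a random made-up example, not one from the text):

```python
import numpy as np

# Random made-up matrix; not from the text.
rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))

# det(A) = det(A^T): statements about rows translate to columns.
print(np.isclose(np.linalg.det(A), np.linalg.det(A.T)))  # True

# Corollary 1: two linearly dependent columns force a zero determinant.
A[:, 2] = 2.0 * A[:, 0]
print(np.isclose(np.linalg.det(A), 0.0))  # True
```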

The axioms and properties of determinants established so far are asymmetric in that they are stated in terms of rows, while similar statements hold for columns. Among the most important properties is the fact that the determinant of $A$ is a multilinear function of its rows. Here we intend to establish multilinearity in columns.

Exercise 1. The determinant of $A$ is a multilinear function of its columns.

Proof. To prove linearity in each column, it suffices to prove additivity and homogeneity. Let $A$, $B$, $C$ be three matrices that have the same columns, except for the $j$th column. The relationship between the $j$th columns is specified by $C^{(j)}=A^{(j)}+B^{(j)}$, or in terms of elements $c_{ij}=a_{ij}+b_{ij}$ for all $i$. Alternatively, in Leibniz' formula

(1) $\det C=\sum\det P_{j_1,\dots,j_n}\,c_{1j_1}c_{2j_2}\cdots c_{nj_n}$

for all $i$ such that $j_i\neq j$ we have

(2) $c_{ij_i}=a_{ij_i}=b_{ij_i}.$

For a given set $(j_1,\dots,j_n)$ of distinct indices there exists only one $i$ such that $j_i=j$, and for that $i$ we have $c_{ij}=a_{ij}+b_{ij}$. Therefore by (2) we can continue (1) as

$\det C=\sum\det P_{j_1,\dots,j_n}\,c_{1j_1}\cdots(a_{ij}+b_{ij})\cdots c_{nj_n}=\det A+\det B.$

Homogeneity is proved similarly, using the equations $B^{(j)}=kA^{(j)}$, or in terms of elements $b_{ij}=ka_{ij}$ for all $i$.
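As a numerical sanity check of both parts of the proof (numpy; the random matrices are made-up examples):

```python
import numpy as np

# Made-up random matrices differing only in column j = 1 (0-based).
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
B = A.copy()
B[:, 1] = rng.standard_normal(3)   # B equals A except in column 1
C = A.copy()
C[:, 1] = A[:, 1] + B[:, 1]        # the j-th column of C is the sum

# Additivity in a single column: det C = det A + det B.
print(np.isclose(np.linalg.det(C), np.linalg.det(A) + np.linalg.det(B)))  # True

# Homogeneity in a single column: scaling column j by k scales det by k.
Ak = A.copy()
Ak[:, 1] *= 5.0
print(np.isclose(np.linalg.det(Ak), 5.0 * np.linalg.det(A)))  # True
```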

If we had proved the multiplication rule for determinants, we would have $\det P\,\det P^T=\det(PP^T)=\det I=1$, and since each factor equals $\pm1$,

(3) $\det P^T=\det P.$

In the theory of permutations (3) is proved without relying on the multiplication rule. I am going to use (3) as a shortcut that explains the idea. Combining (1) and (3), we obtain by the Leibniz formula $\det A^T=\det A$.

This is one of those cases when calculations explain the result. Let $e_j$ denote the $j$th unit row-vector. Row $A_i$, obviously, can be decomposed as $A_i=\sum_{j=1}^n a_{ij}e_j$. Recall that by Property V, the determinant of $A$ is a multilinear function of its rows. Using Property V $n$ times, we have

(1) $\det A=\sum_{j_1,\dots,j_n=1}^n a_{1j_1}a_{2j_2}\cdots a_{nj_n}\det\begin{pmatrix}e_{j_1}\\\vdots\\e_{j_n}\end{pmatrix}.$

Here $\det\begin{pmatrix}e_{j_1}\\\vdots\\e_{j_n}\end{pmatrix}=0$ if among the rows $e_{j_1},\dots,e_{j_n}$ there are equal vectors. The remaining matrices, with nonzero determinants, are permutation matrices, so

(2) $\det A=\sum a_{1j_1}a_{2j_2}\cdots a_{nj_n}\det P_{j_1,\dots,j_n},$

where the sum runs over all sets $(j_1,\dots,j_n)$ of distinct indices.

Different-rows-different-columns rule. Take a good look at what the condition that $j_1,\dots,j_n$ are distinct implies about the location of the factors of the product $a_{1j_1}a_{2j_2}\cdots a_{nj_n}$. The rows to which the factors belong are obviously different. The columns are also different, by the definition of a permutation matrix. Conversely, consider any combination of $n$ elements of $A$ such that no two of them belong to the same row or column. Rearrange the first indices in ascending order, from $1$ to $n$. This leads to a renumbering of the second indices, and the product becomes $a_{1j_1}a_{2j_2}\cdots a_{nj_n}$. Since the original second indices were all different, the new ones will be too. Hence $\det P_{j_1,\dots,j_n}\neq0$, and such a term must be present in (2).

Remark 1. (2) is the Leibniz formula. The formula on the right-hand side of (1) is the Levi-Civita formula. The difference is that the Levi-Civita formula contains many more zero terms, while in (2) a term can be zero only if some factor $a_{ij_i}=0$.

Remark 2. Many textbooks write in (2) signatures of permutations instead of $\det P_{j_1,\dots,j_n}$. Using $\det P_{j_1,\dots,j_n}$ is better because a) you save time by skipping the theory of permutations and b) you need a rule to calculate signatures of permutations, and $\det P_{j_1,\dots,j_n}$ is such a rule (see an example).

As we know, changing places of two rows changes the sign of $\det A$ by $-1$. More generally, permuting the rows of $A$ by a permutation matrix $P$ changes $\det A$ by the factor $\det P$: $\det(PA)=\det P\,\det A$. In the rigorous algebra course this fact is proved using the theory of permutations, without employing the multiplication rule for determinants.

I am going to call it a shortcut for permutations and use it without a proof. In general, I prefer to use such shortcuts, to see what is going on and bypass tedious proofs.
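The Leibniz formula and the role of $\det P_{j_1,\dots,j_n}$ as the sign of a term can be sketched in Python. This is a minimal illustration, not an efficient algorithm; the helper names `perm_sign` and `det_leibniz` and the example matrix are made up:

```python
from itertools import permutations

def perm_sign(p):
    """Sign of a permutation, computed by counting inversions.
    Equals the determinant of the corresponding permutation matrix."""
    inversions = sum(1 for i in range(len(p)) for k in range(i + 1, len(p))
                     if p[i] > p[k])
    return -1 if inversions % 2 else 1

def det_leibniz(a):
    """Determinant via the Leibniz formula: sum over permutations
    (j_1, ..., j_n) of det(P_{j_1...j_n}) * a[0][j_1] * ... * a[n-1][j_n]."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = perm_sign(p)
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total

example = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
print(det_leibniz(example))  # -3
```

Note that each of the $n!$ terms picks exactly one factor from every row and every column, which is the different-rows-different-columns rule in code.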

Other properties of permutation matrices

Exercise 1. Prove that Definition 1 is equivalent to the following: a permutation matrix is defined by two conditions: a) all its columns are unit column-vectors and b) no two columns are equal.

Proof. Take the $j$th column. It contains one unity (the one that comes from the row equal to the $j$th unit row-vector $e_j$). It cannot contain more than one unity because all rows are different. Hence, the $j$th column is a unit column-vector. Different columns are different unit vectors, because otherwise some row would contain at least two unities and would not be a unit row-vector.

Exercise 2. Prove that a permutation matrix is an orthogonal matrix.

Proof. By Exercise 1 we can write a permutation matrix as a matrix of unit column-vectors: $P=(u_1,\dots,u_n)$, where the columns $u_1,\dots,u_n$ are all different.

Then

$P^TP=\begin{pmatrix}u_1^T\\\vdots\\u_n^T\end{pmatrix}(u_1,\dots,u_n)=\left(u_i^Tu_j\right)_{i,j=1}^n=I,$

since $u_i^Tu_j=1$ for $i=j$ and $u_i^Tu_j=0$ for $i\neq j$, which proves orthogonality. It follows that $(\det P)^2=\det P^T\det P=\det(P^TP)=\det I=1$, so $\det P=\pm1$ (be careful with this equation: it follows from multiplicativity of determinants, which we have not derived from our axioms).
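Exercise 2 is easy to illustrate numerically (numpy; the particular $3\times3$ permutation matrix is a made-up example):

```python
import numpy as np

# Made-up example: permutation matrix with rows e_3, e_1, e_2.
P = np.array([[0, 0, 1],
              [1, 0, 0],
              [0, 1, 0]])

# P times its transpose is the identity, so P is orthogonal.
print(np.array_equal(P @ P.T, np.eye(3, dtype=int)))  # True

# Consequently det(P) is +1 or -1; here it is +1 (an even permutation).
print(round(np.linalg.det(P)))  # 1
```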

Let $e_j$ denote the $j$th unit row-vector (the $j$th component is 1 and the others are zeros).

Definition 1. A permutation matrix is defined by two conditions: a) all of its rows are unit row-vectors and b) no two rows are equal. We use the notation

$P_{j_1,\dots,j_n}=\begin{pmatrix}e_{j_1}\\\vdots\\e_{j_n}\end{pmatrix},$

where all rows $e_{j_1},\dots,e_{j_n}$ are different.

Example. In the following matrix we change places of rows in such a way as to obtain the identity in the end. Each transformation changes the sign of the determinant by Property VI:

$P_{2,3,4,1}=\begin{pmatrix}e_2\\e_3\\e_4\\e_1\end{pmatrix}\to\begin{pmatrix}e_1\\e_3\\e_4\\e_2\end{pmatrix}\to\begin{pmatrix}e_1\\e_2\\e_4\\e_3\end{pmatrix}\to\begin{pmatrix}e_1\\e_2\\e_3\\e_4\end{pmatrix}=I.$

Since there are three changes and the final matrix is the identity, by Axiom 3 $\det P_{2,3,4,1}=(-1)^3\det I=-1$.

It is possible to arrive at the identity matrix in more than one way, but the result will be the same because the number $\det P$ is fixed. We will not delve into the theory of permutations.
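The swap-counting procedure generalizes into a small algorithm. A Python sketch (the function name and the $4\times4$ example are made up):

```python
import numpy as np

def det_by_row_swaps(P):
    """Determinant of a permutation matrix: swap rows until the identity
    is reached, flipping the sign once per swap (Property VI + Axiom 3)."""
    P = P.copy()
    n = len(P)
    sign = 1
    for col in range(n):
        # Find the row at or below the diagonal carrying the 1 in this column.
        row = next(r for r in range(col, n) if P[r, col] == 1)
        if row != col:
            P[[col, row]] = P[[row, col]]  # interchange two rows
            sign = -sign
    return sign

# Made-up example: rows e_2, e_3, e_4, e_1 of the 4x4 identity.
P = np.eye(4, dtype=int)[[1, 2, 3, 0]]
print(det_by_row_swaps(P))  # -1 (three swaps are needed)
```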

Exercise 1. Let $x^{(1)},\dots,x^{(n-1)}$ be a linearly independent system of vectors in $\mathbb{R}^n$. Then it can be completed with a vector $x^{(n)}$ to form a basis in $\mathbb{R}^n$.

Proof. One way to obtain $x^{(n)}$ is this. Let $P$ be the orthogonal projector onto the span of $x^{(1)},\dots,x^{(n-1)}$ and let $Q=I-P$. Take as $x^{(n)}$ any nonzero vector from the image of $Q$. It is orthogonal to any element of the image of $P$ and, in particular, to $x^{(1)},\dots,x^{(n-1)}$. Therefore $x^{(1)},\dots,x^{(n-1)}$ completed with $x^{(n)}$ gives a linearly independent system. It is a basis because for any $x\in\mathbb{R}^n$ we can write $x=Px+Qx$, where $Px$ is a linear combination of $x^{(1)},\dots,x^{(n-1)}$ and $Qx$, lying in the one-dimensional image of $Q$, is a multiple of $x^{(n)}$.
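The construction in the proof translates directly into numpy. A sketch, assuming the standard formula $P=A(A^TA)^{-1}A^T$ for the orthogonal projector onto the column span; the two starting vectors are a made-up example:

```python
import numpy as np

# Made-up data: two linearly independent vectors in R^3, as columns of A.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]]).T

# Orthogonal projector onto span(A), and the complementary projector Q.
P = A @ np.linalg.inv(A.T @ A) @ A.T
Q = np.eye(3) - P

# Any nonzero column of Q completes the system to a basis.
x = Q[:, 0]
basis = np.column_stack([A, x])
print(abs(np.linalg.det(basis)) > 1e-10)  # True: the three vectors form a basis
print(np.allclose(A.T @ x, 0))            # True: x is orthogonal to the span
```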

Property IV. Additivity. Suppose the $i$th row of $A$ is a sum of two row-vectors: $A_i=b+c$.

Denote

$B=\begin{pmatrix}A_1\\\vdots\\b\\\vdots\\A_n\end{pmatrix},\qquad C=\begin{pmatrix}A_1\\\vdots\\c\\\vdots\\A_n\end{pmatrix}$

(except for the $i$th row, all rows are the same for all three matrices). Then $\det A=\det B+\det C$.

Proof. Denote by $S$ the system of vectors $A_1,\dots,A_{i-1},A_{i+1},\dots,A_n$ (all rows of $A$ except the $i$th).

Case 1. If $S$ is linearly dependent, then the system of all rows of each of the matrices $A$, $B$, $C$ is also linearly dependent. By Property III the determinants of all three matrices are zero and the statement is true.

Case 2. Let $S$ be linearly independent. Then by Exercise 1 it can be completed with a vector $x$ to form a basis in $\mathbb{R}^n$. The rows $b$ and $c$ can be represented as linear combinations of elements of this basis. We are interested only in the coefficients of $x$ in those representations. So let $b=\beta x+b'$ and $c=\gamma x+c'$, where $\beta$ and $\gamma$ are numbers and $b'$ and $c'$ are linear combinations of elements of $S$. Hence, $A_i=b+c=(\beta+\gamma)x+(b'+c')$.

We can use Property II to eliminate $b'+c'$, $b'$ and $c'$ from the $i$th rows of $A$, $B$ and $C$, respectively, without changing the determinants of those matrices. Let $D$ denote the matrix obtained by replacing the $i$th row of $A$ with $x$. Then by Property II and Axiom 1

$\det A=(\beta+\gamma)\det D=\beta\det D+\gamma\det D=\det B+\det C,$

which proves the statement.

Combining homogeneity and additivity, we get the following important property that some people use as a definition:

Property V. Multilinearity. The determinant of $A$ is a multilinear function of its rows, that is, for each $i$ it is linear in row $A_i$ when the other rows are fixed.

Property VI. Antisymmetry. If the matrix $B$ is obtained from $A$ by changing places of two rows, then $\det B=-\det A$.

Proof. Let

$A=\begin{pmatrix}\vdots\\X\\\vdots\\Y\\\vdots\end{pmatrix},\qquad B=\begin{pmatrix}\vdots\\Y\\\vdots\\X\\\vdots\end{pmatrix}$

(all other rows of these matrices are the same). Consider the next sequence of transformations of the two distinguished rows:

$\begin{pmatrix}X\\Y\end{pmatrix}\to\begin{pmatrix}X+Y\\Y\end{pmatrix}\to\begin{pmatrix}X+Y\\Y-(X+Y)\end{pmatrix}=\begin{pmatrix}X+Y\\-X\end{pmatrix}\to\begin{pmatrix}(X+Y)-X\\-X\end{pmatrix}=\begin{pmatrix}Y\\-X\end{pmatrix}.$

By Property II, each of these transformations preserves $\det A$. Recalling homogeneity (Axiom 1), we finish the proof: $\det A=-\det B$.
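A quick numerical check of antisymmetry (numpy; a made-up random matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = A[[0, 2, 1, 3]]   # A with rows 2 and 3 interchanged

# Interchanging two rows flips the sign of the determinant.
print(np.isclose(np.linalg.det(B), -np.linalg.det(A)))  # True
```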

Axiom 1. Homogeneity. If $A'$ denotes the matrix obtained from $A$ by multiplying one of its rows by a number $c$, then $\det A'=c\det A$.

This implies that for $c\neq0$ the matrices $A$ and $A'$ are invertible simultaneously.

Axiom 2. If $A'$ denotes the matrix obtained from $A$ by adding one of the rows of $A$ to another, then $\det A'=\det A$.

We remember that adding one of the rows of $A$ to another corresponds to adding one equation of the system to another, and so this operation should not affect the solvability of the system.

Axiom 3. $\det I=1$.

Adding this axiom on top of the previous two makes the determinant a unique function of a matrix.

Property II. If we add to one of the rows of $A$ another row multiplied by some number $c$, the determinant does not change.

Proof. If $c=0$, there is nothing to prove. Suppose $c\neq0$ and we want to add $cA_j$ to $A_i$. This result can be achieved in three steps.

a) Multiply row $A_j$ by $c$. By Axiom 1, $\det A$ gets multiplied by $c$.

b) In the new matrix, add row $j$ to row $i$. By Axiom 2, this does not change the determinant.

c) In the resulting matrix, divide the row numbered $j$ by $c$. The determinant gets divided by $c$.

The determinant of the very first matrix will be the same as that of the very last matrix, while the $i$th row of the last matrix will be $A_i+cA_j$.

Property III. If the rows of $A$ are linearly dependent, then $\det A=0$.

Proof. Suppose the rows of $A$ are linearly dependent. Then one of the rows can be expressed as a linear combination of the others. Suppose, for instance, that $A_1=\sum_{i=2}^n c_iA_i$. Multiply the $i$th row by $-c_i$ and add the result to the first row, for $i=2,\dots,n$. Thereby we make the first row equal to zero while keeping the determinant the same by Property II. Then by Property I $\det A=0$.
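Property III can be checked on a concrete matrix (a made-up example whose third row is a linear combination of the first two):

```python
import numpy as np

# Third row is a linear combination of the first two: A_3 = 2 A_1 - A_2.
A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 4.0],
              [2.0, 3.0, 2.0]])

# Linearly dependent rows force a zero determinant.
print(np.isclose(np.linalg.det(A), 0.0))  # True
```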

Previously we looked at a motivating example that led us to consider the determinant of a $2\times2$ matrix, $\det\begin{pmatrix}a&b\\c&d\end{pmatrix}=ad-bc$.

Using this basic example, we now formulate properties that uniquely define determinants of matrices of higher orders. The discussion is based mainly on A. Kurosh, Course of Higher Algebra, 9th edition, Moscow, 1968 (in Russian).

Observation 1. Homogeneity. If one of the rows of $A$ is multiplied by a number $c$, then $\det A$ gets multiplied by $c$.

Observation 2. Adding one of the rows of $A$ to another doesn't change the value of the determinant.

To see the intuition behind these rules, recall that the purpose of the determinant is to verify whether the system $Ax=b$ has solutions. Homogeneity means that if one of the equations of the system is multiplied by a nonzero constant, the solvability of the new system will be equivalent to the solvability of the original system. Similarly, if one of the equations of the system is added to another, the solvability of the system, as judged by the determinant, will not change. This makes sense because multiplying an equation of a system by a nonzero constant or adding one equation to another does not change the information contained in the system.
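The two observations are easy to verify on a concrete system matrix (a made-up $2\times2$ example, using numpy's determinant):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [5.0, 3.0]])
d = np.linalg.det(A)   # 1.0 for this example

# Observation 1: multiplying a row by c multiplies the determinant by c.
Ak = A.copy()
Ak[0] *= 3.0
print(np.isclose(np.linalg.det(Ak), 3.0 * d))  # True

# Observation 2: adding one row to another leaves the determinant unchanged.
Aa = A.copy()
Aa[1] += Aa[0]
print(np.isclose(np.linalg.det(Aa), d))        # True
```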

Keep in mind an emerging general idea: the determinant discards any transformations of the matrix that are not relevant to its invertibility.

Observation 3. Determinant of the identity: $\det I=1$.

Taking these properties as axioms for matrices of higher order, we will show that they uniquely define determinants and develop a couple of rules involving determinants. One of them is the Leibniz formula for determinants and the other is Cramer's rule.
