Jul 19

Correctness of the space dimension definition

Correctness of the space dimension definition

This proof has been taken from I. M. Gelfand, Lectures in Linear Algebra, 4th edition, 1970 (in Russian).

Lemma. Let f_1,...,f_k be some vectors. Suppose that vectors g_1,...,g_l are linearly independent and belong to the span \text{span}(f_{1},...,f_{k}). Then l\leq k.

Proof. We prove the lemma by induction. Let k=1. Then l=1 (linear dependence for an empty set of vectors is not defined) and the inequality l\leq k is trivial.

Suppose the statement holds for k-1 and let us prove it for k. Since g_{1},...,g_{l}\in \text{span}(f_{1},...,f_{k}), we have

(1) g_{1}=a_{11}f_{1}+...+a_{1k}f_{k}\\...\\  g_{l}=a_{l1}f_{1}+...+a_{lk}f_{k}.

If all of a_{1k},...,a_{lk} are zero, we have g_{1},...,g_{l}\in \text{span}(f_{1},...,f_{k-1}) and then by the induction assumption l\leq k-1 and, trivially, l\leq k.

Thus, we can assume that not all of a_{1k},...,a_{lk} are zero. Suppose a_{lk}\neq 0. Then we can solve the last equation in (1) for f_{k}:

(2) f_{k}=\frac{1}{a_{lk}}g_{l}-\frac{1}{a_{lk}}\sum_{j=1}^{k-1}a_{l,j}f_{j}.

Plugging this equation in the first equation in (1) we get

g_{1}=a_{11}f_{1}+...+a_{1,k-1}f_{k-1}+a_{1k}\left( \frac{1}{a_{lk}}g_{l}-\frac{1}{a_{lk}}\sum_{j=1}^{k-1}a_{l,j}f_{j}\right) .

Send g_{l} to the left side; the remaining expression on the right side is a linear combination of f_{1},...,f_{k-1}; exact expressions of the coefficients b_{1,j} of this linear combination don't matter. The result will be g_{1}-\frac{a_{1k}}{a_{lk}}g_{l}=\sum_{j=1}^{k-1}b_{1,j}f_{j}. After doing the same with the first l-1 equations of (1) we get the system

g_{1}-\frac{a_{1k}}{a_{lk}}g_{l}=\sum_{j=1}^{k-1}b_{1,j}f_{j},...,\ \  g_{l-1}-\frac{a_{l-1,k}}{a_{lk}}g_{l}=\sum_{j=1}^{k-1}b_{l-1,j}f_{j}.

This shows that the vectors

g_{1}^{\prime }=g_{1}-\frac{a_{1k}}{a_{lk}}g_{l},...,g_{l-1}^{\prime }=\  g_{l-1}-\frac{a_{l-1,k}}{a_{lk}}g_{l}

belong to \text{span}(f_{1},...,f_{k-1}). If they are linearly independent, we can use the induction assumption to conclude that l-1\leq k-1, which will prove l\leq k.

Suppose with some c_{i}


By the assumed linear independence of g_{1},...,g_{l} this implies c_{1}=...=c_{l-1}=0, so the system g_{1}^{\prime },...,g_{l-1}^{\prime } is linearly independent.

Theorem. The definition of the space dimension is correct.

Proof. We write \dim (L)=n if a) L contains n linearly independent vectors x_{1},...,x_{n} and b) L=\text{span}(x_{1},...,x_{n}). We need to prove that any system with properties a) and b) has the same number of vectors. Suppose y_{1},...,y_{m} is another such system. Since y_{1},...,y_{m} belong to \text{span}(x_{1},...,x_{n}), by the lemma m\leq n. Similarly, n\leq m. So n=m.

Feb 19

Determinants: questions for repetition

Determinants: questions for repetition

All formulas for calculating determinants are complex. There are two ways to study determinants: one is to start with the explicit expressions (the Leibniz formula or Laplace expansion) and the other is to start with simple functional properties (called Axioms 1-3 here), then develop more advanced ones (multilinearity) and, finally, derive explicit formulas. I prefer to see ideas and follow the second way.

  1. Formulate Axioms 1-3 and make sure that you understand the motivation: 1) multiplying one of the equations of the system ax+by=e, cx+dy=f by a nonzero number k does not impact solvability of the system, 2) similarly, adding one equation of the system to another does not affect solvability and 3) the system x=e, y=f is trivially solvable.

  2. Interpret in terms of system solvability properties I-III and prove them.

  3. Prove that the determinant is a multilinear antisymmetric function of rows.

  4. Define a permutation matrix and give an example of calculating its determinant using Axioms 1-3 and Properties I-VI.

  5. Show what pre-multiplication by a permutation matrix does to a matrix A.

  6. Prove that a permutation matrix is an orthogonal matrix.

  7. Derive the Leibniz formula.  At this point, the way multilinearity (in rows) and permutation matrices are used should be absolutely obvious. If they are not, start from Question 1.

  8. Explain the different-rows-different-columns (cross-out) rule.

  9. Prove multiplicativity of determinants.

  10. Prove multilinearity in columns.

  11. Prove that transposition does not change determinants.

  12. Let A^{(j)} be the j-th column of A and define the row-vector L_j by \det A=L_jA^{(j)}. Why is this definition correct? Prove that L_jA^{(k)}=0 for any k\neq j.

  13. Using Question 12, derive Cramer's rule for solving the system Ax=y.

  14. Using Question 12, prove the invertibility criterion.

  15. Derive the Laplace expansion.

  16. Using Questions 14 and 15, derive the explicit formula for A^{-1}.

Feb 19

Laplace expansion

Laplace expansion

The Laplace expansion is a formula for calculating determinants that is calculationally more efficient than the Leibniz formula but less efficient than the decomposition into triangular matrices. See if you like the proof below or the one that heavily relies on permutations.

In fact, we have already dealt with the Laplace expansion when we said that \det A is a linear function of column A^{(j)}:

(1) \det A=\sum_{i=1}^na_{ij}l_{ji}.

Our task is to analyze the coefficients l_{ji}. They will be identified as determinants of certain submatrices of A, up to the sign.

Special case

For simplicity, let us start with j=1:

(2) \det A=a_{11}l_{11}+...+a_{n1}l_{1n}.

Definition. For any a_{ij}, denote A_{ij} the matrix resulting from A after deleting row i and column j. A_{ij} is of size (n-1)\times (n-1). Its determinant \det A_{ij}=M_{ij} is called (i,j)-minor.

Step 1. We want to show that l_{11}=M_{11}. By the cross-out rule l_{11} depends only on the elements of A_{11}. When studying l_{11}, we can assume without loss of generality that

a_{11}=1, a_{21}=...=a_{n1}=0.

Then (2) becomes \det A=l_{11}. This allows us to assert that l_{11} satisfies Axioms 1-3:

1) l_{11} is a homogeneous function of any row of A_{11} because \det A has this property,

2) Adding one of the rows of A_{11} to another row does not change l_{11} because \det A stays the same, and

3) If A_{11}=I_{(n-1)\times (n-1)}, then A=I_{n\times n} and l_{11}=\det I=1.

Since, again, l_{11} is a function of elements of A_{11} only, it follows that

(3) l_{11}=\det A_{11}=M_{11}.

Step 2. To analyze l_{12}, as above, we can assume that

a_{21}=1, a_{11}=a_{31}=...=a_{n1}=0.

(2) becomes \det A=l_{12}. Here A=(u_2,A^{(2)},...,A^{(n)}) where u_2 is the second unit column-vector and A^{(2)},...,A^{(n)} are columns of A. Let \tilde{A} be the result of permuting the first and second rows of A. This permutation changes the sign of \det A and does not change l_{12} (the matrix A_{12} for A is the same as A_{11} for \tilde{A}). Hence,

(4) \det \tilde{A}=-\det A=-l_{12}.

For \tilde{A}, l_{11} is the same as l_{11} for A in Step 1, so (3) and (4) imply l_{12}=-M_{12}.

Step i. We can assume that

a_{i1}=1, a_{11}=...=a_{i-1,1}=a_{i+1,1}=a_{n1}=0.

(2) becomes \det A=l_{1i}, where A=(u_i,A^{(2)},...,A^{(n)}) and u_{i} is the i-th unit column-vector. As in Step 2, we can reduce this case to Step 1. Permute rows i and i-1, then permute rows i-1 and i-2 and so on. In total we need i-1 permutations leading to i-1 changes in sign. Instead of (4) we get

\det \tilde{A}=(-1)^{i-1}\det A=(-1)^{i-1}l_{1i},

where \tilde{A} is the result of i-1 permutations. The element in the upper left corner of \tilde{A} is 1 and therefore l_{1i} is the same as l_{11} for A in Step 1. The conclusion is that

(5) l_{1i}=(-1)^{i-1}M_{1i}.

General case

To consider (1), we reduce the case of the j-th column to the case of the first column. For this, we permute columns j and j-1, then columns j-1 and j-2 and so on. In total we need j-1 permutations. Denoting the new matrix \tilde{A}, we have \det \tilde{A}=(-1)^{j-1}\det A. To \det \tilde{A} we can apply what we know about (2). This will lead to multiplying (5) by (-1)^{j-1}. The result is


Thus we have derived the Laplace expansion:

Theorem. For j=1,...,n one has an expansion by column

\det A=\sum_{i=1}^na_{ij}(-1)^{i+j}M_{ji}.

The meaning of this expansion is that the calculation of the determinant of A is reduced to the calculation of the determinants of matrices of lower dimension. Instead of expansions by column, one can use expansions by row, whichever is convenient. The expression (-1)^{i+j}M_{ji} is called a cofactor of the element a_{ij}.

Feb 19

Cramer's rule and invertibility criterion

Cramer's rule and invertibility criterion

Consequences of multilinearity

For a fixed j, \det A is a linear function of column A^{(j)}. Such a linear function generates a row-vector L_{j} by way of a formula (see Exercise 3)

(1) \det A=L_jA^{(j)}.

Exercise 1. In addition to (1), we have

(2) L_jA^{(k)}=0 for any k\neq l.

Proof. Here and in the future it is useful to introduce the coordinate representation for L_j=(l_{j1},...,l_{jn}) and put

(3) L=\left(\begin{array}{c}L_1\\...\\L_n\end{array}\right).

Then we can write (1) as \det A=\sum_{i=1}^nl_{ji}a_{ij}. Here the element l_{ji} does not involve a_{ij} and therefore by the different-columns-different-rows rule it does not involve elements of the entire column A^{(j)}. Hence, the vector L_j does not involve elements of the column A^{(j)}.

Let A^{\prime } denote the matrix obtained from A by replacing column A^{(j)} with column A^{(k)}. The vector L_j for the matrix A^{\prime } is the same as for A because both vectors depend on the elements from columns other than the column numbered j. Since A^\prime contains linearly dependent (actually two identical) columns, \det A^\prime=0. Using in (1) A^\prime instead of A we get 0=\det A^\prime=L_jA^{(k)}, as required.

After reading the next two sections, come back and read this statement again to appreciate its power and originality.

Cramer's rule

Exercise 2. Suppose \det A\neq 0. For any y\in R^n denote B_j the matrix formed by replacing the j-th column of A by the column vector y. Then the solution of the system Ax=y exists and the components of x are given by

x_j=\frac{\det B_j}{\det A},\ j=1,...,n.

Proof. Premultiply Ax=y by L_{j}:

(4) L_jy=L_jAx=\left(L_jA^{(1)},...,L_jA^{(n)}\right)x=(0,...,\det A,...,0)x.

Here we applied (1) and (2) (the j-th component of the vector (0,...,\det  A,...,0) is \det A and all others are zeros). From (4) it follows that (\det  A)x_{j}=L_{j}y. On the other hand, from (1) we have \det B_j=L_jy (the vector L_j for B_j is the same as for A, see the proof of Exercise 1). The last two equations prove the statement.

Invertibility criterion

Exercise 3. A is invertible if and only if \det A\neq 0.

Proof. If A is invertible, then AA^{-1}=I. By multiplicativity of determinant and Axiom 3 this implies \det A\det (A^{-1})=1. Thus, \det  A\neq 0.

Conversely, suppose \det A\neq 0. (1), (2) and (3) imply

LA=\left(\begin{array}{c}L_1\\...\\L_n\end{array}\right) (A^{(1)},...,A^{(n)})=\left(\begin{array}{ccc}L_1A^{(1)}&...&L_1A^{(n)}\\...&...&...\\L_nA^{(1)}&...&L_nA^{(n)}\end{array}  \right)

=\left(\begin{array}{ccc}\det A&...&0\\...&...&...\\0&...&\det A\end{array}  \right)=\det A\times I.

This means that the matrix \frac{1}{\det A}L is the inverse of A. Recall that existence of the left inverse implies that of the right inverse, so A is invertible.

Feb 19

Determinant of a transpose

Determinant of a transpose

Exercise 1. \det (A^T)=\det A.

Proof. The proof is similar to the derivation of the Leibniz formula. Using the notation from that derivation, we decompose rows of A into linear combinations of unit row-vectors A_i=\sum_{j=1}^na_{ij}e_j. Hence A_i^T=\sum_{j=1}^na_{ij}e_j^T. Therefore by multilinearity in columns

(1) \det (A^{T})=\sum_{j_{1},...,j_{n}=1}^{n}a_{1j_{1}}...a_{nj_{n}}\det  \left( e_{j_{1}}^{T},...,e_{j_{n}}^{T}\right)

=\sum_{j_{1},...,j_{n}:\det  P_{j_{1},...,j_{n}}^{T}\neq 0}a_{1j_{1}}...a_{nj_{n}}\det  P_{j_{1},...,j_{n}}^{T}.

Now we want to relate \det P_{j_1,...,j_n}^T to \det P_{j_1,...,j_n}.

(2) \det P_{j_1,...,j_n}^T=\det (P_{j_1,...,j_n}^{-1}) (the transpose of an orthogonal matrix equals its inverse)

=\frac{1}{\det (P_{j_1,...,j_n})}          (the determinant of the inverse is the inverse of the determinant)

=\det (P_{j_1,...,j_n})      (because P_{j_1,...,j_n} is orthogonal and its determinant is either 1 or -1 ).

(1) and (2) prove the statement.

Apart from being interesting in its own right, Exercise 1 allows one to translate properties in terms of rows to properties in terms of columns, as in the next corollary.

Corollary 1. \det A=0 if two columns of A are linearly dependent.

Indeed, columns of A are rows of its transpose, so Exercise 1 and Property III yield the result.

Feb 19

Multilinearity in columns

Multilinearity in columns

The axioms and properties of determinants established so far are asymmetric in that they are stated in terms of rows, while similar statements hold for columns. Among the most important properties is the fact that the determinant of A is a multilinear function of its rows.  Here we intend to establish multilinearity in columns.

Exercise 1. The determinant of A is a multilinear function of its columns.

Proof. To prove linearity in each column, it suffices to prove additivity and homogeneity. Let A,B,C be three matrices that have the same columns, except for the lth column. The relationship between the lth columns is specified by A^{(l)}=B^{(l)}+C^{(l)} or in terms of elements a_{il}=b_{il}+c_{il} for all i=1,...,n. Alternatively, in Leibniz' formula

(1) \det A=\sum_{j_{1},...,j_{n}:\det P_{j_{1},...,j_{n}}\neq  0}a_{1j_{1}}...a_{nj_{n}}\det P_{j_{1},...,j_{n}}

for all i such that j_i=l we have

(2) a_{ij_i}=b_{ij_i}+c_{ij_i}.

If \det P_{j_1,...,j_n}\neq 0, for a given set j_1,...,j_n there exists only one i such that j_i=l. Therefore by (2) we can continue (1) as

\det A=\sum_{j_{1},...,j_{n}:\det P_{j_{1},...,j_{n}}\neq  0}a_{1j_{1}}...(b_{ij_{i}}+c_{ij_{i}})a_{nj_{n}}\det  P_{j_{1},...,j_{n}}=\det B+\det C.

Homogeneity is proved similarly, using equations A^{(l)}=kB^{(l)} or in terms of elements a_{il}=kb_{il} for all i.

Feb 19

Determinant of a product

Determinant of a product

The derivation is very similar to the proof of the Leibniz rule.

Consider matrices A,B and let B_j denote the jth row of B. Row C_i of the product C=AB, obviously, can be written as C_i=\sum_{j=1}^na_{ij}B_j. Using Property V n times, we have

(1) \det C=\det \left(\begin{array}{c}\sum_{j_1=1}^na_{1j_1}B_{j_1}\\C_2\\...\\C_n\end{array}\right) =\sum_{j_1=1}^na_{1j_1}\det \left(\begin{array}{c}B_{j_1}\\C_2\\...\\C_n\end{array}\right)

=\sum_{j_1=1}^na_{1j_1}\det \left(  \begin{array}{c}B_{j_1}\\\sum_{j_2=1}^na_{2j_2}B_{j_2}\\...\\C_n\end{array}\right)=\sum_{j_1,j_2=1}^na_{1j_1}a_{2j_2}\det\left(\begin{array}{c}B_{j_1}\\B_{j_2}\\...\\  C_{n}\end{array}\right)

=...=\sum_{j_1,...,j_n=1}^na_{1j_1}...a_{nj_n}\det\left(\begin{array}{c}  B_{j_1}\\B_{j_2}\\...\\B_{j_n}\end{array}\right).

By analogy with permutation matrices denote


Recall the link between B_{j_1,...,j_n} and B:

(2) B_{j_1,...,j_n}=P_{j_1,...,j_n}B.

If we had proved the multiplication rule for (2), we would have

(3) \det B_{j_1,...,j_n}=(\det P_{j_1,...,j_n})(\det B).

In the theory of permutations (3) is proved without relying on the multiplication rule. I am going to use (3) as a shortcut that explains the idea. Combining (1) and (3) we obtain by the Leibniz formula

\det C=\sum_{j_1,...,j_n:\det P_{j_1,...,j_n}\neq 0}a_{1j_1}...a_{nj_n}(\det P_{j_1,...,j_n})(\det B)=(\det A)(\det B).
Feb 19

Leibniz formula for determinants

Rule for calculating determinant

This is one of those cases when calculations explain the result.

Let e_j denote the jth unit row-vector. Row A_i, obviously, can be decomposed as A_i=\sum_{j=1}^na_{ij}e_j. Recall that by Property V, the determinant of A is a multilinear function of its rows. Using Property V n times, we have

(1) \det A=\det \left(\begin{array}{c}\sum_{j_1=1}^na_{1j_1}e_{j_1}\\A_2\\...\\  A_n\end{array}\right)=\sum_{j_1=1}^na_{1j_1}\det\left(\begin{array}{c}e_{j_1}\\A_2\\...\\  A_{n}\end{array}\right)

=\sum_{j_1=1}^na_{1j_1}\det \left(\begin{array}{c}e_{j_1}\\\sum_{j_2=1}^na_{2j_2}e_{j_2}\\...\\A_n\end{array}\right)=\sum_{j_1,j_2=1}^na_{1j_1}a_{2j_2}\det\left(\begin{array}{c}e_{j_1}\\e_{j_2}\\...\\  A_{n}\end{array}\right)=...

=\sum_{j_1,...,j_n=1}^na_{1j_1}...a_{nj_n}\det\left(\begin{array}{c}  e_{j_1}\\e_{j_2}\\...\\e_{j_n}\end{array}\right).

Here by Property III


if among rows there are equal vectors. The remaining matrices with nonzero determinants are permutation matrices, so

(2) \det A=\sum_{j_1,...,j_n:\det P_{j_1,...,j_n}\neq 0}a_{1j_1}...a_{nj_n}\det P_{j_1,...,j_n}.

Different-rows-different-columns rule. Take a good look at what the condition \det P_{j_1,...,j_n}\neq 0 implies about the location of the factors of the product a_{1j_1}...a_{nj_n}. The rows 1,...,n to which the factors belong are obviously different. The columns j_1,...,j_n are also different by the definition of a permutation matrix. Conversely, consider any combination a_{i_1,j_1},...,a_{i_n,j_n} of elements such that no two elements belong to the same row or column. Rearrange the first indices i_1,...,i_n in an ascending order, from 1 to n. This leads to a renumbering \tilde{j}_{1},...,\tilde{j}_{n} of the second indices. The product a_{i_1,j_1}...a_{i_n,j_n} becomes a_{1,\tilde{j}_1}...a_{n,\tilde{j}_n}. Since the original second indices were all different, the new ones will be too. Hence, P_{\tilde{j}_1,...,\tilde{j}_n}\neq 0 and such a term must be in (2).

This rule alternatively can be called a cross-out rule because in practice it is used like this: take, say, element a_{11} and cross out the row and column it belongs to. The next factor is selected from the remaining matrix. In that matrix, again cross out the row and column the second factor sits in. Continue like this until you have n factors. Multiply the resulting product by the determinant of an appropriate permutation matrix, and you have one term of the Leibniz formula.

Remark 1. (2) is the Leibniz formula. The formula at the right of (1) is the Levy-Civita formula. The difference is that the Levy-Civita formula contains many more zero terms, while in (2) a term can be zero only if a_{1j_1}...a_{nj_n}=0.

Remark 2. Many textbooks instead of \det P_{j_1,...,j_n} write in (2) signatures of permutations. Using \det P_{j_1,...,j_n} is better because a) you save time by skipping the theory of permutations and b) you need a rule to calculate signatures of permutations and \det P_{j_1,...,j_n} is such a rule (see an example).

Feb 19

Properties of permutation matrices

Properties of permutation matrices

Shortcut for permutations

A permutation matrix permutes (changes orders of) rows of a matrix. The precise meaning of this statement is given in equation (1) below.

Partitioning the matrix A into rows A_i we have


Stacking these equations we get

P_{j_1,...,j_n}A=\left(\begin{array}{c}e_{j_1}\\...\\e_{j_n}\end{array}\right) A=\left(\begin{array}{c}e_{j_1}A\\...\\e_{j_n}A\end{array}\right) =\left(\begin{array}{c}A_{j_1}\\...\\A_{j_n}\end{array}\right).

By analogy with P_{j_1,...,j_n} we denote the last matrix


Thus, pre-multiplication by P_{j_1,...,j_n} transforms A to A_{j_1,...,j_n}:

(1) P_{j_1,...,j_n}A=A_{j_1,...,j_n}.

If we had proven the multiplication rule for determinants, we could have concluded from (1) that

(2) \det (P_{j_1,...,j_n})\det A=\det A_{j_1,...,j_n}.

As we know, changing places of two rows changes the sign of \det A by -1. (2) tells us that permutation by P_{j_1,...,j_n} changes the sign of \det A by \det (P_{j_1,...,j_n}). In the rigorous algebra course (2) is proved using the theory of permutations, without employing the multiplication rule for determinants.

I am going to call (2) a shortcut for permutations and use it without a proof. In general, I prefer to use such shortcuts, to see what is going on and bypass tedious proofs.

Other properties of permutation matrices

Exercise 1. Prove that Definition 1 is equivalent to the following: A permutation matrix P is defined by two conditions: a) all its columns are unit column-vectors and b) no two columns are equal.

Proof. Take the jth column. It contains one unity (the one that comes from the jth unit row-vector). It cannot contain more than one unity because all rows are different. Hence, the jth column is a unit column-vector. Different columns are different unit vectors because otherwise some row would contain at least two unities and would not be a unit vector.

Exercise 2. Prove that a permutation matrix is an orthogonal matrix.

Proof. By Exercise 1 we can write a permutation matrix as a matrix of unit column-vectors:

P=\left( u_{j_1},...,u_{j_n}\right). Then

P^TP=\left(\begin{array}{c}u_{j_1}^T\\...\\u_{j_n}^T\end{array}\right)\left(u_{j_1},...,u_{j_n}\right) =\left(\begin{array}{ccc}u_{j_1}^Tu_{j_1}&...&u_{j_1}^Tu_{j_n}\\...&...&...\\u_{j_n}^Tu_{j_1}&...&u_{j_n}^Tu_{j_n}  \end{array}\right)=I,

which proves orthogonality. It follows that \det P=\pm 1 (be careful with this equation, it follows from multiplicativity of determinants which we have not derived from our axioms).

Feb 19

Permutation matrices

Permutation matrices

Definition and example

Let e_i=(0,...,0,1,0,...0) denote the ith unit row-vector (the ith component is 1 and the others are zeros).

Definition 1. A permutation matrix P is defined by two conditions: a) all of its rows are unit row-vectors and b) no two rows are equal. We use the notation


where all rows are different.

Example. In the following matrix we change places of rows in such a way as to obtain the identity in the end. Each transformation changes the sign of the matrix by Property VI.

P_{2,4,1,3}=\left(\begin{array}{c}e_2\\e_4\\e_1\\e_3\end{array}\right) =\left(\begin{array}{cccc}0&1&0&0\\0&0&0&1\\1&0&0&0\\0&0&1&0\end{array}\right)\overset{\text{changing 1st and 3rd rows}}{\rightarrow }\left(\begin{array}{cccc}1 &0&0&0\\0&0&0&1\\0&1&0&0\\0&0&1&0\end{array}\right)

\overset{\text{changing 3rd and 4th rows}}{\rightarrow }\left(\begin{array}{cccc}1&0&0&0\\0&0&0&1\\0&0&1&0\\0&1&0&0  \end{array}\right)\overset{\text{changing 2nd and 4th rows}}{\rightarrow }\left(\begin{array}{cccc}1&0&0&0\\  0&1&0&0\\0&0&1&0\\0&0&0&1\end{array}\right).

Since there are three changes and the final matrix is identity, by Axiom 3 \det P_{2,4,1,3}=-1.

It is possible to arrive to the identity matrix in more than one way but the result will be the same because the number \det P_{2,4,1,3} is fixed. Calculating determinants of permutation matrices in this way allows us to avoid studying the theory of permutations.