19
Feb 22

Distribution of the estimator of the error variance

If you are reading the book by Dougherty: this post is about the distribution of the estimator s^{2} defined in Chapter 3.

Consider the regression

(1) y=X\beta +e

where the deterministic matrix X is of size n\times k, satisfies \det  \left( X^{T}X\right) \neq 0 (regressors are not collinear) and the error e satisfies

(2) Ee=0,Var(e)=\sigma ^{2}I

\beta is estimated by \hat{\beta}=(X^{T}X)^{-1}X^{T}y. Denote P=X(X^{T}X)^{-1}X^{T}, Q=I-P. Using (1) we see that \hat{\beta}=\beta +(X^{T}X)^{-1}X^{T}e and the residual r\equiv y-X\hat{\beta}=Qe. \sigma^{2} is estimated by

(3) s^{2}=\left\Vert r\right\Vert ^{2}/\left( n-k\right) =\left\Vert  Qe\right\Vert ^{2}/\left( n-k\right) .
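The identities \hat{\beta}=\beta +(X^{T}X)^{-1}X^{T}e, r=Qe and (3) can be checked numerically. The following is only a minimal NumPy sketch: the sizes n=8, k=3, the values of beta and sigma, and the randomly generated X are arbitrary illustrative assumptions, not taken from the book.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumptions: small sizes, arbitrary beta and sigma.
n, k = 8, 3
X = rng.normal(size=(n, k))          # treated as a fixed (deterministic) design
beta = np.array([1.0, -2.0, 0.5])
sigma = 1.5
e = sigma * rng.normal(size=n)       # normal errors with Var(e) = sigma^2 I
y = X @ beta + e                     # model (1)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y         # OLS estimator
P = X @ XtX_inv @ X.T
Q = np.eye(n) - P

r = y - X @ beta_hat                 # residual
s2 = r @ r / (n - k)                 # estimator (3)

# Each flag compares the two sides of an identity stated in the text.
beta_hat_matches = np.allclose(beta_hat, beta + XtX_inv @ X.T @ e)
residual_matches = np.allclose(r, Q @ e)
s2_matches = np.isclose(s2, (Q @ e) @ (Q @ e) / (n - k))
print(beta_hat_matches, residual_matches, s2_matches)
```

In a real regression e is not observed, so the right-hand sides are available only in a simulation like this one.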

Q is a projector and has properties which are derived from those of P (P^{T}=P, P^{2}=P):

(4) Q^{T}=Q, Q^{2}=Q.

If \lambda is an eigenvalue of Q with eigenvector x\neq 0, then multiplying Qx=\lambda x by Q and using Q^{2}=Q we get \lambda x=\lambda ^{2}x, hence \lambda ^{2}=\lambda . Therefore the eigenvalues of Q can only be 0 or 1. Since tr\left( P\right) =tr\left( (X^{T}X)^{-1}X^{T}X\right) =k, we have tr\left( Q\right) =n-k. The trace equals the sum of the eigenvalues, so the number of eigenvalues equal to 1 is n-k and the remaining k are zeros. Let Q=U\Lambda U^{T} be the diagonal representation of Q. Here U is an orthogonal matrix,

(5) U^{T}U=I,

and \Lambda is a diagonal matrix with eigenvalues of Q on the main diagonal. We can assume that the first n-k numbers on the diagonal of \Lambda are ones and the others are zeros.
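The properties (4), the trace identity and the diagonal representation can be illustrated numerically. This is a rough sketch under assumed sizes n=7, k=2 and a random X; `np.linalg.eigh` is used because Q is symmetric, and the eigenvalues are reordered so that the ones come first, as assumed in the text.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative assumptions: n = 7 observations, k = 2 regressors.
n, k = 7, 2
X = rng.normal(size=(n, k))
P = X @ np.linalg.inv(X.T @ X) @ X.T
Q = np.eye(n) - P

symmetric = np.allclose(Q, Q.T)        # Q^T = Q
idempotent = np.allclose(Q @ Q, Q)     # Q^2 = Q
trace_Q = np.trace(Q)                  # should be close to n - k

# eigh returns eigenvalues in ascending order; reverse so the ones come first.
eigvals, U = np.linalg.eigh(Q)
order = np.argsort(eigvals)[::-1]
eigvals, U = eigvals[order], U[:, order]
Lambda = np.diag(eigvals)

orthogonal = np.allclose(U.T @ U, np.eye(n))       # property (5)
reconstructs = np.allclose(U @ Lambda @ U.T, Q)    # Q = U Lambda U^T
print(symmetric, idempotent, orthogonal, reconstructs)
print(np.round(eigvals, 10))
```

Because the eigenvalues 0 and 1 are repeated, the matrix U is not unique; any orthonormal basis of each eigenspace works for the argument.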

Theorem. Let e be normal. 1) s^{2}\left( n-k\right) /\sigma ^{2} is distributed as \chi _{n-k}^{2}. 2) The estimators \hat{\beta} and s^{2} are independent.

Proof. 1) We have by (4)

(6) \left\Vert Qe\right\Vert ^{2}=\left( Qe\right) ^{T}Qe=\left(  Q^{T}Qe\right) ^{T}e=\left( Qe\right) ^{T}e=\left( U\Lambda U^{T}e\right)  ^{T}e=\left( \Lambda U^{T}e\right) ^{T}U^{T}e.

Denote S=U^{T}e. From (2) and (5)

ES=0, Var\left( S\right) =EU^{T}ee^{T}U=\sigma ^{2}U^{T}U=\sigma ^{2}I

and S is normal as a linear transformation of a normal vector. It follows that S=\sigma z where z is a standard normal vector with independent standard normal coordinates z_{1},...,z_{n}. Hence, (6) implies

(7) \left\Vert Qe\right\Vert ^{2}=\sigma ^{2}\left( \Lambda z\right)  ^{T}z=\sigma ^{2}\left( z_{1}^{2}+...+z_{n-k}^{2}\right) =\sigma ^{2}\chi  _{n-k}^{2}.

(3) and (7) prove the first statement.
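A simulation gives a rough empirical check of statement 1). A \chi _{n-k}^{2} variable has mean n-k and variance 2(n-k), so the simulated values of s^{2}\left( n-k\right) /\sigma ^{2} should have approximately these moments. The sketch below assumes n=10, k=3, \sigma =2 and 20000 replications; these numbers are arbitrary, and matching two moments is only a sanity check, not a proof of the distribution.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative assumptions: sizes, sigma and the number of replications.
n, k, sigma, reps = 10, 3, 2.0, 20000
X = rng.normal(size=(n, k))
Q = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T

# Each row of E is one draw of the error vector e.
E = sigma * rng.normal(size=(reps, n))

# Q is symmetric, so row i of E @ Q equals (Q e_i)^T.
R = E @ Q

# stat_i = ||Q e_i||^2 / sigma^2 = s_i^2 (n - k) / sigma^2
stat = (R ** 2).sum(axis=1) / sigma ** 2

sample_mean = stat.mean()   # compare with n - k
sample_var = stat.var()     # compare with 2 (n - k)
print(sample_mean, n - k)
print(sample_var, 2 * (n - k))
```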

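2) First we note that the vectors Pe,Qe are independent. Since they are jointly normal (both are linear transformations of the same normal vector e) and have zero means, their independence follows from

cov(Pe,Qe)=EPee^{T}Q^{T}=\sigma ^{2}PQ=\sigma ^{2}(P-P^{2})=0.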

Since X^{T}P=X^{T}X(X^{T}X)^{-1}X^{T}=X^{T}, we can show that \hat{\beta} is a function of Pe:

\hat{\beta}=\beta +(X^{T}X)^{-1}X^{T}e=\beta +(X^{T}X)^{-1}X^{T}Pe.

Independence of Pe and Qe leads to independence of their functions \hat{\beta} (a function of Pe) and s^{2}=\left\Vert Qe\right\Vert ^{2}/\left( n-k\right) (a function of Qe).
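The ingredients of part 2) can also be checked numerically: PQ=0, X^{T}P=X^{T}, and, in a simulation, near-zero sample correlation between each coordinate of \hat{\beta} and s^{2}. The sketch assumes arbitrary n=10, k=3, beta, \sigma =2 and 20000 replications. Zero correlation is only a necessary consequence of independence, so this is a sanity check rather than a verification of the theorem.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative assumptions: sizes, beta, sigma and the number of replications.
n, k, sigma, reps = 10, 3, 2.0, 20000
X = rng.normal(size=(n, k))
beta = np.array([1.0, -2.0, 0.5])
XtX_inv = np.linalg.inv(X.T @ X)
P = X @ XtX_inv @ X.T
Q = np.eye(n) - P

pq_is_zero = np.allclose(P @ Q, 0.0)      # PQ = P - P^2 = 0
xtp_is_xt = np.allclose(X.T @ P, X.T)     # X^T P = X^T

# Simulate many samples; each row of Y is one draw of y = X beta + e.
E = sigma * rng.normal(size=(reps, n))
Y = X @ beta + E

B = Y @ X @ XtX_inv                       # row i is beta_hat for sample i
R = Y @ Q                                 # row i is the residual for sample i
s2 = (R ** 2).sum(axis=1) / (n - k)

# Sample correlation between each coordinate of beta_hat and s^2.
corrs = np.array([np.corrcoef(B[:, j], s2)[0, 1] for j in range(k)])
print(pq_is_zero, xtp_is_xt)
print(corrs)
```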