19 Feb 22

## Distribution of the estimator of the error variance

If you are reading the book by Dougherty: this post is about the distribution of the estimator $s^2$ defined in Chapter 3.

Consider regression

(1) $y=X\beta +e$

where the deterministic matrix $X$ is of size $n\times k,$ satisfies $\det \left( X^{T}X\right) \neq 0$ (regressors are not collinear) and the error $e$ satisfies

(2) $Ee=0,\ Var(e)=\sigma ^{2}I.$

$\beta$ is estimated by $\hat{\beta}=(X^{T}X)^{-1}X^{T}y.$ Denote $P=X(X^{T}X)^{-1}X^{T},$ $Q=I-P.$ Using (1) we see that $\hat{\beta}=\beta +(X^{T}X)^{-1}X^{T}e$ and the residual $r\equiv y-X\hat{\beta}=Qe.$ $\sigma^{2}$ is estimated by
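The residual identity can be checked in one line by substituting (1) and the expression for $\hat{\beta}$ into the definition of $r$ (a worked step, using only the notation above):

$r=y-X\hat{\beta}=X\beta +e-X\beta -X(X^{T}X)^{-1}X^{T}e=e-Pe=Qe.$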

(3) $s^{2}=\left\Vert r\right\Vert ^{2}/\left( n-k\right) =\left\Vert Qe\right\Vert ^{2}/\left( n-k\right) .$

$Q$ is a projector and has properties which are derived from those of $P$

(4) $Q^{T}=Q,$ $Q^{2}=Q.$
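The properties in (4) and the identity $r=Qe$ are easy to check numerically. Below is a minimal sketch, assuming NumPy and a randomly generated design matrix; the sizes `n`, `k`, the seed and the coefficient vector `beta` are arbitrary illustrative choices, not values from the post.

```python
import numpy as np

# Illustrative (hypothetical) setup: random full-rank design and normal errors.
rng = np.random.default_rng(0)
n, k = 8, 3
X = rng.normal(size=(n, k))
beta = np.array([1.0, -2.0, 0.5])
e = rng.normal(size=n)
y = X @ beta + e                                # model (1)

P = X @ np.linalg.inv(X.T @ X) @ X.T            # projector onto the column space of X
Q = np.eye(n) - P                               # complementary projector

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)    # OLS estimator
r = y - X @ beta_hat                            # residual

sym_ok = np.allclose(Q, Q.T)                    # Q^T = Q
idem_ok = np.allclose(Q @ Q, Q)                 # Q^2 = Q
resid_ok = np.allclose(r, Q @ e)                # r = Qe
print(sym_ok, idem_ok, resid_ok)
```

The checks hold only up to floating-point rounding, which is why `np.allclose` is used instead of exact equality.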

If $\lambda$ is an eigenvalue of $Q$ with eigenvector $x\neq 0,$ then multiplying $Qx=\lambda x$ by $Q$ and using (4) we get $\lambda x=\lambda ^{2}x,$ so $\lambda ^{2}=\lambda .$ Hence eigenvalues of $Q$ can be only $0$ or $1.$ The equation $tr\left( Q\right) =n-k$ (which follows from $tr\left( P\right) =tr\left( (X^{T}X)^{-1}X^{T}X\right) =k$) tells us that the number of eigenvalues equal to 1 is $n-k$ and the remaining $k$ are zeros. Let $Q=U\Lambda U^{T}$ be the diagonal representation of $Q.$ Here $U$ is an orthogonal matrix,

(5) $U^{T}U=I,$

and $\Lambda$ is a diagonal matrix with eigenvalues of $Q$ on the main diagonal. We can assume that the first $n-k$ numbers on the diagonal of $\Lambda$ are ones and the others are zeros.
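The diagonal representation can be illustrated numerically. This is a sketch, assuming NumPy's `np.linalg.eigh` for the symmetric eigendecomposition; the matrix `X`, the sizes and the seed are made up for the illustration. Since `eigh` returns eigenvalues in ascending order, they are reversed so that the ones come first, as assumed in the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 7, 2
X = rng.normal(size=(n, k))                        # hypothetical design matrix
Q = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
Q = (Q + Q.T) / 2                                  # remove rounding asymmetry before eigh

lam, U = np.linalg.eigh(Q)                         # ascending eigenvalues, orthonormal columns
lam, U = lam[::-1], U[:, ::-1]                     # reorder: ones first, zeros last
Lambda = np.diag(lam)

orthogonal_ok = np.allclose(U.T @ U, np.eye(n))    # equation (5)
reconstruct_ok = np.allclose(U @ Lambda @ U.T, Q)  # Q = U Lambda U^T
expected = np.r_[np.ones(n - k), np.zeros(k)]
eigen_ok = np.allclose(lam, expected)              # n-k ones followed by k zeros
trace_val = np.trace(Q)                            # should be close to n-k
print(orthogonal_ok, reconstruct_ok, eigen_ok, trace_val)
```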

Theorem. Let $e$ be normal. 1) $s^{2}\left( n-k\right) /\sigma ^{2}$ is distributed as $\chi _{n-k}^{2}.$ 2) The estimators $\hat{\beta}$ and $s^{2}$ are independent.

Proof. 1) We have by (4)

(6) $\left\Vert Qe\right\Vert ^{2}=\left( Qe\right) ^{T}Qe=\left( Q^{T}Qe\right) ^{T}e=\left( Qe\right) ^{T}e=\left( U\Lambda U^{T}e\right) ^{T}e=\left( \Lambda U^{T}e\right) ^{T}U^{T}e.$

Denote $S=U^{T}e.$ From (2) and (5)

$ES=0,$ $Var\left( S\right) =EU^{T}ee^{T}U=\sigma ^{2}U^{T}U=\sigma ^{2}I$

and $S$ is normal as a linear transformation of a normal vector. It follows that $S=\sigma z$ where $z$ is a standard normal vector with independent standard normal coordinates $z_{1},...,z_{n}.$ Hence, (6) implies

(7) $\left\Vert Qe\right\Vert ^{2}=\sigma ^{2}\left( \Lambda z\right) ^{T}z=\sigma ^{2}\left( z_{1}^{2}+...+z_{n-k}^{2}\right) =\sigma ^{2}\chi _{n-k}^{2}.$

(3) and (7) prove the first statement.
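A quick Monte Carlo experiment gives a sanity check of the first statement. The sketch below assumes NumPy and arbitrary illustrative values of `n`, `k`, `sigma` and the number of replications; it compares the sample mean and variance of $s^{2}\left( n-k\right) /\sigma ^{2}$ with the mean $n-k$ and variance $2(n-k)$ of a $\chi _{n-k}^{2}$ variable. Agreement of two moments is of course only a partial check, not a proof.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, sigma = 12, 3, 2.0           # illustrative values, not from the post
reps = 100_000

X = rng.normal(size=(n, k))        # hypothetical fixed design matrix
Q = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)

E = sigma * rng.normal(size=(reps, n))     # each row is one draw of the error vector e
R = E @ Q                                  # rows are (Qe)^T because Q is symmetric
stat = (R ** 2).sum(axis=1) / sigma ** 2   # ||Qe||^2 / sigma^2 = s^2 (n-k) / sigma^2

mean_stat = stat.mean()            # theory: n - k
var_stat = stat.var()              # theory: 2 (n - k)
print(mean_stat, var_stat)
```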

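2) First we note that the vectors $Pe,Qe$ are independent. Since they are jointly normal (both are linear transformations of the same normal vector $e$) and have zero means, their independence follows from the fact that they are uncorrelated:

$cov(Pe,Qe)=EPee^{T}Q^{T}=\sigma ^{2}PQ=\sigma ^{2}\left( P-P^{2}\right) =0.$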

It's easy to see that $X^{T}P=X^{T}X(X^{T}X)^{-1}X^{T}=X^{T}.$ This allows us to show that $\hat{\beta}$ is a function of $Pe$:

$\hat{\beta}=\beta +(X^{T}X)^{-1}X^{T}e=\beta +(X^{T}X)^{-1}X^{T}Pe.$

By (3), $s^{2}$ is a function of $Qe.$ Independence of $Pe,Qe$ therefore leads to independence of their functions $\hat{\beta}$ and $s^{2}.$
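The ingredients of the second statement can also be checked by simulation. The sketch below assumes NumPy and made-up values of `n`, `k`, `sigma` and `beta`. It verifies $PQ=0$ and $X^{T}P=X^{T}$ numerically and then looks at sample correlations between the coordinates of $\hat{\beta}$ (and their squared deviations from $\beta$) and $s^{2}.$ Correlations near zero are consistent with independence but do not prove it.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, sigma = 10, 2, 1.5           # illustrative values, not from the post
reps = 100_000

X = rng.normal(size=(n, k))        # hypothetical fixed design matrix
beta = np.array([0.5, -1.0])
XtX_inv = np.linalg.inv(X.T @ X)
P = X @ XtX_inv @ X.T
Q = np.eye(n) - P

pq_zero = np.allclose(P @ Q, 0)        # PQ = 0
xtp_ok = np.allclose(X.T @ P, X.T)     # X^T P = X^T

E = sigma * rng.normal(size=(reps, n))            # rows are draws of e
beta_hat = beta + E @ X @ XtX_inv                 # rows: beta + (X^T X)^{-1} X^T e
s2 = ((E @ Q) ** 2).sum(axis=1) / (n - k)         # estimator (3) for each draw

# Sample correlations of s^2 with each coordinate of beta_hat ...
corr = [np.corrcoef(beta_hat[:, j], s2)[0, 1] for j in range(k)]
# ... and with squared deviations, which would pick up dependence in the spread.
corr_sq = [np.corrcoef((beta_hat[:, j] - beta[j]) ** 2, s2)[0, 1] for j in range(k)]
print(pq_zero, xtp_ok, corr, corr_sq)
```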