Sep 18

## Applications of the diagonal representation II

### 4. Square root of a matrix

Definition 1. For a symmetric matrix $A$ with non-negative eigenvalues $\lambda_1,...,\lambda_n$ and diagonal representation $A=Udiag[\lambda_1,...,\lambda_n]U^{-1}$, where $U$ is orthogonal, the square root is defined by

(1) $A^{1/2}=Udiag[\sqrt{\lambda_1},...,\sqrt{\lambda_n}]U^{-1}.$

Exercise 1. Show that the matrix in (1) is symmetric and satisfies $(A^{1/2})^2=A.$

Proof. Since $U$ is orthogonal, $(U^{-1})^T=U$, so

$(A^{1/2})^T=(U^{-1})^Tdiag[\sqrt{\lambda_1},...,\sqrt{\lambda_n}]U^T=Udiag[\sqrt{\lambda_1},...,\sqrt{\lambda_n}]U^{-1}=A^{1/2}.$

Further, using $U^{-1}U=I$,

$(A^{1/2})^2=Udiag[\sqrt{\lambda_1},...,\sqrt{\lambda_n}]U^{-1}Udiag[\sqrt{\lambda_1},...,\sqrt{\lambda_n}]U^{-1}=Udiag[\lambda_1,...,\lambda_n]U^{-1}=A.$
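The construction in (1) can be checked numerically. The sketch below (the matrix $A$ is made up for illustration) builds $A^{1/2}$ from the eigendecomposition and verifies both claims of Exercise 1:

```python
import numpy as np

# A minimal sketch of (1): form A^{1/2} from the eigendecomposition of a
# symmetric matrix with non-negative eigenvalues. A is chosen for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])  # symmetric, eigenvalues 1 and 3

lam, U = np.linalg.eigh(A)               # A = U diag(lam) U^T, U orthogonal
A_half = U @ np.diag(np.sqrt(lam)) @ U.T  # formula (1)

# The two claims of Exercise 1:
print(np.allclose(A_half, A_half.T))     # A^{1/2} is symmetric
print(np.allclose(A_half @ A_half, A))   # (A^{1/2})^2 = A
```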

### 5. Generalized least squares estimator

The error term $e$ in the multiple regression $y=X\beta +e$ under homoscedasticity and in absence of autocorrelation satisfies

(2) $V(e)=\sigma^2I,$ where $\sigma^2$ is some positive number.

The OLS estimator in this situation is given by

(3) $\hat{\beta}=(X^TX)^{-1}X^Ty.$
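Formula (3) translates directly into numpy; the regressors, coefficients, and sample size below are simulated for illustration:

```python
import numpy as np

# A small sketch of the OLS formula (3) on simulated data satisfying (2).
rng = np.random.default_rng(0)
n, k = 100, 3
X = rng.normal(size=(n, k))
beta = np.array([1.0, -2.0, 0.5])
y = X @ beta + rng.normal(size=n)   # homoscedastic errors, V(e) = sigma^2 I

# (X^T X)^{-1} X^T y, computed by solving the normal equations
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)
```

Solving the normal equations with `np.linalg.solve` avoids forming the inverse $(X^TX)^{-1}$ explicitly, which is the standard numerical practice.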

Now consider a more general case $V(e)=\Omega.$

Exercise 2. The variance matrix $V(e)=\Omega$ is always symmetric and non-negative definite.

Proof. $V(e)^T=[E(e-Ee)(e-Ee)^T]^T=V(e);$ for any vector $x$, $x^TV(e)x=Ex^T(e-Ee)(e-Ee)^Tx=E\|(e-Ee)^Tx\|^2\geq 0.$
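Both properties are visible in a sample covariance matrix, which estimates $V(e)$; the data below are simulated:

```python
import numpy as np

# Numerical illustration of Exercise 2: a sample covariance matrix is
# symmetric with non-negative eigenvalues. The mixing matrix is made up.
rng = np.random.default_rng(1)
e = rng.normal(size=(1000, 3)) @ np.array([[1.0, 0.5, 0.0],
                                           [0.0, 1.0, 0.3],
                                           [0.0, 0.0, 1.0]])
Omega = np.cov(e, rowvar=False)            # estimate of V(e)
print(np.allclose(Omega, Omega.T))         # symmetric
print(np.linalg.eigvalsh(Omega).min() >= 0)  # non-negative definite
```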

Exercise 3. Assume that $\Omega$ is positive definite. Show that $\Omega^{-1/2}=(\Omega^{-1})^{1/2}$ is symmetric and satisfies $(\Omega^{-1/2})^2=\Omega^{-1}.$

Proof. Since $\Omega$ is positive definite, its eigenvalues are positive. Hence its inverse $\Omega^{-1}$ exists and is given by $\Omega^{-1}=U\Omega_U^{-1}U^T,$ where $\Omega_U^{-1}=diag[\lambda_1^{-1},...,\lambda_n^{-1}].$ It is symmetric as the inverse of a symmetric matrix. It remains to apply Exercise 1 to $A=\Omega^{-1}.$
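In practice $\Omega^{-1/2}$ is obtained in one step by taking $\lambda_i^{-1/2}$ in the diagonal representation; a sketch (the matrix $\Omega$ is made up):

```python
import numpy as np

# Sketch of Exercise 3: build Omega^{-1/2} via the eigendecomposition and
# check symmetry and (Omega^{-1/2})^2 = Omega^{-1}.
Omega = np.array([[4.0, 1.0],
                  [1.0, 3.0]])    # symmetric positive definite

lam, U = np.linalg.eigh(Omega)
Omega_inv_half = U @ np.diag(lam ** -0.5) @ U.T

print(np.allclose(Omega_inv_half, Omega_inv_half.T))                       # symmetric
print(np.allclose(Omega_inv_half @ Omega_inv_half, np.linalg.inv(Omega)))  # squares to Omega^{-1}
```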

Exercise 4. Find the variance of $u=\Omega^{-1/2}e$.

Solution. Using the definition of variance of a vector $V(u)=E(u-Eu)(u-Eu)^T=\Omega^{-1/2}V(e)(\Omega^{-1/2})^T=\Omega^{-1/2}\Omega\Omega^{-1/2}=I.$
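The identity $V(\Omega^{-1/2}e)=I$ can also be seen by Monte Carlo: sampling errors with variance $\Omega$ and transforming them produces a sample variance matrix close to the identity (all quantities below are simulated for illustration):

```python
import numpy as np

# Monte Carlo illustration of Exercise 4: if V(e) = Omega, then
# u = Omega^{-1/2} e has variance matrix I. Omega is made up.
rng = np.random.default_rng(3)
Omega = np.array([[2.0, 0.8],
                  [0.8, 1.5]])
lam, U = np.linalg.eigh(Omega)
Omega_half = U @ np.diag(np.sqrt(lam)) @ U.T
Omega_inv_half = U @ np.diag(lam ** -0.5) @ U.T

e = rng.normal(size=(200_000, 2)) @ Omega_half  # rows have variance Omega
u = e @ Omega_inv_half.T                        # u = Omega^{-1/2} e
print(np.cov(u, rowvar=False))                  # close to the identity
```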

Exercise 4 suggests how to transform $y=X\beta +e$ so that the error satisfies (2). Premultiplying by $\Omega^{-1/2}$ gives $\Omega^{-1/2}y=\Omega^{-1/2}X\beta +\Omega^{-1/2}e,$

where the error $u=\Omega^{-1/2}e$ satisfies $V(u)=I,$ that is, (2) with $\sigma^2=1,$ so (3) is applicable. Let $\tilde{y}=\Omega^{-1/2}y,$ $\tilde{X}=\Omega^{-1/2}X.$ Then we have $\tilde{y}=\tilde{X}\beta +u$ and from (3) $\hat{\beta}=(\tilde{X}^T\tilde{X})^{-1}\tilde{X}^T\tilde{y}.$ Since $\tilde{X}^T=X^T\Omega^{-1/2},$ this can be written as $\hat{\beta}=(X^T\Omega^{-1/2}\Omega^{-1/2}X)^{-1}X^T\Omega^{-1/2}\Omega^{-1/2}y=(X^T\Omega^{-1}X)^{-1}X^T\Omega^{-1}y.$ This is the generalized least squares (GLS) estimator.
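The derivation above says that OLS on the transformed data equals the direct GLS formula; a sketch confirming this numerically (the design, coefficients, and $\Omega$ are simulated for illustration):

```python
import numpy as np

# Whitening by Omega^{-1/2} followed by OLS coincides with the GLS formula
# (X^T Omega^{-1} X)^{-1} X^T Omega^{-1} y. All data are simulated.
rng = np.random.default_rng(2)
n, k = 50, 2
X = rng.normal(size=(n, k))
Omega = np.diag(rng.uniform(0.5, 2.0, size=n))   # heteroscedastic V(e)
y = X @ np.array([1.0, -1.0]) + rng.normal(size=n) * np.sqrt(np.diag(Omega))

lam, U = np.linalg.eigh(Omega)
W = U @ np.diag(lam ** -0.5) @ U.T               # Omega^{-1/2}
y_t, X_t = W @ y, W @ X                          # tilde y, tilde X
beta_ols_on_transformed = np.linalg.solve(X_t.T @ X_t, X_t.T @ y_t)

Omega_inv = np.linalg.inv(Omega)
beta_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ y)

print(np.allclose(beta_ols_on_transformed, beta_gls))
```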