10
Jan 16

## What is a z score: the scientific explanation

You know what a z score is when you know why people invented it.

As usual, we start with a theoretical motivation. There is a myriad of distributions. Even if we stay within the set of normal distributions, there is an infinite number of them, indexed by their means $\mu(X)=EX$ and standard deviations $\sigma(X)=\sqrt{Var(X)}$. When computers did not exist, people had to use statistical tables. It was impossible to produce statistical tables for an infinite number of distributions, so the problem was to reduce the case of general $\mu(X)$ and $\sigma(X)$ to that of $\mu(X)=0$ and $\sigma(X)=1$.

We already know that this can be achieved by centering and scaling. Combining these two transformations, we obtain the definition of the z score:

$z=\frac{X-\mu(X)}{\sigma(X)}.$

Using the properties of means and variances we see that

$Ez=\frac{E(X-\mu(X))}{\sigma(X)}=0,$

$Var(z)=\frac{Var(X-\mu(X))}{\sigma^2(X)}=\frac{Var(X)}{\sigma^2(X)}=1.$

The transformation leading from $X$ to its z score is sometimes called standardization.
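For a concrete check, here is a minimal sketch in Python. It replaces the theoretical $\mu(X)$ and $\sigma(X)$ with their sample counterparts, which is an assumption of this illustration and not part of the definition above. The function name `z_scores` and the sample values are made up for the example.

```python
import statistics

def z_scores(data):
    """Standardize a sample: subtract its mean, then divide by its standard deviation."""
    mu = statistics.fmean(data)
    # Population formula (division by n), so the variance of the result is 1
    sigma = statistics.pstdev(data)
    if sigma == 0:
        raise ValueError("z scores are undefined when the standard deviation is zero")
    return [(x - mu) / sigma for x in data]

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # mean 5, standard deviation 2
z = z_scores(sample)

print(z)
print(statistics.fmean(z))      # close to 0
print(statistics.pvariance(z))  # close to 1
```

With the sample formula (division by $n-1$, `statistics.stdev`) the variance of the z scores would be slightly different from 1, so the choice of formula matters in small samples.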

This site promises to tell you the truth about undergraduate statistics. The truth about the z score is that:

(1) Standardization can be applied to any variable with finite variance, not only to normal variables. The z score is a standard normal variable only when the original variable $X$ is normal, contrary to what some sites say.

(2) With modern computers, standardization is not necessary to find critical values for $X$; see Chapter 14 of my book.
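Point (1) can be illustrated with a fair coin. The sketch below is only an illustration: the helper `standardize` and the representation of a discrete distribution as a dictionary `{value: probability}` are assumptions made for this example.

```python
import math

def standardize(dist):
    """Standardize a discrete distribution given as {value: probability}."""
    mu = sum(x * p for x, p in dist.items())
    var = sum(p * (x - mu) ** 2 for x, p in dist.items())
    sigma = math.sqrt(var)
    # Each value is centered and scaled; the probabilities do not change
    return {(x - mu) / sigma: p for x, p in dist.items()}

coin = {0.0: 0.5, 1.0: 0.5}  # fair coin: mean 0.5, standard deviation 0.5
z_coin = standardize(coin)

print(z_coin)  # {-1.0: 0.5, 1.0: 0.5}
```

The z score of the coin has mean 0 and variance 1, yet it takes only the two values $-1$ and $1$, so it is certainly not a standard normal variable.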

9
Jan 16

## Scaling a distribution

Scaling a distribution is as important as centering or demeaning, considered earlier. The question we want to answer is this: What can you do to a random variable $X$ to obtain another random variable, say, $Y$, whose variance is one? As in the case of centering, geometric considerations can be used, but I want to follow the algebraic approach, which is more powerful.

Hint: in the case of centering, we subtract the mean, $Y=X-EX$. For the problem at hand the suggestion is to use scaling: $Y=aX$, where $a$ is a number to be determined.

Using the fact that variance is homogeneous of degree 2, we have

$Var(Y)=Var(aX)=a^2Var(X)$.

We want $Var(Y)$ to be 1, so $a^2Var(X)=1$. Assuming $Var(X)>0$ and taking the positive root, we get $a=1/\sqrt{Var(X)}=1/\sigma(X)$. Thus, division by the standard deviation answers our question: the variable $Y=X/\sigma(X)$ has variance and standard deviation equal to 1.
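Both steps can be checked numerically. The following is a small sketch with made-up numbers, where sample variances stand in for $Var(X)$ (an assumption of the illustration); the names `x`, `a` and `scaled` are hypothetical.

```python
import math
import statistics

x = [1.0, 3.0, 5.0, 7.0]         # made-up data with variance 5
a = 3.0

var_x = statistics.pvariance(x)
var_ax = statistics.pvariance([a * v for v in x])

# Homogeneity of degree 2: Var(aX) = a^2 Var(X)
print(var_x, var_ax, math.isclose(var_ax, a ** 2 * var_x))

# Scaling by a = 1 / sigma(X) gives variance 1
sigma_x = statistics.pstdev(x)
scaled = [v / sigma_x for v in x]
print(statistics.pvariance(scaled))  # close to 1
```

Note that scaling alone does not change the fact that the mean is nonzero: here the mean of `scaled` is $4/\sqrt{5}$, not 0.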

Note. Always write the standard deviation with its argument, $\sigma(X)$, so that it is clear which variable it refers to.

7
Jan 16

## Mean plus deviation-from-mean decomposition

This is about separating the deterministic and random parts of a variable. This topic can be difficult or easy, depending on how you look at it. The right way to think about it is theoretical.

Everything starts with a simple question: What can you do to a random variable $X$ to obtain a new variable, say, $Y$, whose mean is equal to zero? Intuitively, when you subtract the mean from $X$, the distribution moves to the left or right, depending on the sign of $EX$, so that the distribution of $Y$ is centered on zero. One of my students used this intuition to guess that you should subtract the mean: $Y=X-EX$. The guess should be confirmed by algebra: from this definition

$EY=E(X-EX)=EX-E(EX)=EX-EX=0$

(here we distributed the expectation operator and used the property that the mean of a constant ($EX$) is that constant). By the way, subtracting the mean from a variable is called centering or demeaning.

If you understand the above, you can represent $X$ as

$X = EX+(X-EX).$

Here $\mu=EX$ is the mean and $u=X-EX$ is the deviation from the mean. As was shown above, $Eu=0$. Thus, we obtain the mean plus deviation-from-mean decomposition $X=\mu+u.$ Simple, isn't it? It is so simple that students don't pay attention to it. In fact, it is omnipresent in Statistics because $Var(X)=Var(u)$ (subtracting a constant does not change the variance). The analysis of $Var(X)$ is reduced to that of $Var(u)$!
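Here is the decomposition on a small made-up sample, as a sketch only. The sample mean plays the role of $EX$, which is an assumption of the illustration, and the function name `decompose` is hypothetical.

```python
import math
import statistics

def decompose(data):
    """Split a sample into its mean and the deviations from the mean."""
    mu = statistics.fmean(data)
    u = [x - mu for x in data]
    return mu, u

x = [3.0, 5.0, 10.0]
mu, u = decompose(x)

print(mu, u)                # 6.0 [-3.0, -1.0, 4.0]
print(statistics.fmean(u))  # 0.0
# The deviations carry all of the variance of x
print(math.isclose(statistics.pvariance(x), statistics.pvariance(u)))
```

Adding `mu` back to each element of `u` recovers the original sample, which is exactly the statement $X=\mu+u$.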