2
Oct 16

## The pearls of AP Statistics 31

Demystifying sampling distributions: too much talking about nothing

### What we know about sample means

Let $X_1,...,X_n$ be an independent identically distributed sample and consider its sample mean $\bar{X}$.

Fact 1. The sample mean is an unbiased estimator of the population mean:

(1) $E\bar{X}=\frac{1}{n}(EX_1+...+EX_n)=\frac{1}{n}(\mu+...+\mu)=\mu$

(use linearity of means).

Fact 2. Variance of the sample mean is

(2) $Var(\bar{X})=\frac{1}{n^2}(Var(X_1)+...+Var(X_n))=\frac{1}{n^2}(\sigma^2(X)+...+\sigma^2(X))=\frac{\sigma^2(X)}{n}$

(use homogeneity of variance of degree 2 and additivity of variance for independent variables). Hence $\sigma(\bar{X})=\frac{\sigma(X)}{\sqrt{n}}$.
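Facts 1 and 2 are easy to check by simulation. The following is only a minimal sketch under my own assumptions: the population is taken to be Uniform(0, 1) (so $\mu=1/2$ and $\sigma^2(X)=1/12$), and the function name `simulate_sample_means`, the sample size and the number of repetitions are illustrative choices.

```python
import random
import statistics


def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` i.i.d. samples of size n from Uniform(0, 1); return their sample means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.random() for _ in range(n)) for _ in range(reps)]


# Assumed population: Uniform(0, 1), for which mu = 1/2 and sigma^2 = 1/12.
mu, sigma2 = 0.5, 1 / 12
n, reps = 25, 20000
means = simulate_sample_means(n, reps)

# Equation (1): the average of the sample means should be close to mu.
print("average of sample means:", statistics.fmean(means), "vs mu =", mu)
# Equation (2): their variance should be close to sigma^2 / n.
print("variance of sample means:", statistics.pvariance(means), "vs sigma^2/n =", sigma2 / n)
```

Changing `n` shows the variance of the sample mean shrinking like $1/n$, which is the content of equation (2).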

Fact 3. The implication of these two properties is that the sample mean becomes more concentrated around the population mean as the sample size increases (see at least the law of large numbers; I have a couple more posts about this).

Fact 4. Finally, the z scores of sample means stabilize to a standard normal distribution (the central limit theorem).

### What is a sampling distribution?

The sampling distribution of a statistic is the probability distribution that specifies probabilities for the possible values the statistic can take (Agresti and Franklin, p.308). After this definition, the authors go ahead and discuss the above four facts. Note that none of them requires the knowledge of what the sampling distribution is. The ONLY sampling distribution that appears explicitly in AP Statistics is the binomial. However, in the book the binomial is given in Section 6.3, before sampling distributions, which are the subject of Chapter 7. Section 7.3 explains that the binomial is a sampling distribution but that section is optional. Thus the whole Chapter 7 (almost 40 pages) is redundant.

### Then what are sampling distributions for?

Here is a simple example that explains their role. Consider the binomial $X_1+X_2$ of two observations on an unfair coin. It involves two random variables and is therefore described by a joint distribution whose sample space consists of pairs of values:

Table 1. Sample space for pair $(X_1,X_2)$

| | Coin 1: 0 | Coin 1: 1 |
|---|---|---|
| Coin 2: 0 | (0,0) | (0,1) |
| Coin 2: 1 | (1,0) | (1,1) |

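Each coin independently takes values 0 and 1 (shown in the margins); the sample space contains four pairs of these values (shown in the main body of the table). The corresponding probability distribution is given by the following table, where, as Table 3 implies, $p$ is the probability of the value 0 and $q=1-p$ is the probability of the value 1: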

Table 2. Joint probabilities for pair $(X_1,X_2)$

| | Coin 1: $p$ | Coin 1: $q$ |
|---|---|---|
| Coin 2: $p$ | $p^2$ | $pq$ |
| Coin 2: $q$ | $pq$ | $q^2$ |

Since we are counting only the number of successes, the outcomes (0,1) and (1,0) for the purposes of our experiment are the same. Hence, joining indistinguishable outcomes, we obtain a smaller sample space

Table 3. Sampling distribution for binomial $X_1+X_2$

| # of successes | Corresponding probabilities |
|---|---|
| 0 | $p^2$ |
| 1 | $2pq$ |
| 2 | $q^2$ |

The last table is the sampling distribution for the binomial with sample size 2. All the sampling distribution does is replace a large joint distribution (Tables 1 and 2) with a smaller distribution (Table 3). The beauty of the proofs of equations (1) and (2) is that they do not depend on which distribution is used (the distribution is hidden in the expected value operator).
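The passage from the joint distribution to the sampling distribution can be sketched in a few lines of code. This is an illustration, not part of the textbook discussion: the function name `sampling_distribution` and the value $p=1/3$ are my own assumptions, and, reading the tables as given, $p$ is treated as the probability of 0 and $q=1-p$ as the probability of 1.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def sampling_distribution(p, n=2):
    """Collapse the joint distribution of n independent coins into the distribution of their sum.

    Assumption (following the tables): p is the probability of 0, q = 1 - p the probability of 1.
    """
    q = 1 - p
    prob = {0: p, 1: q}
    dist = defaultdict(lambda: 0)
    # Enumerate the joint sample space: all n-tuples of zeros and ones.
    for outcome in product((0, 1), repeat=n):
        joint = 1
        for x in outcome:
            joint *= prob[x]  # independence: joint probability is a product
        # Outcomes with the same number of successes are merged.
        dist[sum(outcome)] += joint
    return dict(dist)


p = Fraction(1, 3)  # illustrative value only
table3 = sampling_distribution(p)
for successes, probability in sorted(table3.items()):
    print(successes, probability)
```

With $n=2$ the four joint outcomes collapse into three values of the sum; with larger $n$ the reduction is from $2^n$ outcomes to $n+1$.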

Unless you want your students to appreciate the reduction in the sample space brought about by sampling distributions, it is not worth discussing them. See Wikipedia for examples other than the binomial.

13
Aug 16

## The pearls of AP Statistics 19

Make it easy, keep 'em busy - law of large numbers illustrated

They say: Use the "Simulating the Probability of Head With a Fair Coin" applet on the text CD or other software to illustrate the long-run definition of probability by simulating short-term and long-term results of flipping a balanced coin (Agresti and Franklin, p.216)

I say: Since it is not explained how to use "other software", the readers are in fact forced to use the applet from the text CD. This is nothing but extortion. This exercise requires the students to write down 30 observations (which is not so bad; I also practice this in a more entertaining way). But the one following it invites them to simulate 100 observations. The two exercises are too simple and time-consuming, and their sole purpose is to illustrate the law of large numbers introduced on p. 213. The students have but one life and the authors want them to lose it for AP Statistics?

The same objective can be achieved in a more efficient and instructive manner. The students are better off if they see how to model the law of large numbers in Excel. As I promised, here is the explanation.

Law of large numbers - click to view video

Here is the Excel file.
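For readers who prefer code to a spreadsheet, the same illustration can be sketched in Python. This is not a transcription of the Excel file; the function name `running_proportions`, the seed and the number of flips are my own illustrative choices.

```python
import random


def running_proportions(n_flips, seed=0):
    """Flip a fair coin n_flips times; return the proportion of heads after each flip."""
    rng = random.Random(seed)
    heads = 0
    proportions = []
    for i in range(1, n_flips + 1):
        heads += rng.random() < 0.5  # True counts as 1 (head), False as 0 (tail)
        proportions.append(heads / i)
    return proportions


props = running_proportions(10000)
# Short-run proportions fluctuate; long-run proportions settle near 1/2.
for n in (30, 100, 1000, 10000):
    print(n, round(props[n - 1], 4))
```

The 30 and 100 flips of the textbook exercises are just the first entries of `props`; the long run costs nothing extra.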

27
Jul 16

## The pearls of AP Statistics 4

The choice of the definition matters: numerical versus categorical variable

They say: A variable is called categorical if each observation belongs to one of a set of categories. A variable is called quantitative if observations on it take numerical values that represent different magnitudes of the variable (Agresti and Franklin, p.25)

I say: Not all definitions are created equal. The definition you stick to must be short, easy to remember and easy to apply. My suggestion: first say that we call numerical variables those variables that take numerical values and then add that all other variables are called categorical. In the paragraph immediately preceding the above definition, the authors have this idea. However, they choose a less transparent definition. Not a big deal, but in a book that is 800+ pages this matters.

In case of more complex notions the choice of the definition becomes critical. Definitions not only give names to objects but they also give direction to the theory and reflect the researcher’s point of view. Often understanding definitions well allows you to guess or even prove some results.

For the benefit of better students you can also tell the following. Math consists of definitions, axioms and statements. Definitions are simply names of objects. They don’t require proofs. Axioms (also called postulates) are statements that we take for granted; they don’t require proofs and are in the very basement of a theory. Statements have to be proved.

16
Jan 16

## What is a binomial random variable - analogy with market demand

On the Internet you can find a number of definitions of a binomial random variable, see Wikipedia, Stat Trek or PennState, among others. None of them seems to me as intuitive as the one provided here. The definition is given in three steps.

Step 1. Everybody knows what a coin is: it takes values $1$ and $0$ with equal probabilities $1/2$ and $1/2$. With this information, it is easy to understand what an unfair coin is: it takes value $1$ with probability $p$ and $0$ with probability $1-p$, where $p$ is some number between $0$ and $1$. This definition describes what happens when we toss an unfair coin once.

Step 2. Now let us toss the coin $n$ times and count the number of successes (getting $1$ means success and getting $0$ means failure). The random variable that describes the number of successes is called a binomial variable. There is little you can do with this definition; we need to make one more step. Let us denote by $B_n$ the binomial variable and let $C_1$, ..., $C_n$ be the outcomes on the coins. The fundamental fact suggested by the procedure of counting the number of successes is that $B_n=C_1+...+C_n$.

To illustrate this equation, consider the case $n=2$. There are $4$ possible combinations of the outcomes on the two coins: 1) $C_1=0,\ C_2=0$, 2) $C_1=0,\ C_2=1$, 3) $C_1=1,\ C_2=0$ and 4) $C_1=1,\ C_2=1$. Plug the coin values in the equation $B_2=C_1+C_2$, and you will see that in each case the equation is true.

Regarding our experiment of tossing the coin $n$ times, two remarks are in order: 1) obviously, the outcomes are independent and 2) the coins are identically distributed in the sense that the probability $p$ does not change throughout the experiment.

Step 3. It is possible to give different (and equivalent) definitions for the same thing. The one that takes the bull by the horns and can be directly applied is called a working definition. For the binomial random variable, the working definition is this: it is a sum of independent identically distributed unfair coins. That is, you write $B_n=C_1+...+C_n$ and then specify that the coins $C_1,...,C_n$ are independent and have the same $p$.
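The working definition translates directly into code. The sketch below is an illustration under my own assumptions ($n=5$, $p=0.3$, the helper name `binomial_draw`): it generates $B_n$ literally as a sum of independent unfair coins and compares the observed frequencies with the usual binomial formula $\binom{n}{k}p^k(1-p)^{n-k}$.

```python
import random
from math import comb


def binomial_draw(n, p, rng):
    """One draw of B_n = C_1 + ... + C_n, where each coin C_i equals 1 with probability p."""
    return sum(1 if rng.random() < p else 0 for _ in range(n))


n, p, reps = 5, 0.3, 50000  # illustrative values only
rng = random.Random(0)
counts = [0] * (n + 1)
for _ in range(reps):
    counts[binomial_draw(n, p, rng)] += 1

# Compare empirical frequencies of the sum with the binomial probabilities.
for k in range(n + 1):
    empirical = counts[k] / reps
    exact = comb(n, k) * p**k * (1 - p) ** (n - k)
    print(k, round(empirical, 4), round(exact, 4))
```

Nothing in `binomial_draw` mentions the binomial formula; the distribution emerges from the sum alone.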

Every Economics student knows that the market demand is equal to the sum of individual demands: $D_{market}=D_1+...+D_n$. The definition of the binomial variable is a perfect analog of this fact. Sums of random variables are omnipresent in Statistics and Theory of Probabilities. By omitting the working definition of the binomial variable, elementary Statistics textbooks, including AP Stats and Business Stats, miss the essence of Statistics.