Aug 16

The pearls of AP Statistics 22

The law of large numbers - a bird's view

They say: In 1689, the Swiss mathematician Jacob Bernoulli proved that as the number of trials increases, the proportion of occurrences of any given outcome approaches a particular number (such as 1/6) in the long run. (Agresti and Franklin, p.213).

I say: The expression “law of large numbers” appears in the book 13 times, yet its meaning is never clearly explained. The closest approximation to the truth is the above sentence about Jacob Bernoulli. To see if this explanation works, tell it to your students and ask what they understood. To me, this is a clear case where withholding theory harms understanding.

Intuition comes first. I ask my students: if you flip a fair coin 100 times, what do you expect the proportion of ones to be? Everybody replies correctly; only the form of the answer differs (50-50 or 0.5 or 50 out of 100). Then I ask: the proportion will probably not be exactly 0.5, but if you flip the coin 1000 times, do you expect it to be closer to 0.5? Everybody says: Yes. Next I ask: Suppose the coin is unfair and the probability of 1 appearing is 0.7. What would you expect the proportion to be close to in large samples? Most students come up with the right answer: 0.7. Congratulations, you have discovered what is called a law of large numbers!

Then we put our little discovery in theoretical form. p=0.7 is a population parameter. Flipping a coin n times we obtain observations X_1,...,X_n, each equal to 1 or 0. Since each observation equals 1 with probability p, the population mean is p. The proportion of ones is the sample mean \bar{X}=\frac{X_1+...+X_n}{n}. The law of large numbers says two things: 1) As the sample size increases, the sample mean approaches the population mean. 2) At the same time, its variation about the population mean becomes smaller and smaller.

Part 1) is clear to everybody. To corroborate statement 2), I give two facts. Firstly, we know that the standard deviation of the sample mean is \frac{\sigma}{\sqrt{n}}, where \sigma is the population standard deviation. From this we see that as n increases, the standard deviation of the sample mean decreases and the values of the sample mean become more and more concentrated around the population mean. We express this by saying that the sample mean converges to a spike. Secondly, I produce two histograms. With the sample size n=100, the histogram has two modes (just 10%), at 0.69 and 0.72, while 0.7 was used as the population mean in my simulations. Besides, the spread of the values is large. With n=1000, the mode (27%) is at the true value 0.7, and the spread is small.

Histogram of proportions with n=100


Histogram of proportions with n=1000
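
For readers who prefer code to Excel, here is a minimal sketch of the same kind of experiment in Python. It is not the procedure used to produce the histograms above: the function name sample_proportions, the number of repetitions (1000) and the seed are my assumptions for illustration. It compares the observed standard deviation of the sample proportion with the theoretical \frac{\sigma}{\sqrt{n}}, where \sigma=\sqrt{p(1-p)} for a coin.

```python
import random
import statistics

def sample_proportions(n, p=0.7, repetitions=1000, seed=0):
    """Return `repetitions` sample proportions, each from n flips of a coin with P(1) = p.

    The default repetitions and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    return [
        sum(rng.random() < p for _ in range(n)) / n
        for _ in range(repetitions)
    ]

for n in (100, 1000):
    props = sample_proportions(n)
    observed_sd = statistics.pstdev(props)       # spread of the simulated proportions
    theoretical_sd = (0.7 * 0.3 / n) ** 0.5      # sigma / sqrt(n), sigma = sqrt(p(1-p))
    print(n, round(statistics.mean(props), 3),
          round(observed_sd, 4), round(theoretical_sd, 4))
```

With the larger n the observed spread should come out markedly smaller, which is statement 2) in numerical form.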

Finally, we relate our little exercise to practical needs. In practice, the true mean is never known. But we can obtain a sample and calculate its mean. With a large sample size, the sample mean will be close to the truth. More generally, take any other population parameter, such as the population standard deviation, and calculate the sample statistic that estimates it, such as the sample standard deviation. Again, the law of large numbers applies and the sample statistic will be close to the population parameter. The histograms have been obtained as explained here and here. Download the Excel file.
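
The remark about other parameters can be checked the same way. The sketch below is only an illustration under my own assumptions (the function name sample_sd, the seed and the sample sizes are hypothetical): it computes the sample standard deviation of n flips of the p=0.7 coin and compares it with the population standard deviation \sqrt{p(1-p)}.

```python
import random
import statistics

def sample_sd(n, p=0.7, seed=1):
    """Sample standard deviation of n flips of a coin with P(1) = p (seed is an assumption)."""
    rng = random.Random(seed)
    flips = [1 if rng.random() < p else 0 for _ in range(n)]
    return statistics.stdev(flips)

population_sd = (0.7 * 0.3) ** 0.5  # sqrt(p(1-p)) for the coin

for n in (100, 1000, 100000):
    print(n, round(sample_sd(n), 4), round(population_sd, 4))
```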

Jul 16

The pearls of AP Statistics 4

The choice of the definition matters: numerical versus categorical variable

They say: A variable is called categorical if each observation belongs to one of a set of categories. A variable is called quantitative if observations on it take numerical values that represent different magnitudes of the variable (Agresti and Franklin, p.25)

I say: Not all definitions are created equal. The definition you stick to must be short, easy to remember and easy to apply. My suggestion: first say that we call numerical variables those variables that take numerical values and then add that all other variables are called categorical. In the paragraph immediately preceding the above definition, the authors have this idea. However, they choose a less transparent definition. Not a big deal, but in a book that is 800+ pages this matters.

In case of more complex notions the choice of the definition becomes critical. Definitions not only give names to objects but they also give direction to the theory and reflect the researcher’s point of view. Often understanding definitions well allows you to guess or even prove some results.

For the benefit of better students you can also say the following. Math consists of definitions, axioms and statements. Definitions are simply names of objects. They don’t require proofs. Axioms (also called postulates) are statements that we take for granted; they don’t require proofs and lie at the very foundation of a theory. Statements have to be proved.

Dec 15

Simulating the binomial variable - Exercise 2.2

Simulating the binomial variable in Excel and deriving its distribution

This topic is an important part of introductory Statistics. Exercise 2.2 is designed to model the binomial variable in Excel. As you may notice, sometimes I don't follow my book word-for-word.

Simulation steps

  1. A combination of the Excel commands IF and RAND produces a Bernoulli variable (a coin)
  2. By definition, the binomial is a sum of coins. I think my definition is the easiest to apply
  3. To conclude, we give the definition of the coin in a tabular form.
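
The first two steps can also be sketched in Python. This is an analogue, not the Excel file itself: I assume the coin is built with a formula of the form =IF(RAND()<p,1,0), and the function names coin and binomial are hypothetical.

```python
import random

def coin(p=0.5, rng=random):
    """Bernoulli variable: 1 with probability p, otherwise 0.

    Mirrors an Excel formula of the form =IF(RAND()<p,1,0) (assumed form).
    """
    return 1 if rng.random() < p else 0

def binomial(n, p=0.5, rng=random):
    """Binomial variable, defined as the sum of n independent coins."""
    return sum(coin(p, rng) for _ in range(n))

rng = random.Random(42)  # fixed seed so the run is reproducible
draws = [binomial(3, 0.5, rng) for _ in range(10)]
print(draws)  # ten realizations of the sum of three coins, each between 0 and 3
```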

This exercise is a good preparatory step for logical analysis of the binomial variable with three coins.

  1. Draw a table with four columns: three for the coins and one for their sum.
  2. Fill out the first line with one realization of coin values (say, three zeros) and their sum.
  3. Ask the students to fill out the other lines, with all possible combinations of the results for the coins.
  4. Then draw a new table where the outcomes are grouped by the numbers in the last column. This is the distribution of the binomial variable.
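
The table exercise above can be checked with a short Python sketch. I assume the three coins are fair, so that each of the eight rows is equally likely; the variable names are mine.

```python
from collections import Counter
from itertools import product

# First table: every combination of three coin values, plus their sum.
outcomes = list(product([0, 1], repeat=3))
for coins in outcomes:
    print(*coins, sum(coins))

# Second table: group the rows by the sum in the last column.
# With fair coins (an assumption) each row has probability 1/8.
counts = Counter(sum(coins) for coins in outcomes)
distribution = {s: counts[s] / len(outcomes) for s in sorted(counts)}
print(distribution)
```

Grouping the rows by their sum is exactly step 4: the number of rows in each group, divided by the total number of rows, gives the probability of that value.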

From here you can go to the general case. Mathematically inclined students will need a series of examples to see the importance of the binomial variable.


Figure 1. Excel file - click to view video