## Number of reads of my book on October 29, 2020

For more information see my site

26

May 20


14

Dec 16

The suggestions below are based on the College Board AP Statistics Course Description, Effective Fall 2010. According to this description, “AP teachers are encouraged to develop or maintain their own curriculum that either includes or exceeds each of these expectations; such courses will be authorized to use the ‘AP’ designation.” However, AP teachers are constrained by the statement that “The Advanced Placement Program offers a course description and exam in statistics to secondary school students who wish to complete studies equivalent to a *one semester*, *introductory*, *non-calculus-based*, college course in statistics.”

I tried to teach AP Stats in one semester following the College Board description and methodology, that is, with no derivations, giving only recipes and concentrating on applications. The students were really stretched and didn’t remember anything after completing the course, and the usefulness of the course for the subsequent calculus-based course was minimal.

**Suggestion**. Reduce the number of topics and concentrate on those that require going all the way from (again citing the description) Exploring Data to Sampling and Experimentation to Anticipating Patterns to Statistical Inference. Simple regression is such a topic.

I would drop the stem-and-leaf plot, because it is stupid, and the chi-square tests for goodness of fit, homogeneity of proportions and independence, as well as ANOVA, because they are too advanced and look too vague without the right explanation. Instead of going wide, it is better to go deeper, building upon what students already know. I’ll post a couple of regression applications.

Statistics has its specifics. Even I, with my extensive experience in Math, made quite a few discoveries for myself while studying Stats. Textbook authors, in their attempts to make the exposition accessible, often replace true statistical ideas with after-the-fact intuition, or formulas with their verbal descriptions. See, for example, the z score.

Using TI-83+ and TI-84 graphing calculators is like using a Tesla electric car in conjunction with candles for generating electricity. The sole purpose of these calculators is to prevent cheating. The inclination to cheat is a sign of low understanding and the best proof that the College Board strategy is wrong.

When we format a document in Word, we don’t care how formatting is implemented technically and we don’t need to know anything about programming. It looks like the same attitude is imparted to students of Stats. Few people notice that there is a big difference. When we format a document, we have an idea of what we want and test the result against that idea. In Stats, the idea has to be translated into a formula, and the software output has to be translated back into a formula for interpretation.

I understand that, for the majority of Stats students, the amount of algebra I use in some of my posts is not accessible. However, the opposite tendency of telling students that they don’t need to remember any formulas is unproductive. It’s only by memorizing and reproducing equations that they can improve their algebraic proficiency. Stats is largely a mental science. To improve mental activity, you have to engage in it.

**Suggestion**. Instead of “this course is non-calculus-based”, say: the course develops the ability to interpret equations and translate ideas to formulas.

The way most AP Stats books are written does not give any idea as to what comes from where. When I was an undergraduate student, I was looking for explanations, and I would have hated reading one of today’s AP Stats textbooks. For those who think, memorizing a bunch of recipes without seeing the logical links is a nightmare. In some cases, the absence of logic leads to statements that are plain wrong. Just following the logical sequence will put the pieces of the puzzle together.

20

Nov 16

## ANOVA: the artefact that survives because of the College Board

- The common argument in favor of using ANOVA is that "The methods introduced in this chapter [Comparing Groups: Analysis of Variance Methods] apply when a quantitative response variable has a categorical explanatory variable" (Agresti and Franklin, p. 680). However, categorical explanatory variables can be replaced by indicator (dummy) variables, and then regression methods can be used to study dependences involving categorical variables. On p. 695, the authors admit that "ANOVA can be presented as a special case of multiple regression".
- In terms of knowledge of basic statistical ideas (hypothesis testing, F statistics, significance level), ANOVA doesn't add any value. Those who have mastered these basic ideas will not have problems learning ANOVA at their workplace if they have to. There is no need to burden everybody with this stuff "just in case".
- The explanation of ANOVA is accompanied by definitions of the within-groups variance estimate and between-groups variance estimate (Agresti and Franklin, p. 686). Even in my courses, where I give a lot of algebra, the students don't get them unless they do a couple of theoretical exercises. At the AP Stats level, the usefulness of these definitions is nil.
- The requirement to remember how the F statistic and degrees of freedom are calculated, for the purpose of being able to interpret just one table with output from a statistical package, doesn't make sense. In my book, I have a whole chapter on ANOVA, with most derivations, and I don't remember a thing. Why torture the students?
- In the 90 years since R. Fisher invented ANOVA, many other, more precise and versatile, statistical methods have been developed.
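
The claim in the first bullet, that a categorical explanatory variable can be replaced by dummy variables and handled by regression, can be checked numerically. The sketch below is only an illustration: the data in `groups` are made up, and the function names (`anova_f`, `solve`, `regression_f`) are hypothetical helpers, not taken from Agresti and Franklin or from any package. It computes the one-way ANOVA F statistic from the between-groups and within-groups sums of squares, then the overall F statistic of a least-squares regression on an intercept and k-1 dummies, and compares the two.

```python
from statistics import mean

# Made-up data: a quantitative response measured in three groups.
groups = {
    "A": [4.0, 5.0, 6.0, 5.0],
    "B": [7.0, 8.0, 6.0, 7.0],
    "C": [9.0, 10.0, 11.0, 10.0],
}


def anova_f(groups):
    """One-way ANOVA F statistic from between- and within-groups sums of squares."""
    values = [y for ys in groups.values() for y in ys]
    n, k = len(values), len(groups)
    grand = mean(values)
    ss_between = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in groups.values())
    ss_within = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, m))
        x[i] = (aug[i][m] - tail) / aug[i][i]
    return x


def regression_f(groups):
    """Overall F statistic of a least-squares regression on an intercept and k-1 dummies."""
    labels = list(groups)
    X, y = [], []
    for label, ys in groups.items():
        for value in ys:
            # Intercept plus one indicator per group; the first group is the baseline.
            X.append([1.0] + [1.0 if label == other else 0.0 for other in labels[1:]])
            y.append(value)
    n, p = len(y), len(labels)
    # Normal equations (X'X) beta = X'y.
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    ybar = mean(y)
    ss_regression = sum((f - ybar) ** 2 for f in fitted)
    ss_residual = sum((yi - f) ** 2 for yi, f in zip(y, fitted))
    return (ss_regression / (p - 1)) / (ss_residual / (n - p))


f_anova = anova_f(groups)
f_reg = regression_f(groups)
print(f"ANOVA F = {f_anova:.4f}, regression F = {f_reg:.4f}")
```

The two numbers agree because the fitted values of the dummy regression are the group means, so the regression and residual sums of squares coincide with the between-groups and within-groups sums of squares.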

There are two suggestions:

1) Explain just the intuition and then jump to the interpretation of output, indicating the statistic to look at, as in Table 14.14.

2) The theory of ANOVA is useful for two reasons: it involves a lot of manipulation of summation signs, and there is a link to regression. Learning all this may be the only justification for studying ANOVA with definitions. In my classes, this takes 6 hours.
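
The kind of summation-sign manipulation meant in point 2 can be sketched as follows. The notation is an assumption introduced here for illustration and is not copied from Agresti and Franklin: there are $k$ groups, group $i$ has $n_i$ observations $y_{ij}$ with mean $\bar{y}_i$, the total number of observations is $n = n_1 + \dots + n_k$, and $\bar{y}$ is the grand mean. Writing $y_{ij} - \bar{y} = (y_{ij} - \bar{y}_i) + (\bar{y}_i - \bar{y})$ and squaring gives

$$
\sum_{i=1}^{k}\sum_{j=1}^{n_i}(y_{ij}-\bar{y})^2
= \sum_{i=1}^{k}\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_i)^2
+ \sum_{i=1}^{k} n_i(\bar{y}_i-\bar{y})^2 ,
$$

because the cross term vanishes: $\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_i)=0$ for every $i$. The first sum on the right is the within-groups sum of squares and the second is the between-groups sum of squares. Dividing each by its degrees of freedom gives the two variance estimates, and their ratio is the F statistic:

$$
F = \frac{\sum_{i=1}^{k} n_i(\bar{y}_i-\bar{y})^2 \,/\, (k-1)}{\sum_{i=1}^{k}\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_i)^2 \,/\, (n-k)} .
$$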