17
Mar 19

## AP Statistics the Genghis Khan way 2

Last semester I tried to explain theory through numerical examples. The results were terrible. Even the best students didn't live up to my expectations. The midterm grades were so low that I did something I had never done before: I allowed my students to write an analysis of the midterm at home. Those who were able to explain the answers to me orally received a bonus that allowed them to pass the semester.

This semester I made a U-turn. I announced that in the first half of the semester we would concentrate on theory, and we followed this approach. Out of 35 students, 20 significantly improved their performance and 15 remained where they were.

### Midterm exam, version 1

#### 1. General density definition (6 points)

a. Define the density $p_X$ of a random variable $X.$ Draw the density of heights of adults, making simplifying assumptions if necessary. Don't forget to label the axes.

b. According to your plot, how much is the integral $\int_{-\infty}^0p_X(t)dt?$ Explain.

c. Why can't the density be negative?

d. Why should the total area under the density curve be 1?

e. Where are basketball players on your graph? Write down the corresponding expression for probability.

f. Where are dwarfs on your graph? Write down the corresponding expression for probability.

This question is about the interval formula. In each case students have to write the equation for the probability and the corresponding integral of the density. At this level, I don't talk about the distribution function and introduce the density by the interval formula.
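The interval formula says that $P(a<X<b)$ equals the integral of the density from $a$ to $b$. Here is a minimal Python sketch of what the answers look like numerically, assuming (purely for illustration) that heights are normal with mean 170 cm and standard deviation 10 cm and that "basketball players" means taller than 200 cm. None of these numbers come from the exam.

```python
from statistics import NormalDist

# Assumption: adult heights are modeled as normal with mean 170 cm and
# standard deviation 10 cm. These numbers are illustrative only.
heights = NormalDist(mu=170, sigma=10)

def interval_probability(dist, a, b):
    """P(a < X < b), i.e. the integral of the density from a to b."""
    return dist.cdf(b) - dist.cdf(a)

# Part b: the integral of the density over negative heights is practically 0.
p_negative = heights.cdf(0)

# Part e: "basketball players" taken (hypothetically) as taller than 200 cm.
p_tall = 1 - heights.cdf(200)

# The area under the density over a very wide interval is practically 1.
p_total = interval_probability(heights, 0, 400)

print(p_negative, p_tall, p_total)
```

Under the normal model the probability of a negative height is not exactly zero, only negligibly small; that is one of the simplifying assumptions the question allows.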

#### 2. Properties of means (8 points)

a. Define a discrete random variable and its mean.

b. Define linear operations with random variables.

c. Prove linearity of means.

d. Prove additivity and homogeneity of means.

e. How much is the mean of a constant?

f. Using induction, derive the linearity of means for the case of $n$ variables from the case of two variables (3 points).

#### 3. Covariance properties (6 points)

a. Derive linearity of covariance in the first argument when the second is fixed.

b. How much is covariance if one of its arguments is a constant?

c. What is the link between variance and covariance? If you know one of these functions, can you find the other (there should be two answers)? (4 points)

#### 4. Standard normal variable (6 points)

a. Define the density $p_z(t)$ of a standard normal.

b. Why is the function $p_z(t)$ even? Illustrate this fact on the plot.

c. Why is the function $f(t)=tp_z(t)$ odd? Illustrate this fact on the plot.

d. Justify the equation $Ez=0.$

e. Why is $V(z)=1?$

f. Let $t>0.$ Show on the same plot the areas corresponding to the probabilities $A_1=P(0<z<t),$ $A_2=P(z>t),$ $A_3=P(z<-t),$ $A_4=P(-t<z<0).$ Write down the relationships between $A_1,...,A_4.$

#### 5. General normal variable (3 points)

a. Define a general normal variable $X.$

b. Use this definition to find the mean and variance of $X.$

c. Using part b, on the same plot graph the density of the standard normal and of a general normal with parameters $\sigma =2,$ $\mu =3.$

### Midterm exam, version 2

#### 1. General density definition (6 points)

a. Define the density $p_X$ of a random variable $X.$ Draw the density of work experience of adults, making simplifying assumptions if necessary. Don't forget to label the axes.

b. According to your plot, how much is the integral $\int_{-\infty}^0p_X(t)dt?$ Explain.

c. Why can't the density be negative?

d. Why should the total area under the density curve be 1?

e. Where are retired people on your graph? Write down the corresponding expression for probability.

f. Where are young people (up to 25 years old) on your graph? Write down the corresponding expression for probability.

#### 2. Variance properties (8 points)

a. Define variance of a random variable. Why is it non-negative?

b. Define the formula for variance of a linear combination of two variables.

c. How much is variance of a constant?

d. What is the formula for variance of a sum? What do we call homogeneity of variance?

e. What is larger: $V(X+Y)$ or $V(X-Y)$? (2 points)

f. One investor has 100 shares of Apple, another has 200 shares. Which investor's portfolio has larger variability? (2 points)

#### 3. Poisson distribution (6 points)

a. Write down the Taylor expansion and explain the idea. How are the Taylor coefficients found?

b. Use the Taylor series for the exponential function to define the Poisson distribution.

c. Find the mean of the Poisson distribution. What is the interpretation of the parameter $\lambda$ in practice?

#### 4. Standard normal variable (6 points)

a. Define the density $p_z(t)$ of a standard normal.

b. Why is the function $p_z(t)$ even? Illustrate this fact on the plot.

c. Why is the function $f(t)=tp_z(t)$ odd? Illustrate this fact on the plot.

d. Justify the equation $Ez=0.$

e. Why is $V(z)=1?$

f. Let $t>0.$ Show on the same plot the areas corresponding to the probabilities $A_1=P(0<z<t),$ $A_2=P(z>t),$ $A_3=P(z<-t),$ $A_4=P(-t<z<0).$ Write down the relationships between $A_1,...,A_4.$

#### 5. General normal variable (3 points)

a. Define a general normal variable $X.$

b. Use this definition to find the mean and variance of $X.$

c. Using part b, on the same plot graph the density of the standard normal and of a general normal with parameters $\sigma =2,$ $\mu =3.$

4
Nov 18

## Little tricks for AP Statistics

This year I am teaching AP Statistics. If the things continue the way they are, about half of the class will fail. Here is my diagnosis and how I am handling the problem.

On the surface, the students lack algebra training, but I think the problem is deeper: many of them have underdeveloped cognitive abilities. Their perception is slow, their memory is limited, their analytical abilities are rudimentary, and they are not used to working at home. Limited resources require careful allocation.

### Terminology

Short and intuitive names are better than two-word professional names.

Instead of "sample space" or "probability space" say "universe". The universe is the widest possible event, and nothing exists outside it.

Instead of "elementary event" say "atom". Simplest possible events are called atoms. This corresponds to the theoretical notion of an atom in measure theory (an atom is a measurable set which has positive measure and contains no set of smaller positive measure).

Then the formulation of classical probability becomes short. Let $n$ denote the number of atoms in the universe and let $n_A$ be the number of atoms in event $A.$ If all atoms are equally likely (have equal probabilities), then $P(A)=n_A/n.$
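To see the formula $P(A)=n_A/n$ at work, here is a minimal Python sketch. The two-dice universe and the event "the sum is 7" are my own illustrative choices, and `classical_probability` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

# Universe: all ordered outcomes of rolling two fair dice (36 atoms).
universe = set(product(range(1, 7), repeat=2))

def classical_probability(event, universe):
    """P(A) = n_A / n when all atoms are equally likely."""
    return Fraction(len(event & universe), len(universe))

# Event A: the two dice sum to 7.
event_a = {atom for atom in universe if sum(atom) == 7}

print(classical_probability(event_a, universe))  # 1/6
```

Counting atoms is all there is to it: six of the 36 atoms belong to the event.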

The clumsy "mutually exclusive events" are better replaced by more visual "disjoint sets". Likewise, instead of "collectively exhaustive events" say "events that cover the universe".

The combination of "mutually exclusive" and "collectively exhaustive" events is beyond comprehension for many. I say: if events are disjoint and cover the universe, we call them tiles. To support this definition, play one of the jigsaw puzzle videos on screen (Video 1) and produce the picture from Figure 1.

Video 1. Tiles (disjoint events that cover the universe)

Figure 1. Tiles (disjoint events that cover the universe)

### The philosophy of team work

We are in the same boat. I mean the big boat. Not the class. Not the university. It's the whole country. We depend on each other. Failure of one may jeopardize the well-being of everybody else.

You work in teams. You help each other to learn. My lectures and your presentations are just the beginning of the journey of knowledge into your heads. I cannot control how it settles there. Be my teaching assistants, share your big and little discoveries with your classmates.

I don't just preach about you helping each other. I force you to work in teams. 30% of the final grade is allocated to team work. Team work means joint responsibility. You work on assignments together. I randomly select a team member for reporting. His or her grade is what each team member gets.

This kind of team work is incompatible with the Western obsession with grades privacy. If I say my grade is nobody's business, by extension I consider the level of my knowledge a private issue. This will prevent me from asking for help and admitting my errors. The situation when students hide their errors and weaknesses from others also goes against the ethics of many workplaces. In my class all grades are public knowledge.

In some situations, keeping the grade private is technically impossible. Conducting a competition without announcing the points won is impossible. If I catch a student cheating, I announce the failing grade immediately, as a warning to others.

To those of you who think team-based learning is unfair to better students I repeat: 30% of the final grade is given for team work, not for personal achievements. The other 70% is where you can shine personally.

### Breaking the wall of silence

Team work serves several purposes.

Firstly, joint responsibility helps break communication barriers. Video 2 shows my students working in teams on classroom assignments. The situation when a weaker student is too proud to ask for help and a stronger student doesn't want to offend by offering help is not acceptable. One can ask for help or offer help without losing respect for each other.

Video 2. Teams working on assignments

Secondly, it turns on resources that are otherwise idle. Explaining something to somebody is the best way to improve your own understanding. The better students master a kind of leadership that is especially valuable in a modern society. For the weaker students, feeling responsible for a team improves motivation.

Thirdly, I save time by having to grade fewer student papers.

On exams and quizzes I mercilessly punish the students for Yes/No answers without explanations. There are no half-points for half-understanding. This, in combination with the team work and open grades policy, allows me to achieve my main objective: students are eager to talk to me about their problems.

### Set operations and probability

After studying the basics of set operations and probabilities we had a midterm exam. It revealed that about one-third of students didn't understand this material and some of that misunderstanding came from high school. During the review session I wanted to see if they were ready for a frank discussion and told them: "Those who don't understand probabilities, please raise your hands", and about one-third raised their hands. I invited two of them to work at the board.

Video 3. Translating verbal statements to sets, with accompanying probabilities

Many teachers think that the Venn diagrams explain everything about sets because they are visual. No, for some students they are not visual enough. That's why I prepared a simple teaching aid (see Video 3) and explained the task to the two students as follows:

I am shooting at the target. The target is a square with two circles on it, one red and the other blue. The target is the universe (the bullet cannot hit points outside it). The probability of a set is its area. I am going to tell you one statement after another. You write that statement in the first column of the table. In the second column write the mathematical expression for the set. In the third column write the probability of that set, together with any accompanying formulas that you can come up with. The formulas should reflect the relationships between relevant areas.

Table 1. Set operations and probabilities

| No. | Statement | Set | Probability |
|---|---|---|---|
| 1 | The bullet hit the universe | $S$ | $P(S)=1$ |
| 2 | The bullet didn't hit the universe | $\emptyset$ | $P(\emptyset )=0$ |
| 3 | The bullet hit the red circle | $A$ | $P(A)$ |
| 4 | The bullet didn't hit the red circle | $\bar{A}=S\backslash A$ | $P(\bar{A})=P(S)-P(A)=1-P(A)$ |
| 5 | The bullet hit both the red and blue circles | $A\cap B$ | $P(A\cap B)$ (in general, this is not equal to $P(A)P(B)$) |
| 6 | The bullet hit $A$ or $B$ (or both) | $A\cup B$ | $P(A\cup B)=P(A)+P(B)-P(A\cap B)$ (additivity rule) |
| 7 | The bullet hit $A$ but not $B$ | $A\backslash B$ | $P(A\backslash B)=P(A)-P(A\cap B)$ |
| 8 | The bullet hit $B$ but not $A$ | $B\backslash A$ | $P(B\backslash A)=P(B)-P(A\cap B)$ |
| 9 | The bullet hit either $A$ or $B$ (but not both) | $(A\backslash B)\cup(B\backslash A)$ | $P\left( (A\backslash B)\cup (B\backslash A)\right)=P(A)+P(B)-2P(A\cap B)$ |

During the process, I was illustrating everything on my teaching aid. This exercise allows the students to relate verbal statements to sets and further to their areas. The main point is that people need to see the logic, and that logic should be repeated several times through similar exercises.
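The rows of Table 1 can also be checked mechanically. Below is a minimal Python sketch in which the target is replaced by a hypothetical finite universe of 20 equally likely atoms, so that "area" becomes "share of atoms". The particular sets are made up for illustration.

```python
from fractions import Fraction

# Hypothetical finite "target": 20 equally likely atoms numbered 0..19.
S = set(range(20))
A = set(range(0, 8))    # "red circle"
B = set(range(5, 12))   # "blue circle", overlapping A in {5, 6, 7}

def P(event):
    """Probability as the share of atoms (the discrete analogue of area)."""
    return Fraction(len(event), len(S))

complement_A = S - A
either_not_both = (A - B) | (B - A)

# Each check mirrors a row of Table 1.
assert P(S) == 1 and P(set()) == 0
assert P(complement_A) == 1 - P(A)
assert P(A | B) == P(A) + P(B) - P(A & B)
assert P(A - B) == P(A) - P(A & B)
assert P(B - A) == P(B) - P(A & B)
assert P(either_not_both) == P(A) + P(B) - 2 * P(A & B)

# Row 5: in general P(A and B) differs from P(A)P(B).
print(P(A & B), P(A) * P(B))
```

Changing the sets `A` and `B` and rerunning the checks is one more way to repeat the same logic through similar exercises.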

8
Oct 17

## Reevaluating probabilities based on a piece of evidence

This actually has to do with Bayes' theorem. However, in simple problems one can use a dead simple approach: just find the probabilities of all elementary events. This post builds upon the post on Significance level and power of test, including the notation. Be sure to review that post.

Here is an example from the guide for Quantitative Finance by A. Patton (University of London course code FN3142).

Activity 7.2 Consider a test that has a Type I error rate of 5%, and power of 50%.

Suppose that, before running the test, the researcher thinks that both the null and the alternative are equally likely.

1. If the test indicates a rejection of the null hypothesis, what is the probability that the null is false?

2. If the test indicates a failure to reject the null hypothesis, what is the probability that the null is true?

Denote events R = {Reject null}, A = {fAil to reject null}; T = {null is True}; F = {null is False}. Then we are given:

(1) $P(F)=0.5;\ P(T)=0.5;$

(2) $P(R|T)=\frac{P(R\cap T)}{P(T)}=0.05;\ P(R|F)=\frac{P(R\cap F)}{P(F)}=0.5;$

(1) and (2) show that we can find $P(R\cap T)$ and $P(R\cap F)$ and therefore also $P(A\cap T)=P(T)-P(R\cap T)$ and $P(A\cap F)=P(F)-P(R\cap F).$ Once we know the probabilities of the elementary events, we can find everything about everything.

Figure 1. Elementary events

Answering the first question: just plug probabilities in $P(F|R)=\frac{P(R\cap F)}{P(R)}=\frac{P(R\cap F)}{P(R\cap T)+P(R\cap F)}.$

Answering the second question: just plug probabilities in $P(T|A)=\frac{P(A\cap T)}{P(A)}=\frac{P(A\cap T)}{P(A\cap T)+P(A\cap F)}.$

Patton uses Bayes' theorem and the law of total probability. The solution suggested above uses only additivity of probability.
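For those who want to see the numbers, here is a minimal Python sketch of the elementary-events approach for Activity 7.2. The variable names are mine; the inputs are the ones given in (1) and (2).

```python
# Given in Activity 7.2: equal priors, Type I error rate 5%, power 50%.
p_T, p_F = 0.5, 0.5
p_R_given_T = 0.05   # significance level
p_R_given_F = 0.50   # power

# Probabilities of the four elementary events.
p_RT = p_R_given_T * p_T   # reject, null is true
p_RF = p_R_given_F * p_F   # reject, null is false
p_AT = p_T - p_RT          # fail to reject, null is true
p_AF = p_F - p_RF          # fail to reject, null is false

# Additivity gives the probabilities of R and A.
p_R = p_RT + p_RF
p_A = p_AT + p_AF

p_F_given_R = p_RF / p_R   # question 1
p_T_given_A = p_AT / p_A   # question 2

print(round(p_F_given_R, 4), round(p_T_given_A, 4))
```

With these inputs the answers are $10/11\approx 0.909$ for the first question and $19/29\approx 0.655$ for the second.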

6
Oct 17

## Significance level and power of test

In this post we discuss several interrelated concepts: null and alternative hypotheses, type I and type II errors and their probabilities. Review the definitions of a sample space and elementary events and that of a conditional probability.

## Type I and Type II errors

Regarding the true state of nature we assume two mutually exclusive possibilities: the null hypothesis (for example, the suspect is guilty) and the alternative hypothesis (the suspect is innocent). It's up to us what to call the null and what to call the alternative. However, the statistical procedures are not symmetric: it's easier to measure the probability of rejecting the null when it is true than the other probabilities involved. This is why what is desirable to prove is usually designated as the alternative.

Usually in books you can see the following table.

| State of nature | Decision taken: Fail to reject null | Decision taken: Reject null |
|---|---|---|
| Null is true | Correct decision | Type I error |
| Null is false | Type II error | Correct decision |

This table is not good enough because there is no link to probabilities. The next video fills in the blanks.

Video. Significance level and power of test

## Significance level and power of test

The conclusion from the video is that

$\frac{P(T\cap R)}{P(T)}=P(R|T)=P(\text{Type I error})=\text{significance level}$

$\frac{P(F\cap R)}{P(F)}=P(R|F)=P(\text{Correctly rejecting false null})=\text{Power}$
10
Jul 17

## Alternatives to simple regression in Stata

In this post we looked at dependence of EARNINGS on S (years of schooling). At the end I suggested thinking about possible variations of the model. Specifically, could the dependence be nonlinear? We consider two answers to this question.

Quadratic regression is the name used for the quadratic dependence of the dependent variable on the independent variable. For our variables the dependence is

$EARNINGS=a+bS+cS^2+u$.

Note that the dependence on S is quadratic but the right-hand side is linear in the parameters, so we still are in the realm of linear regression. Video 1 shows how to run this regression.
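Because the model is linear in the parameters, ordinary least squares applies with regressors $1$, $S$ and $S^2$. Here is a minimal Python sketch of that idea (not the Stata procedure from the video). The data are made up and generated without an error term, and the helper names `solve_linear_system` and `fit_quadratic` are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution

def fit_quadratic(x, y):
    """OLS estimates (a, b, c) for y = a + b*x + c*x^2 + u."""
    rows = [[1.0, xi, xi ** 2] for xi in x]  # regressors 1, S, S^2
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve_linear_system(xtx, xty)     # normal equations

# Made-up data generated exactly from a = 1, b = 2, c = 0.5 (no error term).
schooling = [8, 10, 12, 14, 16, 18, 20]
earnings = [1 + 2 * s + 0.5 * s ** 2 for s in schooling]

a, b, c = fit_quadratic(schooling, earnings)
print(round(a, 4), round(b, 4), round(c, 4))
```

Since the data contain no error term, the estimates reproduce the coefficients used to generate them, up to rounding.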

Video 1. Running quadratic regression in Stata

## Nonparametric regression

The general way to write this model is

$y=m(x)+u.$

The beauty and power of nonparametric regression consists in the fact that we don't need to specify the functional form of dependence of $y$ on $x$. Therefore there are no parameters to interpret, there is only the fitted curve. There is also the estimated equation of the nonlinear dependence, which is too complex to consider here. I already illustrated the difference between parametric and nonparametric regression. See in Video 2 how to run nonparametric regression in Stata.
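To give an idea of what "no functional form" means in practice, here is a minimal Python sketch of one common nonparametric estimator, the Nadaraya-Watson kernel regression. I am not claiming this is the estimator Stata uses in the video; the data, the Gaussian kernel and the bandwidth are assumptions made for illustration.

```python
import math

def gaussian_kernel(u):
    """Standard normal density used as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def nadaraya_watson(x_train, y_train, x0, bandwidth):
    """Estimate m(x0) as a kernel-weighted average of the observed y values."""
    weights = [gaussian_kernel((x0 - xi) / bandwidth) for xi in x_train]
    return sum(w * yi for w, yi in zip(weights, y_train)) / sum(weights)

# Made-up data from a nonlinear dependence y = sin(x), with no error term.
x_train = [i * 0.1 for i in range(63)]
y_train = [math.sin(xi) for xi in x_train]

# The bandwidth (a hypothetical choice here) controls how smooth the fit is.
fitted = nadaraya_watson(x_train, y_train, x0=1.5, bandwidth=0.2)
print(round(fitted, 3), round(math.sin(1.5), 3))
```

The fitted value at each point is just a local average of nearby observations, so there are no coefficients to report, only the curve.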

Video 2. Nonparametric dependence

6
Jul 17

## Running simple regression in Stata

Running simple regression in Stata is, well, simple. It's just a matter of a couple of clicks. Try to make it a small research project.

1. Obtain descriptive statistics for your data (Statistics > Summaries, tables, and tests > Summary and descriptive statistics > Summary statistics). Look at all that stuff you studied in introductory statistics: units of measurement, means, minimums, maximums, and correlations. Knowing the units of measurement will be important for interpreting regression results; correlations will predict signs of coefficients, etc. In your report, don't just mechanically repeat all those measures; try to find and discuss something interesting.
2. Visualize your data (Graphics > Twoway graph). On the graph you can observe outliers and discern possible nonlinearity.
3. After running regression, report the estimated equation. It is called a fitted line and in our case looks like this: Earnings = -13.93+2.45*S (use descriptive names and not abstract X,Y). To see if the coefficient of S is significant, look at its p-value, which is smaller than 0.001. This tells us that at all levels of significance larger than or equal to 0.001 the null that the coefficient of S is zero (that is, insignificant) is rejected. This follows from the definition of p-value. Nobody cares about significance of the intercept. Report also the p-value of the F statistic. It characterizes significance of all nontrivial regressors and is important in case of multiple regression. The last statistic to report is R squared.
4. Think about possible variations of the model. Could the dependence of Earnings on S be nonlinear? What other determinants of Earnings would you suggest from among the variables in Dougherty's file?
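For readers without Stata, the fitted line and R squared from step 3 can be sketched in a few lines of Python. The numbers below are made up (they are not Dougherty's data), and `simple_regression` is a hypothetical helper. The p-values are not computed here.

```python
from statistics import mean

def simple_regression(x, y):
    """Return (intercept, slope, r_squared) of the OLS line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    ss_res = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return intercept, slope, 1 - ss_res / ss_tot

# Made-up numbers, NOT Dougherty's data: years of schooling and hourly earnings.
schooling = [10, 12, 12, 14, 16, 16, 18, 20]
earnings = [9.0, 14.5, 17.0, 20.0, 24.5, 27.5, 30.0, 36.0]

a, b, r2 = simple_regression(schooling, earnings)
print(f"Earnings = {a:.2f} + {b:.2f}*S, R squared = {r2:.3f}")
```

The slope is the sample covariance of S and Earnings divided by the sample variance of S, which is why the sign of the correlation predicts the sign of the coefficient.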

Figure 1. Looking at data. For data, we use a scatterplot.

Figure 2. Running regression (Statistics > Linear models and related > Linear regression)

29
Jun 17

## Introduction to Stata

Introduction to Stata: Stata interface, how to use Stata Help, how to use Data Editor and how to graph data. Important details to remember:

1. In any program, the first thing to use is Help. I learned everything from Help and never took any programming courses.
2. The number of observations for all variables in one data file must be the same. This can be a problem if, for example, you want to see out-of-sample predictions.
3. In Data Editor, numeric variables are displayed in black and strings are displayed in red.
4. The name of the hidden variable that counts observations is _n.
5. If you have several definitions of graphs in two-way graphs menu, they will be graphed together or separately, depending on what is enabled/disabled.

See details in videos. Sorry about the background noise!

Video 1. Stata interface. The windows introduced: Results, Command, Variables, Properties, Review and Viewer.

Video 2. Using Stata Help. Help can be used through the Viewer window or in a separate pdf viewer. Eviews Help is much easier to understand.

Video 3. Using Data Editor. How to open and view variables, the visual difference between numeric variables and string variables. The lengths of all variables in the same file must be the same.

Video 4. Graphing data. To graph a variable, you need to define its graph and then display it. It is possible to display more than one variable on the same chart.

21
Feb 17

## The pearls of AP Statistics 37

### Confidence interval: attach probability or not attach?

I am reading "5 Steps to a 5 AP Statistics, 2010-2011 Edition" by Duane Hinders (sorry, I don't have the latest edition). The tip at the bottom of p.200 says:

For the exam, be VERY, VERY clear on the discussion above. Many students
seem to think that we can attach a probability to our interpretation of a confidence
interval. We cannot.

This is one of those misconceptions that travel from book to book. Below I show how it may have arisen.

### Confidence interval derivation

The intuition behind the confidence interval and the confidence interval derivation using the z score have been given here. To stay close to Duane Hinders, I show the confidence interval derivation using the t statistic. Let $X_1,...,X_n$ be a sample of independent observations from a normal population, $\bar{X}$ the sample mean, $\mu$ the population mean and $s$ the sample standard deviation, so that $s/\sqrt{n}$ is the standard error. Skipping the intuition, let's go directly to the t statistic

(1) $t=\frac{\bar{X}-\mu}{s/\sqrt{n}}$.

At the 95% confidence level, from statistical tables find the critical value $t_{cr,0.95}$ of the t statistic such that

$P(-t_{cr,0.95}<t<t_{cr,0.95})=0.95.$

Plug here (1) to get

(2) $P(-t_{cr,0.95}<\frac{\bar{X}-\mu}{s/\sqrt{n}}<t_{cr,0.95})=0.95.$

Using equivalent transformations of inequalities (multiplying them by $s/\sqrt{n}$ and adding $\mu$ to all sides) we rewrite (2) as

(3) $P(\mu-t_{cr,0.95}\frac{s}{\sqrt{n}}<\bar{X}<\mu+t_{cr,0.95}\frac{s}{\sqrt{n}})=0.95.$

Thus, we have proved

Statement 1. The interval $\mu\pm t_{cr,0.95}\frac{s}{\sqrt{n}}$ contains the values of the sample mean with probability 95%.

The left-side inequality in (3) is equivalent to $\mu<\bar{X}+t_{cr,0.95}\frac{s}{\sqrt{n}}$ and the right-side one is equivalent to $\bar{X}-t_{cr,0.95}\frac{s}{\sqrt{n}}<\mu$. Combining these two inequalities, we see that (3) can be equivalently written as

(4) $P(\bar{X}-t_{cr,0.95}\frac{s}{\sqrt{n}}<\mu<\bar{X}+t_{cr,0.95}\frac{s}{\sqrt{n}})=0.95.$

So, we have

Statement 2. The interval $\bar{X}\pm t_{cr,0.95}\frac{s}{\sqrt{n}}$ contains the population mean with probability 95%.

### Source of the misconception

In (3), the variable in the middle ($\bar{X}$) is random, and the statement that it belongs to some interval is naturally probabilistic. People not familiar with the above derivation don't understand how a statement that the population mean (which is a constant) belongs to some interval can be probabilistic. In (4), it is the ends of the interval that are random (the sample mean and standard error are both random); that's why there is a probability! Statements 1 and 2 are equivalent!

My colleague Aidan Islyami mentioned that we should distinguish estimates from estimators.

In all statistical derivations random variables are ex-ante (before the event). No book says that but that's the way it is. An estimate is an ex-post (after the event) value of an estimator. An estimate is, of course, a number and not a random variable. Ex-ante, a confidence interval always has a probability. Ex-post, the fact that an estimate belongs to some interval is deterministic (has probability either 0 or 1) and it doesn't make sense to talk about 95%.
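The ex-ante reading of Statement 2 can be checked by simulation. Here is a minimal Python sketch, assuming samples of size 10 from a normal population (the mean 5 and standard deviation 2 are made up) and the table value $t_{cr,0.95}=2.262$ for 9 degrees of freedom.

```python
import random
from statistics import mean, stdev

SAMPLE_SIZE = 10
# Critical value of the t statistic for 95% confidence and
# SAMPLE_SIZE - 1 = 9 degrees of freedom, taken from a standard t table.
T_CRITICAL = 2.262

def interval_covers_mu(sample, mu):
    """True if X-bar +/- t_cr * s/sqrt(n) contains the population mean."""
    half_width = T_CRITICAL * stdev(sample) / len(sample) ** 0.5
    return abs(mean(sample) - mu) < half_width

def coverage_rate(mu, sigma, trials=20000, seed=0):
    """Share of simulated samples whose confidence interval contains mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(SAMPLE_SIZE)]
        hits += interval_covers_mu(sample, mu)
    return hits / trials

print(coverage_rate(mu=5.0, sigma=2.0))
```

Each simulated sample plays the role of an ex-post realization: its interval either contains $\mu$ or it doesn't. The share of intervals that do should be close to 0.95, which is the ex-ante probability in (4).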

Since confidence levels are always strictly between 0 and 100%, students should keep in mind that we deal with ex-ante variables.
14
Dec 16

## It’s time to modernize the AP Stats curriculum

The suggestions below are based on the College Board AP Statistics Course Description, Effective Fall 2010. Citing this description, “AP teachers are encouraged to develop or maintain their own curriculum that either includes or exceeds each of these expectations; such courses will be authorized to use the “AP” designation.” However, AP teachers are constrained by the statement that “The Advanced Placement Program offers a course description and exam in statistics to secondary school students who wish to complete studies equivalent to a one semester, introductory, non-calculus-based, college course in statistics.”

### Too much material for a one-semester course

I tried to teach AP Stats in one semester following the College Board description and methodology. That is, with no derivations, giving only recipes and concentrating on applications. The students were really stretched, didn’t remember anything after completing the course, and usefulness of the course for the subsequent calculus-based course was minimal.

Suggestion. Reduce the number of topics and concentrate on those that require going all the way from (again citing the description) Exploring Data to Sampling and Experimentation to Anticipating Patterns to Statistical Inference. Simple regression is such a topic.

I would drop the stem-and-leaf plot, because it is stupid, and the chi-square tests for goodness of fit, homogeneity of proportions and independence, as well as ANOVA, because they are too advanced and look too vague without the right explanation. Instead of going wide, it is better to go deeper, building upon what students already know. I’ll post a couple of regression applications.

### “Introductory” should not mean stupefying

Statistics has its specifics. Even I, with my extensive experience in Math, made quite a few discoveries for myself while studying Stats. Textbook authors, in their attempts to make exposition accessible, often replace the true statistical ideas by after-the-fact intuition or formulas by their verbal description. See, for example, the z score.

Using TI-83+ and TI-84 graphing calculators is like using a Tesla electric car in conjunction with candles for generating electricity. The sole purpose of these calculators is to prevent cheating. The inclination for cheating is a sign of low understanding and the best proof that the College Board strategy is wrong.

### Once you say “this course is non-calculus-based”, you close many doors

When we format a document in Word, we don’t care how formatting is implemented technically and we don’t need to know anything about programming. Looks like the same attitude is imparted to students of Stats. Few people notice a big difference. When we format a document, we have an idea of what we want and test the result against that idea. In Stats, the idea has to be translated to a formula, and the software output has to be translated into a formula for interpretation.

I understand that, for the majority of Stats students, the amount of algebra I use in some of my posts is not accessible. However, the opposite tendency of telling students that they don’t need to remember any formulas is unproductive. It’s only by memorizing and reproducing equations that they can augment their algebraic proficiency. Stats is largely a mental science. To improve mental activity, you have to engage in one.

Suggestion. Instead of “this course is non-calculus-based”, say: the course develops the ability to interpret equations and translate ideas to formulas.

The way most AP Stats books are written does not give any idea as to what comes from where. When I was a bachelor student, I was looking for explanations, and I would hate reading one of today’s AP Stats textbooks. For those who think, memorizing a bunch of recipes, without seeing the logical links, is a nightmare. In some cases, the absence of logic leads to statements that are plain wrong. Just following the logical sequence will put the pieces of the puzzle together.

9
Dec 16

## Ditch statistical tables if you have a computer

You don't need statistical tables if you have Excel or Mathematica. Here I give the relevant Excel and Mathematica functions described in Chapter 14 of my book. You can save all the formulas in one spreadsheet or notebook and use it multiple times.

### Cumulative Distribution Function of the Standard Normal Distribution

For a given real $z$, the value of the distribution function of the standard normal is
$F(z)=\frac{1}{\sqrt{2\pi }}\int_{-\infty }^{z}\exp (-t^{2}/2)dt.$

In Excel, use the formula =NORM.S.DIST(z,TRUE).

In Mathematica, enter CDF[NormalDistribution[0,1],z]

### Probability Function of the Binomial Distribution

For given number of successes $x,$ number of trials $n$ and probability $p$ the probability is

$P(Binomial=x)=C_{x}^{n}p^{x}(1-p)^{n-x}$.

In Excel, use the formula =BINOM.DIST(x,n,p,FALSE)

In Mathematica, enter PDF[BinomialDistribution[n,p],x]

### Cumulative Binomial Probabilities

For a given cut-off value $x,$ number of trials $n$ and probability $p$ the cumulative probability is

$P(Binomial\leq x)=\sum_{t=0}^{x}C_{t}^{n}p^{t}(1-p)^{n-t}.$

In Excel, use the formula =BINOM.DIST(x,n,p,TRUE).

In Mathematica, enter CDF[BinomialDistribution[n,p],x]

### Values of the exponential function $e^{-\lambda}$

In Excel, use the formula =EXP(-lambda)

In Mathematica, enter Exp[-lambda]

### Individual Poisson Probabilities

For given number of successes $x$ and arrival rate $\lambda$ the probability is

$P(Poisson=x)=\frac{e^{-\lambda }\lambda^{x}}{x!}.$

In Excel, use the formula =POISSON.DIST(x,lambda,FALSE)

In Mathematica, enter PDF[PoissonDistribution[lambda],x]

### Cumulative Poisson Probabilities

For given cut-off $x$ and arrival rate $\lambda$ the cumulative probability is

$P(Poisson\leq x)=\sum_{t=0}^{x}\frac{e^{-\lambda }\lambda ^{t}}{t!}.$

In Excel, use the formula =POISSON.DIST(x,lambda,TRUE)

In Mathematica, enter CDF[PoissonDistribution[lambda],x]
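If you have neither Excel nor Mathematica, the functions listed up to this point can be sketched with Python's standard library. This is a minimal sketch; the function names are mine and simply mirror the formulas given in the sections above.

```python
from math import comb, exp, factorial
from statistics import NormalDist

def normal_cdf(z):
    """F(z) for the standard normal, like =NORM.S.DIST(z,TRUE)."""
    return NormalDist().cdf(z)

def binomial_pmf(x, n, p):
    """P(Binomial = x), like =BINOM.DIST(x,n,p,FALSE)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomial_cdf(x, n, p):
    """P(Binomial <= x), like =BINOM.DIST(x,n,p,TRUE)."""
    return sum(binomial_pmf(t, n, p) for t in range(x + 1))

def poisson_pmf(x, lam):
    """P(Poisson = x), like =POISSON.DIST(x,lambda,FALSE)."""
    return exp(-lam) * lam ** x / factorial(x)

def poisson_cdf(x, lam):
    """P(Poisson <= x), like =POISSON.DIST(x,lambda,TRUE)."""
    return sum(poisson_pmf(t, lam) for t in range(x + 1))

print(normal_cdf(1.96), binomial_pmf(2, 4, 0.5), poisson_cdf(1, 2.0))
```

The cut-off points of the chi-square, t and F distributions are not covered by the standard library, so for those a statistical package is still needed.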

### Cutoff Points of the Chi-Square Distribution Function

For given probability of the right tail $\alpha$ and degrees of freedom $\nu$, the cut-off value (critical value) $\chi _{\nu,\alpha }^{2}$ is a solution of the equation
$P(\chi _{\nu}^{2}>\chi _{\nu,\alpha }^{2})=\alpha .$

In Excel, use the formula =CHISQ.INV.RT(alpha,v)

In Mathematica, enter InverseCDF[ChiSquareDistribution[v],1-alpha]

### Cutoff Points for the Student’s t Distribution

For given probability of the right tail $\alpha$ and degrees of freedom $\nu$, the cut-off value $t_{\nu,\alpha }$ is a solution of the equation $P(t_{\nu}>t_{\nu,\alpha })=\alpha$.

In Excel, use the formula =T.INV(1-alpha,v)

In Mathematica, enter InverseCDF[StudentTDistribution[v],1-alpha]

### Cutoff Points for the F Distribution

For given probability of the right tail $\alpha$, degrees of freedom $v_1$ (numerator) and $v_2$ (denominator), the cut-off value $F_{v_1,v_2,\alpha }$ is a solution of the equation $P(F_{v_1,v_2}>F_{v_1,v_2,\alpha })=\alpha$.

In Excel, use the formula =F.INV.RT(alpha,v1,v2)

In Mathematica, enter InverseCDF[FRatioDistribution[v1,v2],1-alpha]