Mar 17

Distribution function properties

The word "distribution" is repeated in elementary Stats texts hundreds of times, yet the notion of a distribution function is usually mentioned tangentially or not studied at all. In fact, the distribution function is as important as the density, and in binary choice models it is king. The full name is cumulative distribution function (cdf), but I am going to stick to the short name (used in advanced texts). This is one of the topics most students don't get on the first attempt (I was no exception).

Motivating example

Example. Electricity consumption sharply increases when everybody starts using air conditioners, and this happens when the temperature exceeds 20\,^{\circ}C. The utility company would like to know the likelihood of a jump in electricity consumption tomorrow at noon.

  1. Consider the probability P(T \le 15) that the temperature tomorrow at noon T will not exceed 15\,^{\circ}C. How does it relate to the probability P(T \le 20)? The second probability is at least as large as the first, and this can be visualized by comparing the intervals (-\infty,15] and (-\infty,20]: the first is contained in the second.
  2. Suppose in the expression P(T \le t) the real number t increases to +\infty. What happens to the probability? As the intervals extend to the right, they eventually include all possible temperatures, and the probability P(T \le t) approaches 1.
  3. Now think about t going to -\infty. Then what happens to P(T \le t)? It's the opposite of the previous case. Eventually, all possible temperatures are excluded, and the probability P(T \le t) goes to 0.
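
To put numbers on the three observations above, here is a minimal sketch. It assumes, purely for illustration, that tomorrow's noon temperature is normally distributed; the mean of 18 and standard deviation of 4 degrees are made-up values, and the helper name normal_cdf is mine, not something the example prescribes.

```python
import math

def normal_cdf(t, mean, sd):
    """P(T <= t) for a normal variable with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((t - mean) / (sd * math.sqrt(2.0))))

# Hypothetical parameters, chosen only for illustration.
MEAN_C = 18.0
SD_C = 4.0

p15 = normal_cdf(15, MEAN_C, SD_C)   # P(T <= 15)
p20 = normal_cdf(20, MEAN_C, SD_C)   # P(T <= 20)
p_jump = 1.0 - p20                   # P(T > 20): the event the utility cares about

print(f"P(T <= 15) = {p15:.4f}")
print(f"P(T <= 20) = {p20:.4f}")
print(f"P(T > 20)  = {p_jump:.4f}")

# Moving t to the right pushes P(T <= t) toward 1; moving it to the left, toward 0.
for t in (-100, 0, 15, 20, 40, 100):
    print(f"t = {t:5d}   P(T <= t) = {normal_cdf(t, MEAN_C, SD_C):.6f}")
```

Under a different assumed distribution the numbers change, but the ordering P(T \le 15) \le P(T \le 20) and the two limits do not.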


Definition. Let X be a random variable and x a real number. The distribution function F_X of X is defined by F_X(x)=P(X\le x) (the random variable X is fixed and therefore put in the subscript, whereas the real number x changes and serves as the argument).

Distribution function properties

  1. F_X(x) is the probability of the event \{ X\le x\}, so the value F_X(x) always belongs to [0,1].
  2. As the event becomes wider, the probability cannot decrease. This property is called monotonicity and is formally written as follows: if x_1\le x_2, then \{X\le x_1\}\subset\{X\le x_2\} and F_X(x_1)\le F_X(x_2).
  3. As x goes to +\infty, the event \{X\le x\} approaches the sure event \{X<+\infty\}=R and F_X(x) tends to 1.
  4. As x goes to -\infty, the event \{X\le x\} approaches the impossible event \{X=-\infty\}=\varnothing and F_X(x) tends to 0.

Figure 1. Distribution function of a normal variable

From this we conclude that the graph of the distribution function may look as in Figure 1.
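
The four properties can also be checked numerically. The sketch below estimates F_X(x) by the share of simulated observations that do not exceed x (an empirical distribution function). The standard normal sample, the seed, the grid of points and the name empirical_cdf are all my choices for illustration.

```python
import random

def empirical_cdf(sample, x):
    """Share of observations that are <= x; an estimate of F_X(x)."""
    return sum(1 for value in sample if value <= x) / len(sample)

random.seed(0)  # fixed seed so the run is reproducible
sample = [random.gauss(0.0, 1.0) for _ in range(10_000)]

grid = [-10, -2, -1, 0, 1, 2, 10]
values = [empirical_cdf(sample, x) for x in grid]

# Property 1: every value lies in [0, 1].
assert all(0.0 <= v <= 1.0 for v in values)

# Property 2: monotonicity, F(x_1) <= F(x_2) whenever x_1 <= x_2.
assert all(v1 <= v2 for v1, v2 in zip(values, values[1:]))

# Properties 3 and 4: far to the right the estimate is near 1, far to the left near 0.
for x, v in zip(grid, values):
    print(f"x = {x:4d}   F(x) ~ {v:.4f}")
```

Plotting the printed pairs gives an S-shaped curve of the kind Figure 1 describes.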

Interval formula in terms of the distribution function

In many applications we are interested in the probability of an event that X takes values in an interval, \{a<X\le b\}. Such a probability can be expressed in terms of the distribution function. For a<b, the events \{-\infty<X\le a\} and \{a<X\le b\} are disjoint, so we can apply the additivity rule to the set equation \{-\infty<X\le b\}=\{-\infty<X\le a\}\cup\{a<X\le b\} to get F_X(b)=F_X(a)+P(a<X\le b) and, finally,

(1) P(a<X\le b)=F_X(b)-F_X(a).

Definition. Equation (1) can be called an interval formula.
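
A quick way to see equation (1) at work is a fair die, where everything can be counted exactly. This is only an illustrative sketch: the die, the endpoints a=2 and b=5, and the function names cdf and interval_prob are my assumptions.

```python
from fractions import Fraction

OUTCOMES = range(1, 7)  # faces of a fair die, each with probability 1/6

def cdf(x):
    """F_X(x) = P(X <= x) for a fair die."""
    return Fraction(sum(1 for k in OUTCOMES if k <= x), 6)

def interval_prob(a, b):
    """P(a < X <= b), computed directly by counting outcomes."""
    return Fraction(sum(1 for k in OUTCOMES if a < k <= b), 6)

a, b = 2, 5
direct = interval_prob(a, b)   # outcomes 3, 4, 5
via_cdf = cdf(b) - cdf(a)      # F_X(5) - F_X(2)

print(direct, via_cdf)
```

Both numbers equal 1/2. Note that the left inequality in (1) is strict: for the die, P(2\le X\le 5)=4/6, which is not F_X(5)-F_X(2), so for discrete variables the endpoints matter.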
