Mar 17

Density function properties

A density function can give rise to a distribution function, and conversely, a distribution function can generate a density function. This is the thrust of this post, with an application to binary choice models to follow.

From distribution function to density function

Recall that the distribution function F_X is defined for any random variable X. A density is a different matter: the next definition contains an existence requirement, so in general X may not have a density (even when its distribution function is continuous).

Definition. We say that a random variable X has a density p_X if there exists an integrable function p_X such that one can find the value of the distribution function by integrating p_X:

(1) F_X(x)=\int_{-\infty}^xp_X(t)dt, for all real x.

When X does have a density, most of its values fall where the density is highest. The analog of this observation for distribution functions is as follows.

Figure 1. Relationship between density and distribution functions

Most values of X are concentrated where F_X grows fastest. Figure 1 illustrates this point. The upper panel shows the distribution function and the lower panel shows the density of the same variable. We consider a bimodal distribution with modes at 50 and 150. The variable takes most of its values around these numbers, and the distribution function grows fastest around the same numbers.
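The link between the two panels can be checked numerically. The sketch below is an illustration under assumptions, not the distribution behind Figure 1: it takes a hypothetical equal-weight mixture of two normal laws centered at 50 and 150 with standard deviation 15, and compares the density with a finite-difference slope of the distribution function. The names `density`, `distribution` and `growth_rate` are made up for this example.

```python
import math

# Assumption: equal-weight mixture of N(50, 15^2) and N(150, 15^2).
MEANS = (50.0, 150.0)
SD = 15.0

def normal_pdf(x, mu, sd):
    """Density of a normal law with mean mu and standard deviation sd."""
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu, sd):
    """Distribution function of the same normal law, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sd * math.sqrt(2.0))))

def density(x):
    """Mixture density p_X(x)."""
    return sum(0.5 * normal_pdf(x, mu, SD) for mu in MEANS)

def distribution(x):
    """Mixture distribution function F_X(x)."""
    return sum(0.5 * normal_cdf(x, mu, SD) for mu in MEANS)

def growth_rate(x, h=1e-3):
    """Central-difference approximation to the slope of F_X at x."""
    return (distribution(x + h) - distribution(x - h)) / (2.0 * h)

# Print the density next to the slope of F_X at a few points.
for x in (0, 50, 100, 150, 200):
    print(x, density(x), growth_rate(x))
```

In this example the two printed columns should agree closely, and both should be largest near 50 and 150 and small in the valley around 100.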


Interval formula. Since the integral is additive, the interval formula in terms of the distribution function implies an interval formula in terms of the density function:

(2) P(a<X\le b)=F_X(b)-F_X(a)=\int_{-\infty}^bp_X(t)dt-\int_{-\infty}^ap_X(t)dt=\int_a^bp_X(t)dt,

for any -\infty<a<b<\infty.
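As a quick sanity check of (2), one can compute the same probability in two ways: as a difference of distribution function values and as an integral of the density. The sketch below assumes, purely for illustration, the standard logistic law with F_X(x)=1/(1+e^{-x}) and p_X(x)=F_X(x)(1-F_X(x)); the helper `simpson` is a hand-written composite Simpson rule, not a library call.

```python
import math

def logistic_cdf(x):
    """Distribution function of the standard logistic law (assumed example)."""
    return 1.0 / (1.0 + math.exp(-x))

def logistic_pdf(x):
    """Its density, the derivative of logistic_cdf."""
    F = logistic_cdf(x)
    return F * (1.0 - F)

def simpson(f, a, b, n=1000):
    """Composite Simpson rule for the integral of f over [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

a, b = -1.0, 2.0
by_distribution = logistic_cdf(b) - logistic_cdf(a)  # F_X(b) - F_X(a)
by_density = simpson(logistic_pdf, a, b)              # integral of p_X over [a, b]
print(by_distribution, by_density)
```

The two numbers should coincide up to the error of the numerical integration.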

Integral of density. Letting x\rightarrow\infty in (1), from Property 3 of distribution functions we see that the density integrates to 1:

\int_{-\infty}^{\infty}p_X(t)dt=1.


Nonnegativity. Since the left side in (2) is nonnegative for any -\infty<a<b<\infty, the density must be nonnegative (strictly speaking, almost everywhere).

Rule to find density. The Newton-Leibniz formula says that the derivative of an integral with respect to its variable upper limit is the integrand evaluated at that limit (at points where the integrand is continuous):

\frac{d}{dx}\int_{-\infty}^xf(t)dt=f(x).


Applying this to (1) we see that if the density exists, it can be found by differentiating the distribution function:

p_X(x)=\frac{dF_X(x)}{dx}.

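The differentiation rule is easy to try out numerically. The sketch below assumes, as an illustration only, the exponential law with F_X(x)=1-e^{-x} for x\ge 0 (and 0 otherwise), whose density is e^{-x} for x\ge 0. The function `derivative` is a plain central difference written for this example.

```python
import math

def exp_cdf(x):
    """F_X(x) = 1 - e^{-x} for x >= 0 and 0 otherwise (assumed example)."""
    return 1.0 - math.exp(-x) if x >= 0.0 else 0.0

def exp_pdf(x):
    """The matching density: e^{-x} for x >= 0 and 0 otherwise."""
    return math.exp(-x) if x >= 0.0 else 0.0

def derivative(f, x, h=1e-5):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Compare the numerical derivative of F_X with the density at a few points.
for x in (0.5, 1.0, 3.0):
    print(x, derivative(exp_cdf, x), exp_pdf(x))
```

The points are chosen away from x=0, where this particular F_X has a kink and the derivative does not exist.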

From density function to distribution function

Suppose we have a function p_X(x) that a) is nonnegative and b) integrates to 1. Define a function F_X(x) by (1). Then F_X has the characteristic Properties 1-3 of a distribution function and is therefore the distribution function of a random variable X such that

P(a<X\le b)=\int_a^bp_X(t)dt, for any -\infty<a<b<\infty.
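The construction can be followed step by step on a computer. The sketch below starts from a hypothetical triangular function on [0,2] (nonnegative, with unit integral), builds F_X on a grid by a cumulative trapezoid rule as in (1), and then checks that the result starts at 0, ends at 1 and never decreases. The grid bounds `LO`, `HI` and the helper `F_at` are assumptions of this example.

```python
def tri_density(t):
    """Triangular function on [0, 2]: nonnegative and integrating to 1."""
    if 0.0 <= t <= 1.0:
        return t
    if 1.0 < t <= 2.0:
        return 2.0 - t
    return 0.0

# Grid that covers the support [0, 2] with room on both sides.
LO, HI, N = -1.0, 3.0, 4000
H = (HI - LO) / N
grid = [LO + i * H for i in range(N + 1)]

# Cumulative trapezoid rule: F[i] approximates the integral of the
# density from LO to grid[i]; below LO the density is zero.
F = [0.0]
for i in range(1, N + 1):
    F.append(F[-1] + 0.5 * H * (tri_density(grid[i - 1]) + tri_density(grid[i])))

def F_at(x):
    """Value of the tabulated F at the grid point nearest to x."""
    return F[round((x - LO) / H)]

nondecreasing = all(F[i] <= F[i + 1] for i in range(N))
prob = F_at(1.5) - F_at(0.5)  # P(0.5 < X <= 1.5) by the interval formula
print(F[0], F[-1], nondecreasing, prob)
```

Monotonicity comes for free here: each step adds a nonnegative amount, which is exactly where nonnegativity of the density is used.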

Remark. Since P(X=a)=\int_a^ap_X(t)dt=0, we can include the left endpoint in the interval formula without changing the result:

P(a\le X\le b)=\int_a^bp_X(t)dt, for any -\infty<a<b<\infty.