A density function can give rise to a distribution function, and conversely, a distribution function can generate a density function. This is the thrust of this post, with an application to binary choice models to follow.
From distribution function to density function
Recall that for any random variable $X$, its distribution function $F_X(x)=P(X\le x)$ is defined. As the next definition contains an existence requirement, in general $X$ may not have a density (even when it is continuous).
Definition. We say that a random variable $X$ has a density if there exists an integrable function $p_X$ such that one can find the value of the distribution function by integrating $p_X$:
(1) $F_X(x)=\int_{-\infty}^{x}p_X(t)\,dt$ for all real $x$.
When $X$ does have a density, most of its values fall where the density is the highest. The analog of this observation for distribution functions looks as follows.
Most values of $X$ are concentrated where the growth of $F_X$ is the fastest. Figure 1 illustrates this point. The upper pane exhibits the distribution function and the lower pane shows the density of the same variable. We consider a bimodal distribution with modes at 50 and 150. The variable takes most of its values around these numbers. At the same time, the rate of growth of the distribution function is the fastest around these numbers.
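The observation can be checked numerically. The sketch below is only an illustration: it assumes an equal-weight mixture of two normal distributions centered at 50 and 150 with a common standard deviation of 15, which is not necessarily the distribution behind Figure 1. The names `density`, `distribution` and `growth_rate` are hypothetical helpers introduced here.

```python
import math

SIGMA = 15.0  # assumed spread of each component; not taken from the post


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sigma):
    """Distribution function of the same normal, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def density(x):
    # Equal-weight mixture with peaks near 50 and 150 (an assumption).
    return 0.5 * normal_pdf(x, 50.0, SIGMA) + 0.5 * normal_pdf(x, 150.0, SIGMA)


def distribution(x):
    return 0.5 * normal_cdf(x, 50.0, SIGMA) + 0.5 * normal_cdf(x, 150.0, SIGMA)


def growth_rate(x, h=1e-3):
    """Central finite difference approximating the slope of the distribution function."""
    return (distribution(x + h) - distribution(x - h)) / (2.0 * h)


# Compare the density with the slope of the distribution function
# at the two peaks and in the valley between them.
for x in (50.0, 100.0, 150.0):
    print(x, round(density(x), 5), round(growth_rate(x), 5))
```

Under these assumptions the slope of the distribution function should be much larger near 50 and 150 than near 100, and it should agree with the density at each point.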
Interval formula. Since the integral is additive, the interval formula in terms of the distribution function implies an interval formula in terms of the density function:
(2) $P(a<X\le b)=F_X(b)-F_X(a)=\int_{a}^{b}p_X(t)\,dt$ for any $a<b$.
Integral of density. Letting $x\to\infty$ in (1), from Property 3 of distribution functions we see that the density integrates to 1:
$\int_{-\infty}^{\infty}p_X(t)\,dt=1.$
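Both the interval formula and the normalization can be verified on a concrete case. The sketch below assumes an exponential density with rate 1, chosen only because its distribution function has a simple closed form; `integrate` is a hypothetical midpoint-rule helper, and the upper limit 50 stands in for infinity.

```python
import math


def density(t):
    # Exponential density with rate 1 (illustrative choice): zero for t < 0.
    return math.exp(-t) if t >= 0.0 else 0.0


def distribution(x):
    # Closed-form distribution function of the same variable.
    return 1.0 - math.exp(-x) if x >= 0.0 else 0.0


def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


a, b = 0.5, 2.0
by_distribution = distribution(b) - distribution(a)  # F(b) - F(a)
by_density = integrate(density, a, b)                # integral of the density over [a, b]
total = integrate(density, 0.0, 50.0)                # should be close to 1

print(by_distribution, by_density, total)
```

The first two numbers should coincide up to the integration error, and the third should be close to 1.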
Nonnegativity. Since the left side in (2) is nonnegative for any $a<b$, the density must be nonnegative everywhere.
Rule to find density. The Newton-Leibniz formula says that the derivative of an integral with respect to the variable upper limit is the integrand evaluated at that limit (for a continuous integrand $f$):
$\frac{d}{dx}\int_{a}^{x}f(t)\,dt=f(x).$
Applying this to (1) we see that if the density exists, it can be found by differentiating the distribution function:
$p_X(x)=F_X'(x).$
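As a quick illustration of the differentiation rule, the sketch below takes the logistic distribution function $F(x)=1/(1+e^{-x})$, an example chosen here and not taken from the text above, and compares a numerical derivative with the known closed form $F(x)(1-F(x))$ of its density. The function names are hypothetical.

```python
import math


def distribution(x):
    # Logistic distribution function (illustrative choice).
    return 1.0 / (1.0 + math.exp(-x))


def density_by_differentiation(x, h=1e-5):
    """Approximate the density as the derivative of the distribution function."""
    return (distribution(x + h) - distribution(x - h)) / (2.0 * h)


def density_closed_form(x):
    # For the logistic distribution, F'(x) = F(x) * (1 - F(x)).
    F = distribution(x)
    return F * (1.0 - F)


for x in (-2.0, 0.0, 1.5):
    print(x, density_by_differentiation(x), density_closed_form(x))
```

The two columns should agree to many decimal places, which is what differentiating (1) predicts.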
From density function to distribution function
Suppose we have a function $p$ which is a) nonnegative and b) integrates to 1. Define a function $F$ by (1), with $p$ in place of the density. Then it will have characteristic Properties 1-3 of a distribution function and therefore will be a distribution function of a random variable $X$ such that
$P(a<X\le b)=\int_{a}^{b}p(t)\,dt$ for any $a<b$.
Remark. Since $P(X=a)=0$ for a variable with a density, in the interval formula we can include the left point without changing the result:
$P(a\le X\le b)=\int_{a}^{b}p(t)\,dt$ for any $a<b$.
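The construction can be tried on a simple case. The sketch below assumes a triangular function $p(t)=\max(0,\,1-|t|)$, which is nonnegative and integrates to 1, builds $F$ from it by numerical integration, and checks three features: $F$ is nondecreasing, it equals 0 far to the left, and it approaches 1 far to the right. That these are what Properties 1-3 refer to is an assumption, since the properties are not restated here.

```python
def density(t):
    # Triangular function on [-1, 1]: nonnegative, with total area 1.
    return max(0.0, 1.0 - abs(t))


def distribution(x, lower=-1.0, n=20_000):
    """F(x) as the integral of the density from -infinity to x (midpoint rule).

    The density vanishes outside [-1, 1], so integration is restricted to it.
    """
    if x <= lower:
        return 0.0
    upper = min(x, 1.0)
    h = (upper - lower) / n
    return h * sum(density(lower + (i + 0.5) * h) for i in range(n))


grid = [-2.0 + 0.25 * k for k in range(17)]  # points from -2 to 2
values = [distribution(x) for x in grid]

# Nondecreasing up to a tiny numerical tolerance.
is_monotone = all(u <= v + 1e-12 for u, v in zip(values, values[1:]))

# Interval probability computed from the constructed distribution function.
prob = distribution(0.5) - distribution(-0.5)

print(values[0], values[-1], is_monotone, prob)
```

For this density the probability of the interval from -0.5 to 0.5 can also be found by hand as the area under the triangle, which gives a reference value to compare with `prob`.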