May 18

Density of a sum of independent variables is given by convolution

This topic is fairly demanding because it relies on properties of integrals that economists usually don't study. I present the result because it is needed to solve one of the UoL problems.

General relationship between densities

Let X,Y be two independent random variables with densities f_X,f_Y. Denote by f_{X,Y} the joint density of the pair (X,Y).

By independence we have

(1) f_{X,Y}(x,y)=f_X(x)f_Y(y).

Let Z=X+Y be the sum and let f_Z,\ F_Z be its density and distribution function, respectively. Then

(2) f_Z(z)=\frac{d}{dz}F_Z(z).

These are the only simple facts in this derivation. By definition,

(3) F_Z(z)=P(Z\le z)=P(X+Y\le z).

For the last probability in (3) we have a double integral

P(X+Y\le z)=\int\int_{x+y\le z}f_{X,Y}(x,y)dxdy.

Using (1), we replace the joint density by the product of the marginal densities and the double integral by a repeated one:

(4) P(X+Y\le z)=\int\int_{x+y\le z}f_X(x)f_Y(y)dxdy=\int_R\left(\int_{-\infty}^{z-x}f_X(x)f_Y(y)dy\right)dx.


The geometry is explained in Figure 1. The region x+y\le z is bounded by the line y=z-x. In the repeated integral, we first integrate over y along the red lines from -\infty to z-x, and then, in the outer integral, over all x\in R.

Figure 1. Area of integration
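
The repeated integral in (4) can be checked numerically. The sketch below is only an illustration under my own assumptions: X and Y are taken to be standard normal, the helper names phi, Phi and cdf_of_sum are made up for this example, and the outer integral is approximated by a midpoint rule on [-10,10]. The inner integral of f_Y over (-\infty, z-x] equals F_Y(z-x), so only the outer integral needs a numerical rule. The comparison uses the standard fact that the sum of two independent standard normal variables is normal with mean 0 and variance 2.

```python
import math


def phi(x):
    """Standard normal density (assumed f_X = f_Y for this illustration)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cdf_of_sum(z, lo=-10.0, hi=10.0, n=4000):
    """Midpoint-rule approximation of the repeated integral in (4).

    The inner integral over y equals F_Y(z - x), so the repeated
    integral reduces to the integral of f_X(x) * F_Y(z - x) over x.
    """
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += phi(x) * Phi(z - x)
    return total * h


for z in (-1.0, 0.0, 1.5):
    approx = cdf_of_sum(z)
    exact = Phi(z / math.sqrt(2.0))  # Z = X + Y is N(0, 2)
    print(z, approx, exact)
```

The two printed columns should agree to several decimal places; the truncation to [-10,10] is harmless here because the normal density is negligible outside that range.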

(3) and (4) imply

F_Z(z)=\int_R\left(\int_{-\infty}^{z-x}f_X(x)f_Y(y)dy\right)dx.

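Finally, using (2) we differentiate both sides with respect to z. Differentiating under the outer integral sign (a step we do not justify here), the derivative of the inner integral with respect to its upper limit is f_X(x)f_Y(z-x), and we get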

(5) f_Z(z)=\int_Rf_X(x)f_Y(z-x)dx.

This is the result. The integral on the right is called the convolution of the functions f_X and f_Y.
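
To see formula (5) at work, here is a small numerical sketch. It is an illustration under my own assumptions, not part of the derivation: X and Y are taken to be exponential with rate 1, the names f_exp, convolve_at and gamma2 are hypothetical, and the integral over R is approximated by a midpoint rule on [-2,12]. For this choice (5) can be computed by hand: the integrand is e^{-x}e^{-(z-x)}=e^{-z} for 0\le x\le z and zero otherwise, so f_Z(z)=ze^{-z} for z\ge 0.

```python
import math


def f_exp(x):
    """Density of the exponential distribution with rate 1."""
    return math.exp(-x) if x >= 0.0 else 0.0


def convolve_at(f, g, z, lo=-2.0, hi=12.0, n=14000):
    """Midpoint-rule approximation of (f*g)(z) = int f(x) g(z - x) dx."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += f(x) * g(z - x)
    return total * h


def gamma2(z):
    """Hand-computed value of (5) for two rate-1 exponentials."""
    return z * math.exp(-z) if z >= 0.0 else 0.0


for z in (-0.5, 0.5, 1.0, 3.0):
    print(z, convolve_at(f_exp, f_exp, z), gamma2(z))
```

The numerical and hand-computed values should agree up to the discretization error of the midpoint rule.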

Remark. The existence of the density f_Z in (2) follows from the existence of f_X,f_Y, although we don't prove this fact.

Exercise. Convolution is usually denoted by (f*g)(z)=\int_Rf(x)g(z-x)dx. Prove that

  1. (f*g)(z)=(g*f)(z).

  2. \int_R|(f*g)(z)|dz\le \int_R|f(x)|dx\int_R|g(x)|dx.

  3. If X is uniformly distributed on some segment, then (f_X*f_X)(z) is zero for large z.
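
Before attempting the proofs, it may help to look at a discrete analogue of parts 1 and 2. The sketch below is not a proof and is built on my own assumptions: the integral is replaced by a finite sum, the function name discrete_convolution and the sequences a, b are invented for the example. It checks that the discrete convolution does not depend on the order of the arguments and that the sum of absolute values of the convolution does not exceed the product of the sums of absolute values.

```python
def discrete_convolution(a, b):
    """Full discrete convolution: c[k] = sum over i of a[i] * b[k - i]."""
    c = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c


# Arbitrary test sequences with mixed signs (hypothetical data).
a = [0.5, -1.0, 2.0, 0.25]
b = [1.0, 3.0, -0.5]

ab = discrete_convolution(a, b)
ba = discrete_convolution(b, a)

# Analogue of part 1: the order of the arguments does not matter.
commutes = all(abs(u - v) < 1e-12 for u, v in zip(ab, ba))

# Analogue of part 2: the "L1 norm" of the convolution is bounded
# by the product of the "L1 norms" of the factors.
lhs = sum(abs(v) for v in ab)
rhs = sum(abs(v) for v in a) * sum(abs(v) for v in b)

print(commutes, lhs, rhs, lhs <= rhs)
```

With nonnegative sequences the inequality becomes an equality, which mirrors the fact that a density obtained from (5) integrates to one.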