24 Oct 22

A problem to do once and never come back

There is a problem I gave on the midterm that does not require much imagination: just know the definitions and do the technical work. I was hoping we could put it behind us, but it turned out we could not, hence this post.

Problem. Suppose the joint density of variables X,Y is given by

f_{X,Y}(x,y)=\begin{cases}k\left( e^{x}+e^{y}\right) & \text{for }0<y<x<1, \\ 0 & \text{otherwise.}\end{cases}

I. Find k.

II. Find marginal densities of X,Y. Are X,Y independent?

III. Find conditional densities f_{X|Y},\ f_{Y|X}.

IV. Find EX,\ EY.

When solving a problem like this, the first thing to do is to lay out the theory. You may not be able to finish the long calculations without errors, but your grade will be determined mainly by the opening theoretical remarks.

I. Finding the normalizing constant

Any density should satisfy the completeness axiom: the area under the density curve (or, in this case, the volume under the density surface) must be equal to one: \int \int f_{X,Y}(x,y)dxdy=1. The constant k chosen to satisfy this condition is called a normalizing constant. The integration in general is over the whole plane R^{2}, and the first task is to express the above integral as an iterated integral. This is where the domain on which the density is nonzero should be taken into account. There is little you can do without geometry. One example of how to do this is here.

The shape of the area A=\left\{ (x,y):0<y<x<1\right\} is determined by a) the extreme values of x,y and b) the relationship between them. The extreme values are 0 and 1 for both x and y, meaning that A is contained in the square \left\{ (x,y):0<x<1,\ 0<y<1\right\}. The inequality y<x means that out of this square we keep only the triangle below the line y=x (it is indeed the lower triangle: if from a point on the line y=x we move down vertically, x stays the same and y becomes smaller than x).

In the iterated integral:

a) the lower and upper limits of integration for the inner integral are the boundaries for the inner variable; they may depend on the outer variable but not on the inner variable.

b) the lower and upper limits of integration for the outer integral are the extreme values for the outer variable; they must be constant.

This is illustrated in Pane A of Figure 1.

Figure 1. Integration order

Always enclose the inner integral in parentheses to show that you are dealing with an iterated integral.

a) In the inner integral, integrating over x means moving along the blue arrows from the boundary x=y to the boundary x=1. The limits may depend on y but not on x, because y is the outer variable.

b) In the outer integral put the extreme values for the outer variable. Thus,

\underset{A}{\int \int }f_{X,Y}(x,y)dxdy=\int_{0}^{1}\left(\int_{y}^{1}f_{X,Y}(x,y)dx\right) dy.

Check that if we first integrate over y (vertically along red arrows, see Pane B in Figure 1) then the equation

\underset{A}{\int \int }f_{X,Y}(x,y)dxdy=\int_{0}^{1}\left(\int_{0}^{x}f_{X,Y}(x,y)dy\right) dx

results.

In fact, from the definition A=\left\{ (x,y):0<y<x<1\right\} one can see that the inner interval for x is \left[ y,1\right] and for y it is \left[ 0,x\right] .
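
If you want to double-check the setup with a computer algebra system, here is a minimal SymPy sketch (the choice of Python/SymPy and the variable names are mine, not part of the problem). It evaluates the integral in both orders and solves for the normalizing constant:

```python
import sympy as sp

# My own SymPy sketch, not the author's Mathematica code.
x, y, k = sp.symbols('x y k', positive=True)
f = k * (sp.exp(x) + sp.exp(y))          # joint density on 0 < y < x < 1

# Pane A: inner integral over x from y to 1, outer over y from 0 to 1.
vol_A = sp.integrate(f, (x, y, 1), (y, 0, 1))
# Pane B: inner integral over y from 0 to x, outer over x from 0 to 1.
vol_B = sp.integrate(f, (y, 0, x), (x, 0, 1))

assert sp.simplify(vol_A - vol_B) == 0   # both orders give k*(e - 1)
k_value = sp.solve(sp.Eq(vol_A, 1), k)[0]
print(k_value, float(k_value))           # 1/(e - 1), approximately 0.581977
```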

II. Marginal densities

I can't say more about this than I said here.

The condition for independence of X,Y is f_{X,Y}\left( x,y\right)  =f_{X}\left( x\right) f_{Y}\left( y\right) (this is a direct analog of the independence condition for events P\left( A\cap B\right) =P\left( A\right) P\left( B\right) ). In words: the joint density decomposes into a product of individual densities.
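
Here is a short SymPy sketch (again my own illustration, not part of the exam solution) that obtains the marginals by integrating out the other variable over the triangle and confirms that the independence condition fails for this density:

```python
import sympy as sp

# My own SymPy sketch: marginal densities and the independence check.
x, y = sp.symbols('x y', positive=True)
k = 1 / (sp.E - 1)                              # normalizing constant from Part I
f = k * (sp.exp(x) + sp.exp(y))                 # joint density on 0 < y < x < 1

f_X = sp.simplify(sp.integrate(f, (y, 0, x)))   # marginal of X, 0 < x < 1
f_Y = sp.simplify(sp.integrate(f, (x, y, 1)))   # marginal of Y, 0 < y < 1
print(f_X)                                      # equals k*(exp(x)*(x + 1) - 1)
print(f_Y)                                      # equals k*(E - y*exp(y))

# Independence would require f = f_X * f_Y on the triangle; it fails here.
print(sp.simplify(f - f_X * f_Y) == 0)          # False
```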

III. Conditional densities

In this case the easiest is to recall the definition of conditional probability P\left( A|B\right) =\frac{P\left( A\cap B\right) }{P\left(B\right) }. The definition of conditional densities f_{X|Y},\ f_{Y|X} is quite similar:

(2) f_{X|Y}\left( x|y\right) =\frac{f_{X,Y}\left( x,y\right) }{f_{Y}\left(  y\right) },\ f_{Y|X}\left( y|x\right) =\frac{f_{X,Y}\left( x,y\right) }{f_{X}\left( x\right) }.

Of course, f_{Y}\left( y\right) ,f_{X}\left( x\right) here are the marginal densities found in Part II.
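
A similar SymPy sketch (my own illustration) applies these formulas to our density and checks that each conditional density integrates to one over its admissible range:

```python
import sympy as sp

# My own SymPy sketch: conditional densities as ratios of joint to marginal, as in (2).
x, y = sp.symbols('x y', positive=True)
k = 1 / (sp.E - 1)
f = k * (sp.exp(x) + sp.exp(y))                 # joint density on 0 < y < x < 1
f_X = sp.integrate(f, (y, 0, x))                # marginal of X
f_Y = sp.integrate(f, (x, y, 1))                # marginal of Y

f_X_given_Y = sp.simplify(f / f_Y)              # equals (e^x + e^y)/(e - y*e^y)
f_Y_given_X = sp.simplify(f / f_X)              # equals (e^x + e^y)/(e^x*(1 + x) - 1)

# Each conditional density integrates to one over its admissible range.
print(sp.simplify(sp.integrate(f_X_given_Y, (x, y, 1))))   # 1
print(sp.simplify(sp.integrate(f_Y_given_X, (y, 0, x))))   # 1
```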

IV. Finding expected values of X,Y

The usual definition EX=\int xf_{X}\left( x\right) dx takes an equivalent form using the marginal density:

EX=\int x\left( \int f_{X,Y}\left( x,y\right) dy\right) dx=\int \int  xf_{X,Y}\left( x,y\right) dydx.

Which equation to use is a matter of convenience.

Another replacement in the usual definition gives the definition of conditional expectations:

E\left( X|Y\right) =\int xf_{X|Y}\left( x|y\right) dx, E\left( Y|X\right)  =\int yf_{Y|X}\left( y|x\right) dy.

Note that these are random variables: E\left( X|Y=y\right) depends on y and E\left( Y|X=x\right) depends on x.
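
One more SymPy sketch (my own illustration) computes EX, EY by integrating over the triangle and shows that E(X|Y=y) indeed comes out as a function of y:

```python
import sympy as sp

# My own SymPy sketch: expected values and a conditional expectation.
x, y = sp.symbols('x y', positive=True)
k = 1 / (sp.E - 1)
f = k * (sp.exp(x) + sp.exp(y))                       # joint density on 0 < y < x < 1

EX = sp.integrate(x * f, (y, 0, x), (x, 0, 1))        # EX, integrating over the triangle
EY = sp.integrate(y * f, (x, y, 1), (y, 0, 1))        # EY
print(float(EX), float(EY))                           # about 0.709012 and 0.372965

# E(X|Y=y) is a function of y, obtained from the conditional density.
f_Y = sp.integrate(f, (x, y, 1))
E_X_given_Y = sp.simplify(sp.integrate(x * f / f_Y, (x, y, 1)))
print(E_X_given_Y)
```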

Solution to the problem

Being a lazy guy, for the problem this post is about I simply give the answers found with Mathematica:

I. k=\frac{1}{e-1}\approx 0.581977.

II. f_{X}\left( x\right) =k\left( e^{x}\left( 1+x\right) -1\right) for x\in \left[ 0,1\right] , f_{Y}\left( y\right) =k\left( e-ye^{y}\right) for y\in \left[ 0,1\right] .

It is readily seen that the independence condition is not satisfied.

III. f_{X|Y}\left( x|y\right) =\frac{e^{x}+e^{y}}{e-ye^{y}} for 0<y<x<1,

f_{Y|X}\left( y|x\right) =\frac{e^{x}+e^{y}}{e^{x}\left( 1+x\right) -1} for 0<y<x<1.

IV. EX\approx 0.709012, EY\approx 0.372965.

13 Apr 19

Checklist for Quantitative Finance FN3142

Students of FN3142 often think that they can get by with a few technical tricks. The questions below are mostly about the intuition that helps one understand and apply those tricks.

Everywhere we assume that ...,Y_{t-1},Y_t,Y_{t+1},... is a time series and ...,I_{t-1},I_t,I_{t+1},... is a sequence of corresponding information sets. It is natural to assume that I_t\subset I_{t+1} for all t. We use the short conditional expectation notation: E_tX=E(X|I_t).

Questions

Question 1. How do you calculate conditional expectation in practice?

Question 2. How do you explain E_t(E_tX)=E_tX?

Question 3. Simplify each of E_tE_{t+1}X and E_{t+1}E_tX and explain intuitively.

Question 4. \varepsilon _t is a shock at time t. Positive and negative shocks are equally likely. What is your best prediction now for tomorrow's shock? What is your best prediction now for the shock that will happen the day after tomorrow?

Question 5. How and why do you predict Y_{t+1} at time t? What is the conditional mean of your prediction?

Question 6. What is the error of such a prediction? What is its conditional mean?

Question 7. Answer the previous two questions replacing Y_{t+1} by Y_{t+p} .

Question 8. What is the mean-plus-deviation-from-mean representation (conditional version)?

Question 9. How is the representation from Q.8 reflected in variance decomposition?

Question 10. What is a canonical form? State and prove all properties of its parts.

Question 11. Define conditional variance for white noise process and establish its link with the unconditional one.

Question 12. How do you define the conditional density in case of two variables, when one of them serves as the condition? Use it to prove the LIE.

Question 13. Write down the joint distribution function for a) independent observations and b) for serially dependent observations.

Question 14. If one variable is a linear function of another, what is the relationship between their densities?

Question 15. What can you say about the relationship between a,b if f(a)=f(b)? Explain geometrically the definition of the quasi-inverse function.

Answers

Answer 1. Conditional expectation is a complex notion. There are several definitions of differing levels of generality and complexity. See one of them here and another in Answer 12.

The point of this exercise is that any definition requires a lot of information, and in practice there is no way to apply any of them to actually calculate conditional expectation. Then why do they juggle conditional expectation in theory? The efficient market hypothesis comes to the rescue: it is posited that all observed market data incorporate all available information and, in particular, that stock prices are already conditioned on I_t.

Answers 2 and 3. This is the best explanation I have.

Answer 4. Since positive and negative shocks are equally likely, the best prediction is E_t\varepsilon _{t+1}=0 (I call this equation a martingale condition). Similarly, E_t\varepsilon _{t+2}=0 but in this case I prefer to see an application of the LIE: E_{t}\varepsilon _{t+2}=E_t(E_{t+1}\varepsilon _{t+2})=E_t0=0.

Answer 5. The best prediction is \hat{Y}_{t+1}=E_tY_{t+1} because it minimizes E_t(Y_{t+1}-f(I_t))^2 among all functions f of current information I_t. Formally, you can use the first order condition

\frac{d}{df(I_t)}E_t(Y_{t+1}-f(I_t))^2=-2E_t(Y_{t+1}-f(I_t))=0

to find that f(I_t)=E_tf(I_t)=E_tY_{t+1} is the minimizing function. By the projector property
E_t\hat{Y}_{t+1}=E_tE_tY_{t+1}=E_tY_{t+1}=\hat{Y}_{t+1}.
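
Here is a small Monte Carlo illustration of this point (my own; the AR(1) process and parameter values are assumptions chosen for the example, not part of the guide). Among several forecasts of Y_{t+1}, the conditional mean has the smallest mean squared error:

```python
import numpy as np

# My own Monte Carlo illustration with an assumed AR(1) model:
# Y_{t+1} = phi*Y_t + eps_{t+1}, so I_t is summarized by Y_t and E_t Y_{t+1} = phi*Y_t.
rng = np.random.default_rng(0)
phi, n = 0.8, 100_000
eps = rng.normal(size=n)
Y = np.zeros(n)
for t in range(1, n):
    Y[t] = phi * Y[t - 1] + eps[t]

Y_t, Y_next = Y[:-1], Y[1:]
forecasts = {
    "conditional mean phi*Y_t": phi * Y_t,
    "naive forecast Y_t": Y_t,
    "unconditional mean 0": np.zeros_like(Y_t),
}
for name, fc in forecasts.items():
    print(name, np.mean((Y_next - fc) ** 2))
# The conditional mean yields the smallest mean squared error (close to Var(eps) = 1).
```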

Answer 6. It is natural to define the prediction error by

\hat{\varepsilon}_{t+1}=Y_{t+1}-\hat{Y}_{t+1}=Y_{t+1}-E_tY_{t+1}.

By the projector property E_t\hat{\varepsilon}_{t+1}=E_tY_{t+1}-E_tY_{t+1}=0.

Answer 7. To generalize, just change the subscripts. For the prediction we have to use two subscripts: the notation \hat{Y}_{t,t+p} means that we are trying to predict what happens at a future date t+p based on info set I_t (time t is like today). Then by definition \hat{Y}_{t,t+p}=E_tY_{t+p}, \hat{\varepsilon}_{t,t+p}=Y_{t+p}-E_tY_{t+p}.

Answer 8. Answer 7, obviously, implies Y_{t+p}=\hat{Y}_{t,t+p}+\hat{\varepsilon}_{t,t+p}. The simple case is here.

Answer 9. See the law of total variance and change it to reflect conditioning on I_t.

Answer 10. See canonical form.

Answer 11. Combine conditional variance definition with white noise definition.

Answer 12. The conditional density is defined similarly to the conditional probability. Let X,Y be two random variables. Denote by p_X the density of X and by p_{X,Y} the joint density. Then the conditional density of Y conditional on X is defined as p_{Y|X}(y|x)=\frac{p_{X,Y}(x,y)}{p_X(x)}. After this we can define the conditional expectation E(Y|X)=\int yp_{Y|X}(y|x)dy. With these definitions one can prove the Law of Iterated Expectations:

E[E(Y|X)]=\int E(Y|X=x)p_X(x)dx=\int \left( \int yp_{Y|X}(y|x)dy\right)  p_X(x)dx

=\int \int y\frac{p_{X,Y}(x,y)}{p_X(x)}p_X(x)dxdy=\int \int  yp_{X,Y}(x,y)dxdy=EY.

This is an illustration to Answer 1 and a prelim to Answer 13.
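
For a concrete check of the LIE one can reuse the joint density from the midterm problem above. The following SymPy sketch (my own illustration) verifies that E[E(Y|X)]=EY for that density:

```python
import sympy as sp

# My own SymPy check of the LIE, reusing the joint density from the midterm problem.
x, y = sp.symbols('x y', positive=True)
k = 1 / (sp.E - 1)
p = k * (sp.exp(x) + sp.exp(y))                     # joint density on 0 < y < x < 1

p_X = sp.integrate(p, (y, 0, x))                    # marginal of X
E_Y_given_X = sp.integrate(y * p / p_X, (y, 0, x))  # E(Y|X=x), a function of x

lhs = sp.integrate(sp.simplify(E_Y_given_X * p_X), (x, 0, 1))   # E[E(Y|X)]
rhs = sp.integrate(y * p, (x, y, 1), (y, 0, 1))                 # EY computed directly
print(sp.simplify(lhs - rhs))                       # 0
```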

Answer 13. Understanding this answer is essential for Section 8.6 on maximum likelihood in Patton's guide.

a) In the case of independent observations X_1,...,X_n the joint density of the vector X=(X_1,...,X_n) is the product of the individual densities:

p_X(x_1,...,x_n)=p_{X_1}(x_1)...p_{X_n}(x_n).

b) In the time series context it is natural to assume that the next observation depends on the previous ones, that is, for each t, X_t depends on X_1,...,X_{t-1} (serially dependent observations). Therefore we should work with conditional densities p_{X_1,...,X_t|X_1,...,X_{t-1}}. From Answer 12 we can guess how to make conditional densities appear:

p_{X_1,...,X_n}(x_1,...,x_n)= \frac{p_{X_1,...,X_n}(x_1,...,x_n)}{p_{X_1,...,X_{n-1}}(x_1,...,x_{n-1})} \frac{p_{X_1,...,X_{n-1}}(x_1,...,x_{n-1})}{p_{X_1,...,X_{n-2}}(x_1,...,x_{n-2})}... \frac{p_{X_1,X_2}(x_1,x_2)}{p_{X_1}(x_1)}p_{X_1}(x_1).

The fractions on the right are recognized as conditional densities. The resulting expression is pretty awkward:

p_{X_1,...,X_n}(x_1,...,x_n)=p_{X_1,...,X_n|X_1,...,X_{n-1}}(x_1,...,x_n|x_1,...,x_{n-1})\times

\times p_{X_1,...,X_{n-1}|X_1,...,X_{n-2}}(x_1,...,x_{n-1}|x_1,...,x_{n-2})... \times

p_{X_1,X_2|X_1}(x_1,x_2|x_1)p_{X_1}(x_1).
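
To see how this factorization is used in maximum likelihood, here is a sketch (my own; the AR(1) model with Gaussian errors is an assumption chosen for illustration, not Patton's example) in which the joint log-density of the sample is the log-density of the first observation plus the sum of the conditional log-densities:

```python
import numpy as np
from scipy.stats import norm

# My own sketch: the factorization turned into a log-likelihood for an assumed
# AR(1) model with Gaussian errors, x_t = phi*x_{t-1} + eps_t, eps_t ~ N(0, sigma^2).
def ar1_loglik(x, phi, sigma):
    # Density of the first observation: stationary N(0, sigma^2/(1 - phi^2)).
    ll = norm.logpdf(x[0], loc=0.0, scale=sigma / np.sqrt(1 - phi ** 2))
    # Conditional densities: p(x_t | x_1,...,x_{t-1}) depends only on x_{t-1} here.
    ll += norm.logpdf(x[1:], loc=phi * x[:-1], scale=sigma).sum()
    return ll

rng = np.random.default_rng(1)
x = np.empty(500)
x[0] = rng.normal(scale=1 / np.sqrt(1 - 0.8 ** 2))
for t in range(1, len(x)):
    x[t] = 0.8 * x[t - 1] + rng.normal()

# The log-likelihood is larger near the true phi = 0.8 than away from it.
print(ar1_loglik(x, 0.8, 1.0), ar1_loglik(x, 0.2, 1.0))
```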

Answer 14. The answer given here helps one understand how to pass from the density of the standard normal to that of the general normal.

Answer 15. This elementary explanation of the function definition can be used in the fifth grade. Note that conditions sufficient for existence of the inverse are not satisfied in a case as simple as the distribution function of the Bernoulli variable (when the graph of the function has flat pieces and is not continuous). Therefore we need a more general definition of an inverse. Those who think that this question is too abstract can check out UoL exams, where examinees are required to find Value at Risk when the distribution function is a step function. To understand the idea, do the following:

a) Draw a graph of a good function f (continuous and increasing).

b) Fix some value y_0 in the range of this function and identify the region \{y:y\ge y_0\}.

c) Find the solution x_0 of the equation f(x)=y_0. By definition, x_0=f^{-1}(y_0). Identify the region \{x:f(x)\ge y_0\}.

d) Note that x_0=\min\{x:f(x)\ge y_0\}. In general, for bad functions the minimum here may not exist. Therefore the minimum is replaced by the infimum, which gives us the definition of the quasi-inverse:

x_0=\inf\{x:f(x)\ge y_0\}.
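
To make this concrete, here is a small Python illustration (my own) that computes the quasi-inverse for the distribution function of a Bernoulli variable, exactly the kind of step function that appears in the Value at Risk questions:

```python
import numpy as np

# My own illustration: the quasi-inverse of a step distribution function.
# For X ~ Bernoulli(p) the distribution function is flat with jumps, so the ordinary
# inverse does not exist, but inf{x : F(x) >= y0} is well defined.
def bernoulli_cdf(x, p=0.3):
    # F(x) = 0 for x < 0, F(x) = 1 - p on [0, 1), F(x) = 1 for x >= 1.
    return np.where(x < 0, 0.0, np.where(x < 1, 1 - p, 1.0))

def quasi_inverse(F, y0, grid):
    # inf{x in grid : F(x) >= y0} -- the generalized (quantile) inverse on a fine grid.
    xs = grid[F(grid) >= y0]
    return xs.min() if xs.size else np.inf

grid = np.linspace(-1.0, 2.0, 3001)
for y0 in (0.5, 0.7, 0.9):
    print(y0, quasi_inverse(bernoulli_cdf, y0, grid))
# With p = 0.3 the distribution function jumps to 0.7 at x = 0 and to 1 at x = 1,
# so the quasi-inverse is 0 for y0 <= 0.7 and 1 for y0 > 0.7.
```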