Dec 22

Final exam in Advanced Statistics ST2133, 2022

Unlike most UoL exams, here I tried to relate the theory to practical issues.

KBTU International School of Economics

Compiled by Kairat Mynbaev

The total for this exam is 41 points. You have two hours.

Provide detailed explanations throughout. When answering, please clearly indicate question numbers. You don’t need a calculator. As long as the formula you provide is correct, the numerical value does not matter.

Question 1. (12 points)

a) (2 points) At a casino, two players are playing on slot machines. Their payoffs X,Y are standard normal and independent. Find the joint density of the payoffs.

b) (4 points) Two other players watch the first two players and start to argue about which will be larger: the sum U = X + Y or the difference V = X - Y. Find the joint density of U and V. Are the variables U,V independent? Find their marginal densities.

c) (2 points) Are U,V normal? Why? What are their means and variances?

d) (2 points) Which probability is larger: P(U > V) or P\left( {U < V} \right)?

e) (2 points) In this context interpret the conditional expectation E\left( {U|V = v} \right). How much is it?

Reminder. The density of a normal variable X \sim N\left( {\mu ,{\sigma ^2}} \right) is {f_X}\left( x \right) = \frac{1}{{\sqrt {2\pi {\sigma ^2}} }}{e^{ - \frac{{{{\left( {x - \mu } \right)}^2}}}{{2{\sigma ^2}}}}}.

Question 2. (9 points) The distribution of a call duration X of one Kcell [largest mobile operator in KZ] customer is exponential: {f_X}\left( x \right) = \lambda {e^{ - \lambda x}},\,\,x \ge 0,\,\,{f_X}\left( x \right) = 0,\,\,x < 0. The number N of customers making calls simultaneously is distributed as Poisson: P\left( {N = n} \right) = {e^{ - \mu }}\frac{{{\mu ^n}}}{{n!}},\,\,n = 0,1,2,... Thus the total call duration for all customers is {S_N} = {X_1} + ... + {X_N} for N \ge 1. We put {S_0} = 0. Assume that customers make their decisions about calling independently.

a) (3 points) Find the general formula (when {X_1},...,{X_n} are identically distributed and X,N are independent but not necessarily exponential and Poisson, as above) for the moment generating function of S_N explaining all steps.

b) (3 points) Find the moment generating functions of X, N and {S_N} for your particular distributions.

c) (3 points) Find the mean and variance of {S_N}. Based on the equations you obtained, can you suggest estimators of parameters \lambda ,\mu ?

Remark. Direct observations on the exponential and Poisson distributions are not available. We have to infer their parameters by observing {S_N}. This explains the importance of the technique used in Question 2.

Question 3. (8 points)

a) (2 points) For a non-negative random variable X prove the Markov inequality P\left( {X > c} \right) \le \frac{1}{c}EX,\,\,\,c > 0.

b) (2 points) Prove the Chebyshev inequality P\left( {|X - EX| > c} \right) \le \frac{1}{c^2}Var\left( X \right) for an arbitrary random variable X.

c) (4 points) We say that the sequence of random variables \left\{ X_n \right\} converges in probability to a random variable X if P\left( {|{X_n} - X| > \varepsilon } \right) \to 0 as n \to \infty for any \varepsilon > 0.  Suppose that E{X_n} = \mu for all n and that Var\left(X_n \right) \to 0 as n \to \infty . Prove that then \left\{X_n\right\} converges in probability to \mu .

Remark. Question 3 leads to the simplest example of a law of large numbers: if \left\{ X_n \right\} are i.i.d. with finite variance, then their sample mean converges to their population mean in probability.
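
The remark can be illustrated numerically. The following is a minimal simulation sketch, not part of the exam: it assumes fair coin tosses (mean 0.5, variance 0.25) and compares the observed frequency of large deviations of the sample mean with the Chebyshev bound Var(X)/(n\varepsilon^2). The names `deviation_frequency`, `trials` and `eps` and the seed are illustrative choices.

```python
import random

rng = random.Random(0)          # fixed seed so the run is reproducible
mu, variance = 0.5, 0.25        # mean and variance of one fair coin toss (assumed example)
eps, trials = 0.1, 200          # deviation threshold and number of repeated experiments


def deviation_frequency(n):
    """Fraction of experiments in which |sample mean - mu| > eps for sample size n."""
    count = 0
    for _ in range(trials):
        sample_mean = sum(rng.random() < 0.5 for _ in range(n)) / n
        if abs(sample_mean - mu) > eps:
            count += 1
    return count / trials


results = {}
for n in (10, 100, 1000):
    # Chebyshev applied to the sample mean, whose variance is variance / n
    bound = min(1.0, variance / (n * eps ** 2))
    results[n] = (deviation_frequency(n), bound)
    print(n, results[n])
```

As n grows the bound shrinks like 1/n, which is exactly the mechanism behind part c).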

Question 4. (8 points)

a) (4 points) Define a distribution function. Give its properties, with intuitive explanations.

b) (4 points) Is a sum of two distribution functions a distribution function? Is a product of two distribution functions a distribution function?

Remark. The answer for part a) is here and the one for part b) is based on it.

Question 5. (4 points) The Rakhat factory prepares prizes for kids for the upcoming New Year event. Each prize contains one type of chocolates and one type of candies. The chocolates and candies are chosen randomly from two production lines, the total number of items is always 10 and all selections are equally likely.

a) (2 points) What proportion of prepared prizes contains three or more chocolates?

b) (2 points) 100 prizes have been sent to an orphanage. What is the probability that 50 of those prizes contain no more than two chocolates?

Apr 20

FN3142 Chapter 13. Risk management and Value-at-Risk: Models

Chapter 13 is divided into 5 parts. For each part, there is a video with the supporting pdf file. Both have been created in Notability using an iPad. All files are here.

Part 1. Distribution function with two examples and generalized inverse function.

Part 2. Value-at-Risk definition

Part 3. Empirical distribution function and its estimation

Part 4. Models based on flexible distributions

Part 5. Semiparametric models, nonparametric estimation of densities and historical simulation.

In addition, the subchapter named Expected shortfall contains further information. It is not in the guide but it was required by one of the past UoL exams.

Apr 18

Solution to Question 1 from UoL exam 2016, Zone B

This problem is good preparation for Question 2 from UoL exam 2015, Zone A (FN3142), which is more difficult.

Problem statement

Two corporations each have a 4% chance of going bankrupt and the event that one of the two companies will go bankrupt is independent of the event that the other company will go bankrupt. Each company has outstanding bonds. A bond from any of the two companies will return R=0% if the corporation does not go bankrupt, and if it does, a bondholder will lose the face value of the investment, i.e., R=-100%. Suppose an investor buys $1000 worth of bonds of the first corporation, which is then called portfolio P_1, and similarly, an investor buys $1000 worth of bonds of the second corporation, which is then called portfolio P_2.
(a) [40 marks] Calculate the VaR at \alpha=5% critical level for each portfolio and for the joint portfolio P=P_1+P_2.
(b) [30 marks] Is VaR sub-additive in this example? Explain why the absence of sub-additivity may be a concern for risk managers.
(c) [30 marks] The expected shortfall ES^\alpha at the \alpha=5% critical level can be defined as ES^\alpha=E_t[R|R\le VaR^\alpha_{t+1}]. Calculate the expected shortfall for the portfolios P_1,\ P_2 and P. Is this risk measure sub-additive?


There are a couple of general ideas to understand before embarking on calculations. The return on the bond of one company is a binary variable taking values 0% and -100%. All calculations involving it are similar to the ones for the coin. After doing calculations the return figures can be translated to dollar amounts by multiplying by $1000.

While the use of the notions of the distribution function and generalized inverse can be avoided, I prefer to use them to show the general approach.

(a) The return on one bond is described by the table

Table 1. Probability table for return on one bond

Return values (%)    Probability
0                    0.96
-100                 0.04

Therefore its distribution function can be found in the same way as for the coin:

Figure 1. Distribution function for return on one bond

The distribution function is shown in red. It is zero for R<-100, 0.04 for -100\le R<0 and 1 for 0\le R<\infty. The definition of the VaR requires inversion of this function. The graph of this function has flat pieces and its usual inverse does not exist. We have to use the generalized inverse defined by

F^{-1}(y)=\inf\{x:F(x)\ge y\},

see the definition of the infimum here. In our case y=0.05 and the verbal procedure is: 1) find those returns for which F(R)\ge 0.05 (it's the half-axis [0,\infty)) and 2) among them find the least return. The answer is VaR^\alpha=0%. This is the Value at Risk for each of P_1,P_2.
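
The verbal procedure above can be sketched in code. This is a minimal illustration for a discrete distribution only; the function name `generalized_inverse` and the small tolerance are my own choices, not part of the exam solution.

```python
def generalized_inverse(values, probs, y, tol=1e-12):
    """Return inf{x : F(x) >= y} for a discrete distribution.

    values, probs: possible outcomes and their probabilities.
    tol guards against floating-point round-off in the running sum.
    """
    cumulative = 0.0
    for value, prob in sorted(zip(values, probs)):  # go through outcomes in increasing order
        cumulative += prob                          # cumulative equals F(value)
        if cumulative + tol >= y:
            return value                            # the least x with F(x) >= y
    raise ValueError("y exceeds the total probability")


# Table 1: return on one bond, in percent
one_bond_values = [0, -100]
one_bond_probs = [0.96, 0.04]

var_one_bond = generalized_inverse(one_bond_values, one_bond_probs, 0.05)
print(var_one_bond)  # 0, because F(-100) = 0.04 < 0.05 and F(0) = 1 >= 0.05
```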

What we do next is very similar to the derivation of the sampling distribution for two coins.

Table 2. Joint probability table for returns on two portfolios

                            First portfolio
                            0                         -100
Second portfolio    0       0.96^2=0.9216             0.04\times 0.96=0.0384
                    -100    0.04\times 0.96=0.0384    0.04^2=0.0016

The main body of the table contains probabilities of pairs (R_1,R_2) of the two returns. For the total portfolio the possible return values are 0 (none of the companies goes bankrupt), -50 (one goes bankrupt and the other does not) and -100 (both go bankrupt). The corresponding probabilities follow from Table 2 and we get

Table 3. Probability table for return on the total portfolio

Total return (%)    Probability
0                   0.9216
-50                 2\times 0.0384=0.0768
-100                0.0016

This table results in the following distribution function:

Figure 2. Distribution function for return on two bonds

Since F(-100)=0.0016<0.05 and F(-50)=0.0016+0.0768=0.0784\ge 0.05, the least return with F(R)\ge 0.05 is -50, so the Value at Risk is -50% (use the generalized inverse).
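
The passage from the joint probabilities to the distribution of the total return, and then to the Value at Risk, can be sketched as follows. This is only an illustration under the assumptions of the problem (independent bonds, equal weights); the names `bond`, `total` and `var_level` are hypothetical.

```python
from itertools import product

p_default = 0.04
bond = {0: 1 - p_default, -100: p_default}   # return (%) -> probability, as in Table 1

# Distribution of the total return: average of the two independent returns,
# because the two $1000 investments carry equal weight.
total = {}
for (r1, q1), (r2, q2) in product(bond.items(), repeat=2):
    r = (r1 + r2) / 2
    total[r] = total.get(r, 0.0) + q1 * q2   # independence: joint probability is the product


def var_level(dist, alpha):
    """Generalized inverse of the distribution function at level alpha."""
    cumulative = 0.0
    for r in sorted(dist):
        cumulative += dist[r]
        if cumulative >= alpha:
            return r
    return max(dist)


var_total = var_level(total, 0.05)
print(total)      # probabilities close to 0.9216, 0.0768, 0.0016
print(var_total)  # -50.0
```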

(b) Translating the percentages to dollars, at 5% the risk for each of the bonds is $0 and for the total portfolio it is $1000 (50% of $2000; I am passing from negative percentages to positive loss figures).

We say that Value at Risk is sub-additive if VaR_P^\alpha\le VaR^\alpha_{P_1}+VaR^\alpha_{P_2} (with the VaR expressed as a positive loss). In our example $1000>$0+$0, so Value at Risk is not sub-additive here, even though the returns are independent. This has an important practical implication. Suppose that a financial institution has several branches and each of them keeps its Value at Risk, say, at zero. Nevertheless, the Value at Risk for the whole institution may well be large and threaten its stability.

(c) Here we have to apply the definition of the conditional expectation: ES^\alpha_{P_i}=\frac{E_t[R1_{\{R\le VaR^\alpha_{t+1}\}}]}{P(R\le VaR^\alpha_{t+1})}. Since P(R\le 0)=0.96+0.04=1, this is the same as

ES^\alpha_{P_i}=E_t[R1_{\{R\le 0\}}]=0\times 0.96+(-100)\times 0.04=-4%, i=1,2.

For the total portfolio we get

ES^\alpha_P=\frac{E_t[R1_{\{R\le -50\}}]}{P(R\le -50)}=\frac{(-50)\times 0.0768+(-100)\times 0.0016}{0.0768+0.0016}=\frac{-3.84-0.16}{0.0784}=-51.02%.

In monetary terms (again passing to positive loss figures), this translates to $40 for each bond and to about $1020.40 for the total portfolio. Since $1020.40 exceeds $40+$40=$80, the conclusion is that expected shortfall, as defined here, is not sub-additive in this example.
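
The arithmetic in part (c) can be checked with a short script. This is a sketch that follows the definition of expected shortfall given in the problem statement; the helper name `expected_shortfall` is hypothetical, and the distributions are copied from Tables 1 and 3.

```python
def expected_shortfall(dist, var):
    """E[R | R <= var] for a discrete distribution given as {return: probability}."""
    tail_prob = sum(p for r, p in dist.items() if r <= var)
    tail_sum = sum(r * p for r, p in dist.items() if r <= var)   # E[R * 1{R <= var}]
    return tail_sum / tail_prob


one_bond = {0: 0.96, -100: 0.04}                  # Table 1
total = {0: 0.9216, -50: 0.0768, -100: 0.0016}    # Table 3

es_one = expected_shortfall(one_bond, 0)          # about -4 (%)
es_total = expected_shortfall(total, -50)         # about -51.02 (%)

# Convert to positive dollar losses: $1000 per bond, $2000 for the joint portfolio
loss_one = -es_one / 100 * 1000
loss_total = -es_total / 100 * 2000

print(es_one, es_total)
print(loss_one, loss_total)
print(loss_total <= 2 * loss_one)                 # False: not sub-additive in this example
```

With the unrounded percentage the dollar figure for the total portfolio is about $1020.41; the $1020.40 in the text comes from the rounded value 51.02%.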


Mar 17

Distribution function properties

The word "distribution" is repeated in elementary Stats texts hundreds of times yet the notion of a distribution function is usually mentioned tangentially or not studied at all. In fact, the distribution function is as important as the density and in binary choice models it is the king. The full name is a cumulative distribution function (cdf) but I am going to stick to the short name (used in advanced texts). This is one of the topics most students don't get on the first attempt (I was not an exception).

Motivating example

Example. Electricity consumption sharply increases when everybody starts using air conditioners, and this happens when temperature exceeds 20\,^{\circ}C. The utility company would like to know the likelihood of a jump in electricity consumption tomorrow at noon.

  1. Consider the probability P(T \le 15) that the temperature tomorrow at noon T will not exceed 15\,^{\circ}C. How does it relate to the probability P(T \le 20)? The second probability is at least as large as the first, and this can be visualized by comparing the intervals (-\infty,15] and (-\infty,20]: the first is contained in the second.
  2. Suppose in the expression P(T \le t) the real number t increases to +\infty. What happens to the probability? As the intervals extend to the right, they eventually include all possible temperatures, and the probability P(T \le t) approaches 1.
  3. Now think about t going to -\infty. Then what happens to P(T \le t)? It's the opposite of the previous case. Eventually, all possible temperatures are excluded, and the probability P(T \le t) goes to 0.


Definition. Let X be a random variable and x a real number. The distribution function F_X of X is defined by F_X(x)=P(X\le x) (the random variable X is fixed and therefore put in the subscript, whereas the real number x changes and serves as the argument).

Distribution function properties

  1. F_X(x) is the probability of the event \{ X\le x\}, so the value F_X(x) always belongs to [0,1].
  2. As the event becomes wider, the probability increases. This property is called monotonicity and is formally written as follows: if x_1\le x_2, then \{X\le x_1\}\subset\{X\le x_2\} and F_X(x_1)\le F_X(x_2).
  3. As x goes to +\infty, the event \{X\le x\} approaches a sure event \{X<+\infty\}=R and F_X(x) tends to 1.
  4. As x goes to -\infty , the event \{X\le x\} approaches an impossible event \{X=-\infty\}=\varnothing and F_X(x) tends to 0.

Figure 1. Distribution function of a normal variable

From this we conclude that the graph of the distribution function may look as in Figure 1.
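
The four properties can be checked numerically for a normal variable. The following is a minimal sketch, assuming a standard normal distribution and using the closed form of its distribution function in terms of the error function; the grid and the name `normal_cdf` are illustrative choices.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Distribution function of N(mu, sigma^2) expressed through math.erf."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


grid = [-8 + 0.5 * k for k in range(33)]   # points from -8 to 8 in steps of 0.5
values = [normal_cdf(x) for x in grid]

in_unit_interval = all(0.0 <= v <= 1.0 for v in values)       # property 1
monotone = all(a <= b for a, b in zip(values, values[1:]))    # property 2

print(in_unit_interval, monotone)
print(values[-1])   # close to 1 at the right end (property 3)
print(values[0])    # close to 0 at the left end (property 4)
```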

Interval formula in terms of the distribution function

In many applications we are interested in the probability of the event \{a<X\le b\} that X takes values in an interval. This probability can be expressed in terms of the distribution function. The events \{-\infty<X\le a\} and \{a<X\le b\} are disjoint, so we can apply the additivity rule to the set equation \{-\infty<X\le b\}=\{-\infty<X\le a\}\cup\{a<X\le b\} to get

F_X(b)=F_X(a)+P(a<X\le b)

and, finally,

(1) P(a<X\le b)=F_X(b)-F_X(a).

Definition. Equation (1) can be called an interval formula.
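
As an illustration of equation (1), here is a small sketch for a normal variable. The temperature parameters (mean 18 and standard deviation 4) are made up for the example and are not taken from the text; `interval_probability` is a hypothetical helper name.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Distribution function of N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def interval_probability(a, b, mu=0.0, sigma=1.0):
    """P(a < X <= b) = F(b) - F(a), i.e. equation (1)."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)


# Standard normal: probability of falling within one standard deviation of the mean
p_one_sigma = interval_probability(-1.0, 1.0)
print(round(p_one_sigma, 4))   # 0.6827

# Hypothetical temperature T ~ N(18, 4^2): probability that 15 < T <= 20
p_temperature = interval_probability(15.0, 20.0, mu=18.0, sigma=4.0)
print(p_temperature)
```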