21
Jan 23

## Excel for mass education

### Problem statement

The Covid pandemic, with its lockdowns, has posed a difficult question: how do you teach online and prevent cheating by students? How do you do that efficiently with a large number of students and without lowering the teaching standards? I think the answer depends on what you teach. Using Excel made my course very attractive because many students adore learning Excel functions.

### Suggested solution

Last year I taught Financial Econometrics. The topic was Portfolio Optimization using the Sharpe ratio. The idea was to give the students Excel files with individual data sizes so that they have to do the calculations themselves. Those who tried to obtain a file from another student and send it to me under their own names were easily identified. I punished both the giver and receiver of the file. Some steps for assignment preparation and report checking may be very time consuming if you don’t automate them. In the following list, the starred steps are the ones that may take a lot of time with large groups of students.

Step 1. I download data for several stocks from Yahoo Finance and put them in one Excel file where I have the students’ list (Video 1).

Step 2. For each student I randomly choose the sample size for the chunk of data to be selected from the data I downloaded. The students are required to use the whole sample in their calculations.

Step 3*. For creating individual student files with assignments, I use a Visual Basic macro. It reads a student name, his or her sample size, creates an Excel file, pastes there the appropriate sample and saves the file under that student’s name (Video 2).

Step 4*. In Gmail I prepare messages with individual Excel files. Gmail has an option for scheduling emails (Video 3). Outlook.com also has this feature but it requires too many clicks.

Step 5. The test is administered using MS Teams. At the beginning of the test, I give the necessary oral instructions and post the assignment description (which is common for all students). The emails are scheduled to be sent 10 minutes after the session start. The time for the test is just enough to do calculations in Excel. I cannot control how the students do it nor can I see if they share screens to help each other. But I know that the task is difficult enough, so one needs to be familiar with the material in order to accomplish the task, even when one sees on the screen how somebody else is doing it.

Step 6*. Upon completion of the test, the students email me their files. The message arrival times are recorded by Gmail. I have to check the files and post the grades (Video 4).

### Skills to test

Portfolio Optimization involves the following steps.

a) For each stock one has to find daily rates of return.

b) Using arbitrary initial portfolio shares, the daily rates of return on the portfolio are calculated. I require the students to use matrix multiplication for this, which makes checking their work easier.

c) The daily rates of return on the portfolio are used to find the average return, standard deviation and Sharpe ratio for the portfolio. The fact that after all these calculations the students have to obtain a single number also simplifies verification.

d) Finally, the students have to optimize the portfolio shares using the Solver add-in.
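Steps a) to c) can be sketched outside Excel as a cross-check of the single number the students report. The following is a minimal sketch in Python, not the workbook used in the course: the tickers `AAA` and `BBB`, the prices and the equal shares are made-up values, and the risk-free rate is assumed to be zero. Step d) (optimizing the shares with Solver) is not covered.

```python
from statistics import mean, stdev

def daily_returns(prices):
    """Simple daily rates of return from a list of closing prices (step a)."""
    return [p1 / p0 - 1 for p0, p1 in zip(prices, prices[1:])]

def portfolio_returns(returns_by_stock, shares):
    """Daily portfolio returns: returns matrix (days x stocks) times the shares vector (step b)."""
    days = zip(*returns_by_stock)  # one tuple of stock returns per day
    return [sum(w * r for w, r in zip(shares, day)) for day in days]

def sharpe_ratio(returns, risk_free=0.0):
    """Average excess return divided by the sample standard deviation (step c)."""
    return (mean(returns) - risk_free) / stdev(returns)

# Hypothetical closing prices for two stocks (made-up numbers, not Yahoo Finance data)
prices = {
    "AAA": [100.0, 102.0, 101.0, 104.0, 103.0],
    "BBB": [50.0, 49.5, 50.5, 51.0, 52.0],
}
shares = [0.5, 0.5]  # arbitrary initial portfolio shares, summing to 1

rets = [daily_returns(p) for p in prices.values()]
port = portfolio_returns(rets, shares)
print(sharpe_ratio(port))
```

Because everything collapses to one number, comparing it with the value in a student's file is a quick first check before looking at the formulas.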

The list above is just an example. The task can be expanded to check the knowledge of other elements of matrix algebra, Econometrics and/or Finance. In one of my assignments, I required my students to run a multiple regression. The Excel add-in called Data Analysis allows one to do that easily but my students were required to do everything using the matrix expression for the OLS estimator and also to report the results using Excel string functions.

To make my job easier, I partially or completely automate time-consuming operations. Arguably, everything can be completely automated using Power Automate promoted by Microsoft. Except for the macro, my home-made solutions are simpler.

### Detailed explanations

How to make Gmail your mailto protocol handler

Video 1. Initial file

Video 2. Creating Excel individual files

Video 3. Scheduling emails

Video 4. How to quickly check students' work

Macro for creating files

Sub CreateDataFiles()
'
' This needs a file with student names (column A), block sizes (column C)
' and data to choose data blocks from (columns F through M). All on sheet "block finec"
' It creates files with student names and individual data blocks
' If necessary, edit whatever you want
' You can also change the range address. R1C5 - upper left corner of the data
' "R" & Size & "C13" - lower right corner of the data
' Size is read off column C

' First select the cells with block sizes and then run the macro

' Files will be created and saved with student names
' Keyboard Shortcut: Ctrl+i
'
Dim cell As Range
Dim Size As Long
Dim StudentName As String

Application.ScreenUpdating = False
For Each cell In Selection.Cells

Size = cell.Value
StudentName = cell.Offset(0, -2).Value

' Select and copy the data block with the first Size rows
Application.Goto Reference:="R1C5:R" & Size & "C13"
Application.CutCopyMode = False
Selection.Copy
' Create a new workbook and paste the block there
Workbooks.Add
ActiveSheet.Paste
ChDir "C:\Users\Student files"
ActiveWorkbook.SaveAs Filename:= _
"C:\Users\Student files\" & StudentName & ".xlsx", _
FileFormat:=xlOpenXMLWorkbook, CreateBackup:=False
ActiveWorkbook.Close

' Return to the workbook with the student list
Workbooks("Stat 2 Spring 2022 list with emails.xlsm").Activate

Next
Application.ScreenUpdating = True
End Sub

2
Oct 22

## Strategies for the crashing market

This year is a wonderful time to short the market. During the pandemic the Fed has been pumping money into the market, and it was clear that the huge rally from March 2020 to December 2021 was nothing but a bubble. It was also obvious that the rally would be reversed by the turn from Quantitative Easing to Quantitative Tightening. Since just about all assets are falling, shorting the market is a very low risk play. The question of timing the trades will not be discussed here (January, March, May and August have been the best entry points). We look into how options can be used for shorting.

When the market crashes, shorting indices (S&P 500, NASDAQ, Dow 30 and Russell 2000) or their proxies (exchange traded funds SPY, QQQ, DIA and IWM) is less risky than shorting individual stocks. That's what I learned (among many other things) from John Carter. Using put options instead of shorting stocks requires less capital. A further reduction in the capital requirement is achieved by using put debit spreads (their effect on buying power is zero).

Denote by $p(K)$ the price of a put with a strike $K.$ We know that $p\left(K\right)$ increases with $K$ (you have to pay more for the right to sell at a higher price). A put debit spread strategy has been discussed earlier. It consists of two put options (sorry for the notation change): buy a put $p(K_{1})$ with a higher strike $K_{1}$ and sell a put $p\left( K_{2}\right)$ with a lower strike $K_{2}.$ The initial outlay is $debit=p\left( K_{1}\right) -p\left( K_{2}\right) >0.$ Let $S\left( T\right)$ be the stock price at expiration $T.$ The max profit is $K_{1}-K_{2}-debit$ and it is positive if $K_{1}-debit>K_{2}.$ The max loss is $-debit<0.$

The payoff is illustrated in Figure 1.

Figure 1. Put debit spread on the left, call credit spread on the right

A call credit spread is a strategy consisting of two call options: buy a call $c(K_{1})$ with a higher strike $K_{1}$ and sell a call $c\left(K_{2}\right)$ with a lower strike $K_{2}.$ Since you have to pay more for the right to buy at a lower price, we have $c\left( K_{2}\right) >c(K_{1})$ and you are credited $credit=c\left( K_{2}\right) -c\left( K_{1}\right) >0.$ For the payoff we have 3 cases.

1) Case $S\left( T\right) \leq K_{2}.$ Both calls are out of the money and the max profit is $credit.$

2) Case $K_{2}<S\left( T\right) \leq K_{1}.$ The $c(K_{2})$ call, being in the money, is exercised by the buyer and you lose $K_{2}-S\left( T\right)<0.$ The $c\left( K_{1}\right)$ is out of the money and expires worthless. Thus the payoff is $credit+K_{2}-S\left( T\right)$ and the break-even stock price is $S\left( T\right) =credit+K_{2}.$

3) Case $S\left( T\right) >K_{1}$. Both calls, being in the money, are exercised. The profit from the long call $c\left( K_{1}\right)$ is $S\left( T\right) -K_{1}$ and the loss from the short call $c(K_{2})$ is $K_{2}-S\left( T\right)$, so the payoff is their sum plus the credit: $K_{2}-K_{1}+credit.$ Normally, it is a loss.

## Comparison of the two strategies

The above discussion is summarized in Figure 1. Both strategies can be used if the outlook is bearish. Here we indicate two situations when one is preferred over the other.

Situation 1. Following unexpected bad news, the market falls a lot in one day and it is clear from macroeconomics that it will continue to go down for some time. If prior to the fall there was a healthy rally, it didn't make sense to buy a put. Right after the fall volatility increases, so puts become expensive. Buying a put debit spread is appropriate because volatilities from the long and short legs of the spread offset each other and one can increase the potential gain by selecting the strikes $K_{1},K_{2}$ further apart.

Situation 2. A different approach is appropriate if a strong one-day fall is expected and any further developments are hard to predict. In this case selling a call credit spread with close expiration is recommended. This allows the investor to take advantage of the time decay. The strikes are selected above the current price $S:$ $S<K_{2}<K_{1}.$ The time value decay for the short call $c\left( K_{2}\right)$ will be greater than for the long call $c\left(K_{1}\right)$ (because the latter has a higher probability of staying out of the money). Therefore the open profit will quickly approach the max profit $credit$ and the spread can be closed out earlier. This allows the investor to capture most of the premium received from placing the trade.

## Mathematical approach to evaluating strategies

This will be explained using call credit spreads.

Step 1. The payoff from a long call, neglecting the price $c\left( K\right)$ paid, is $\max \left\{ S\left( T\right) -K,0\right\}$ (if $S\left( T\right) \leq K,$ you throw away the call and get $0;$ if $S\left( T\right) >K,$ you exercise the right to buy the stock and get $S\left( T\right) -K$).

Step 2. What the long party gains, the short party loses, so the payoff from the short position (this time neglecting the credit received) is $-\max \left\{ S\left( T\right) -K,0\right\}$ $=\min \left\{ 0,K-S\left( T\right)\right\} .$

Step 3. Let $K_{2}<K_{1}.$ The payoff from the call credit spread is the sum of the payoffs from the first two steps:

$\max \left\{ S\left( T\right) -K_{1},0\right\} -\max \left\{ S\left( T\right) -K_2,0\right\}.$

Evaluating this expression for different price intervals gives the next table:

$\left\{\begin{array}{ll} 0, & S\left( T\right) \leq K_{2} \\ K_{2}-S\left( T\right) , & K_{2}<S\left( T\right) \leq K_{1} \\ K_{2}-K_{1}, & K_{1}<S\left( T\right) \end{array}\right.$

Step 4. Adding the premium received $credit=c\left( K_{2}\right) -c\left(K_{1}\right)$ we get the total payoff

$\left\{\begin{array}{ll} credit, & S\left( T\right) \leq K_{2} \\ K_{2}-S\left( T\right) +credit, & K_{2}<S\left( T\right) \leq K_{1} \\ K_{2}-K_{1}+credit, & K_{1}<S\left( T\right) \end{array}\right.$
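Steps 1 to 4 translate directly into code. Below is a minimal sketch in Python that evaluates the call credit spread payoff from the $\max$ expressions rather than from the case-by-case table, so the two can be compared. The strikes are the ones used for the pictures in this post; the value of `CREDIT` is a hypothetical premium, not market data.

```python
def call_credit_spread_payoff(s_t, k1, k2, credit):
    """Payoff at expiration: long call at k1, short call at k2 (k2 < k1), plus the credit."""
    long_call = max(s_t - k1, 0.0)     # Step 1: long call payoff
    short_call = -max(s_t - k2, 0.0)   # Step 2: short call payoff
    return long_call + short_call + credit  # Steps 3 and 4

K1, K2 = 390.0, 380.0  # strikes, with K2 < K1
CREDIT = 4.0           # hypothetical premium received

for s_t in (370.0, 380.0, 384.0, 390.0, 400.0):
    print(s_t, call_credit_spread_payoff(s_t, K1, K2, CREDIT))
```

With these numbers the payoff equals the credit up to $S(T)=380$, crosses zero at the break-even price $credit+K_{2}=384$ and stays at $K_{2}-K_{1}+credit=-6$ above $390$.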

Exercise. For a long put the payoff is $\max \{K-S\left( T\right) ,0\},$ for a short one $-\max \{K-S\left( T\right) ,0\},$ for the strategy with $K_{2}<K_{1}$ it is $\max \{K_{1}-S\left( T\right) ,0\}$ $-\max \{K_{2}-S\left(T\right) ,0\}-debit,$ where debit is as above.

The pictures have been produced in Mathematica with $K_{1}=390;$ $K_{2}=380.$ I put them side by side for you to better see the difference. The risk-reward ratio is better for the put debit spread (the left chart) than for the call credit spread (the right chart). The latter should be closed earlier and it takes longer for the open profit/loss of the former to approach the max profit. Such strategies could have been used a couple of times on SPY since September 12, 2022.

5
May 22

## Vector autoregression (VAR)

Suppose we are observing two stocks and their respective returns are $x_{t},y_{t}.$ To take into account their interdependence, we consider a vector autoregression

(1) $\left\{\begin{array}{c} x_{t}=a_{1}x_{t-1}+b_{1}y_{t-1}+u_{t} \\ y_{t}=a_{2}x_{t-1}+b_{2}y_{t-1}+v_{t}\end{array}\right.$

Try to repeat for this system the analysis from Section 3.5 (Application to an AR(1) process) of the Guide by A. Patton and you will see that the difficulties are insurmountable. However, matrix algebra allows one to overcome them, with proper adjustment.

### Problem

A) Write this system in a vector format

(2) $Y_{t}=\Phi Y_{t-1}+U_{t}.$

What should $Y_{t},\Phi ,U_{t}$ be in this representation?

B) Assume that the error $U_{t}$ in (2) satisfies

(3) $E_{t-1}U_{t}=0,\ EU_{t}U_{t}^{T}=\Sigma ,~EU_{t}U_{s}^{T}=0$ for $t\neq s$ with some symmetric matrix $\Sigma =\left(\begin{array}{cc}\sigma _{11} & \sigma _{12} \\\sigma _{12} & \sigma _{22} \end{array}\right) .$

What does this assumption mean in terms of the components $u_{t},v_{t}$ of $U_{t}$ from (1)? What is $\Sigma$ if the errors in (1) satisfy

(4) $E_{t-1}u_{t}=E_{t-1}v_{t}=0,~Eu_{t}^{2}=Ev_{t}^{2}=\sigma ^{2},$ $Eu_{s}u_{t}=Ev_{s}v_{t}=0$ for $t\neq s,$ $Eu_{s}v_{t}=0$ for all $s,t?$

C) Suppose (1) is stationary. The stationarity condition is expressed in terms of eigenvalues of $\Phi$ but we don't need it. However, we need its implication:

(5) $\det \left( I-\Phi \right) \neq 0$.

Find $\mu =EY_{t}.$

D) Find $Cov(Y_{t-1},U_{t}).$

E) Find $\gamma _{0}\equiv V\left( Y_{t}\right) .$

F) Find $\gamma _{1}=Cov(Y_{t},Y_{t-1}).$

G) Find $\gamma _{2}.$

Solution

A) It takes some practice to see that with the notation

$Y_{t}=\left(\begin{array}{c}x_{t} \\y_{t}\end{array}\right) ,$ $\Phi =\left(\begin{array}{cc} a_{1} & b_{1} \\a_{2} & b_{2}\end{array}\right) ,$ $U_{t}=\left( \begin{array}{c}u_{t} \\v_{t}\end{array}\right)$

the system (1) becomes (2).

B) The equations in (3) look like this:

$E_{t-1}U_{t}=\left(\begin{array}{c}E_{t-1}u_{t} \\ E_{t-1}v_{t}\end{array}\right) =0,$ $EU_{t}U_{t}^{T}=\left( \begin{array}{cc}Eu_{t}^{2} & Eu_{t}v_{t} \\Eu_{t}v_{t} & Ev_{t}^{2} \end{array}\right) =\left(\begin{array}{cc} \sigma _{11} & \sigma _{12} \\ \sigma _{12} & \sigma _{22}\end{array} \right) ,$

$EU_{t}U_{s}^{T}=\left(\begin{array}{cc} Eu_{t}u_{s} & Eu_{t}v_{s} \\Ev_{t}u_{s} & Ev_{t}v_{s} \end{array}\right) =0.$

Equalities of matrices are understood element-wise, so we get a series of scalar equations $E_{t-1}u_{t}=0,...,Ev_{t}v_{s}=0$ for $t\neq s.$

Conversely, the scalar equations from (4) give

$E_{t-1}U_{t}=0,\ EU_{t}U_{t}^{T}=\left(\begin{array}{cc} \sigma ^{2} & 0 \\0 & \sigma ^{2}\end{array} \right) ,~EU_{t}U_{s}^{T}=0$ for $t\neq s$.

C) (2) implies $EY_{t}=\Phi EY_{t-1}+EU_{t}=\Phi EY_{t-1}$ or by stationarity $\mu =\Phi \mu$ or $\left( I-\Phi \right) \mu =0.$ Hence (5) implies $\mu =0.$

D) From (2) we see that $Y_{t-1}$ depends only on $I_{t-1}$ (the information set at time $t-1$). Therefore by the LIE

$Cov(Y_{t-1},U_{t})=E\left( Y_{t-1}-EY_{t-1}\right) U_{t}^{T}=E\left[ \left( Y_{t-1}-EY_{t-1}\right) E_{t-1}U_{t}^{T}\right] =0,$

$Cov\left( U_{t},Y_{t-1}\right) =\left[ Cov(Y_{t-1},U_{t})\right] ^{T}=0.$

E) Using the previous post

$\gamma _{0}\equiv V\left( \Phi Y_{t-1}+U_{t}\right) =\Phi V\left( Y_{t-1}\right) \Phi ^{T}+Cov\left( U_{t},Y_{t-1}\right) \Phi ^{T}+\Phi Cov(Y_{t-1},U_{t})+V\left( U_{t}\right)$

$=\Phi \gamma _{0}\Phi ^{T}+\Sigma$

(by stationarity and (3)). Thus, $\gamma _{0}-\Phi \gamma _{0}\Phi ^{T}=\Sigma$ and $\gamma _{0}=\sum_{s=0}^{\infty }\Phi ^{s}\Sigma\left( \Phi ^{T}\right) ^{s}$ (see previous post).

F) Using the previous result we have

$\gamma _{1}=Cov(Y_{t},Y_{t-1})=Cov(\Phi Y_{t-1}+U_{t},Y_{t-1})=\Phi Cov(Y_{t-1},Y_{t-1})+Cov(U_{t},Y_{t-1})$

$=\Phi Cov(Y_{t-1},Y_{t-1})=\Phi \gamma _{0}=\Phi \sum_{s=0}^{\infty }\Phi ^{s}\Sigma\left( \Phi ^{T}\right) ^{s}.$

G) Similarly,

$\gamma _{2}=Cov(Y_{t},Y_{t-2})=Cov(\Phi Y_{t-1}+U_{t},Y_{t-2})=\Phi Cov(Y_{t-1},Y_{t-2})+Cov(U_{t},Y_{t-2})$

$=\Phi Cov(Y_{t-1},Y_{t-2})=\Phi \gamma _{1}=\Phi ^{2}\sum_{s=0}^{\infty }\Phi ^{s}\Sigma\left( \Phi ^{T}\right) ^{s}.$
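The formulas for $\gamma _{0},\gamma _{1},\gamma _{2}$ can be checked numerically. The sketch below, in plain Python without any matrix library, truncates the infinite series for $\gamma _{0}$ after 200 terms (an assumption that is harmless only when the powers of $\Phi$ decay quickly) and verifies that the result satisfies $\gamma _{0}-\Phi \gamma _{0}\Phi ^{T}=\Sigma .$ The matrices `PHI` and `SIGMA` are hypothetical; `PHI` was chosen with eigenvalues inside the unit circle, and `SIGMA` corresponds to case (4) with $\sigma ^{2}=1.$

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def gamma0(phi, sigma, terms=200):
    """Truncated series: sum over s = 0..terms-1 of Phi^s Sigma (Phi^T)^s."""
    phi_t = transpose(phi)
    total = [[0.0] * len(sigma) for _ in sigma]
    term = sigma  # s = 0 term
    for _ in range(terms):
        total = add(total, term)
        term = matmul(matmul(phi, term), phi_t)  # next term of the series
    return total

PHI = [[0.5, 0.1], [0.2, 0.3]]    # hypothetical coefficients of system (1)
SIGMA = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical error variance matrix

g0 = gamma0(PHI, SIGMA)
g1 = matmul(PHI, g0)  # gamma_1 = Phi gamma_0
g2 = matmul(PHI, g1)  # gamma_2 = Phi gamma_1

# gamma_0 - Phi gamma_0 Phi^T should reproduce Sigma
minus = [[-x for x in row] for row in matmul(matmul(PHI, g0), transpose(PHI))]
residual = add(g0, minus)
print(g0, g1, g2, residual)
```

Note that $\gamma _{0}$ comes out symmetric, while $\gamma _{1}$ and $\gamma _{2}$ in general do not.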

Autocorrelations require a little more effort and I leave them out.

5
May 22

## Vector autoregressions: preliminaries

Suppose we are observing two stocks and their respective returns are $x_{t},y_{t}.$ A vector autoregression for the pair $x_{t},y_{t}$ is one way to take into account their interdependence. This theory is undeservedly omitted from the Guide by A. Patton.

### Required minimum in matrix algebra

Matrix notation and summation are very simple.

Matrix multiplication is a little more complex. Make sure to read Global idea 2 and the compatibility rule.

The general approach to study matrices is to compare them to numbers. Here you see the first big No: matrices do not commute, that is, in general $AB\neq BA.$

The idea behind matrix inversion is pretty simple: we want an analog of the property $a\times \frac{1}{a}=1$ that holds for numbers.

Some facts about determinants have very complicated proofs and it is best to stay away from them. But a couple of ideas should be clear from the very beginning. Determinants are defined only for square matrices. The relationship of determinants to matrix invertibility explains the role of determinants. If $A$ is square, it is invertible if and only if $\det A\neq 0$ (this is an equivalent of the condition $a\neq 0$ for numbers).

Here is an illustration of how determinants are used. Suppose we need to solve the equation $AX=Y$ for $X,$ where $A$ and $Y$ are known. Assuming that $\det A\neq 0$ we can premultiply the equation by $A^{-1}$ to obtain $A^{-1}AX=A^{-1}Y.$ (Because of lack of commutativity, we need to keep the order of the factors). Using intuitive properties $A^{-1}A=I$ and $IX=X$ we obtain the solution: $X=A^{-1}Y.$ In particular, we see that if $\det A\neq 0,$ then the equation $AX=0$ has a unique solution $X=0.$

Let $A$ be a square matrix and let $X,Y$ be two square matrices of the same size. $A,Y$ are assumed to be known and $X$ is unknown. We want to check that $X=\sum_{s=0}^{\infty }A^{s}Y\left( A^{T}\right) ^{s}$ solves the equation $X-AXA^{T}=Y,$ assuming the series converges. (Note that for this equation the trick used to solve $AX=Y$ does not work.) Just plug in $X:$

$\sum_{s=0}^{\infty }A^{s}Y\left( A^{T}\right) ^{s}-A\sum_{s=0}^{\infty }A^{s}Y\left( A^{T}\right) ^{s}A^{T}$ $=Y+\sum_{s=1}^{\infty }A^{s}Y\left(A^{T}\right) ^{s}-\sum_{s=1}^{\infty }A^{s}Y\left( A^{T}\right) ^{s}=Y$

(write out a couple of first terms in the sums if summation signs frighten you).

Transposition is a geometrically simple operation. We need only the property $\left( AB\right) ^{T}=B^{T}A^{T}.$

### Variance and covariance

Property 1. Variance of a random vector $X$ and covariance of two random vectors $X,Y$ are defined by

$V\left( X\right) =E\left( X-EX\right) \left( X-EX\right) ^{T},$ $Cov\left( X,Y\right) =E\left( X-EX\right) \left( Y-EY\right) ^{T},$

respectively.

Note that when $EX=0,$ variance becomes

$V\left( X\right) =EXX^{T}=\left( \begin{array}{ccc}EX_{1}^{2} & ... & EX_{1}X_{n} \\ ... & ... & ... \\ EX_{1}X_{n} & ... & EX_{n}^{2}\end{array}\right) .$

Property 2. Let $X,Y$ be random vectors and suppose $A,B$ are constant matrices. We want an analog of $V\left( aX+bY\right) =a^{2}V\left( X\right) +2abCov\left( X,Y\right) +b^{2}V\left( Y\right) .$ In the next calculation we have to remember that the multiplication order cannot be changed.

$V\left( AX+BY\right) =E\left[ AX+BY-E\left( AX+BY\right) \right] \left[ AX+BY-E\left( AX+BY\right) \right] ^{T}$

$=E\left[ A\left( X-EX\right) +B\left( Y-EY\right) \right] \left[ A\left( X-EX\right) +B\left( Y-EY\right) \right] ^{T}$

$=E\left[ A\left( X-EX\right) \right] \left[ A\left( X-EX\right) \right] ^{T}+E\left[ B\left( Y-EY\right) \right] \left[ A\left( X-EX\right) \right] ^{T}$

$+E\left[ A\left( X-EX\right) \right] \left[ B\left( Y-EY\right) \right] ^{T}+E\left[ B\left( Y-EY\right) \right] \left[ B\left( Y-EY\right) \right] ^{T}$

(applying $\left( AB\right) ^{T}=B^{T}A^{T}$)

$=AE\left( X-EX\right) \left( X-EX\right) ^{T}A^{T}+BE\left( Y-EY\right) \left( X-EX\right) ^{T}A^{T}$

$+AE\left( X-EX\right) \left( Y-EY\right) ^{T}B^{T}+BE\left( Y-EY\right) \left( Y-EY\right) ^{T}B^{T}$

$=AV\left( X\right) A^{T}+BCov\left( Y,X\right) A^{T}+ACov(X,Y)B^{T}+BV\left( Y\right) B^{T}.$

16
Jun 21

## Solution to Question 1 from UoL exam 2020

The assessment was an open-book take-home online assessment with a 24-hour window. No attempt was made to prevent cheating, except a warning, which was pretty realistic. Before an exam it's a good idea to see my checklist.

Question 1. Consider the following ARMA(1,1) process:

(1) $z_{t}=\gamma +\alpha z_{t-1}+\varepsilon _{t}+\theta \varepsilon _{t-1}$

where $\varepsilon _{t}$ is a zero-mean white noise process with variance $\sigma ^{2}$, and assume $|\alpha |,|\theta |<1$ and $\alpha+\theta \neq 0$, which together make sure $z_{t}$ is covariance stationary.

(a) [20 marks] Calculate the conditional and unconditional means of $z_{t}$, that is, $E_{t-1}[z_{t}]$ and $E[z_{t}].$

(b) [20 marks] Set $\alpha =0$. Derive the autocovariance and autocorrelation function of this process for all lags as functions of the parameters $\theta$ and $\sigma$.

(c) [30 marks] Assume now $\alpha \neq 0$. Calculate the conditional and unconditional variances of $z_{t},$ that is, $Var_{t-1}[z_{t}]$ and $Var[z_{t}].$

Hint: for the unconditional variance, you might want to start by deriving the unconditional covariance between the variable and the innovation term, i.e., $Cov[z_{t},\varepsilon _{t}].$

(d) [30 marks] Derive the autocovariance and autocorrelation for lags of 1 and 2 as functions of the parameters of the model.

Hint: use the hint of part (c).

## Solution

### Part (a)

Reminder: The definition of a zero-mean white noise process is

(2) $E\varepsilon _{t}=0,$ $Var(\varepsilon _{t})=E\varepsilon_{t}^{2}=\sigma ^{2}$ for all $t$ and $Cov(\varepsilon _{j},\varepsilon_{i})=E\varepsilon _{j}\varepsilon _{i}=0$ for all $i\neq j.$

A variable indexed $t-1$ is known at moment $t-1$ and at all later moments and behaves like a constant for conditioning at such moments.

Moment $t$ is future relative to $t-1.$ The future is unpredictable and the best guess about the future error is zero.

The recurrence relation in (1) shows that

(3) $z_{t-1}=\gamma +\alpha z_{t-2}+...$ does not depend on the information that arrives at time $t$ and later.

Hence, using also linearity of conditional means,

(4) $E_{t-1}z_{t}=E_{t-1}\gamma +\alpha E_{t-1}z_{t-1}+E_{t-1}\varepsilon _{t}+\theta E_{t-1}\varepsilon _{t-1}=\gamma +\alpha z_{t-1}+\theta\varepsilon _{t-1}.$

The law of iterated expectations (LIE): application of $E_{t-1},$ based on information available at time $t-1,$ and subsequent application of $E,$ based on no information, gives the same result as application of $E.$

$Ez_{t}=E[E_{t-1}z_{t}]=E\gamma +\alpha Ez_{t-1}+\theta E\varepsilon _{t-1}=\gamma +\alpha Ez_{t-1}.$

Since $z_{t}$ is covariance stationary, its means across times are the same, so $Ez_{t}=\gamma +\alpha Ez_{t}$ and $Ez_{t}=\frac{\gamma }{1-\alpha }.$

### Part (b)

With $\alpha =0$ we get $z_{t}=\gamma +\varepsilon _{t}+\theta\varepsilon _{t-1}$ and from part (a) $Ez_{t}=\gamma .$ Using (2), we find variance

$Var(z_{t})=E(z_{t}-Ez_{t})^{2}=E(\varepsilon _{t}^{2}+2\theta \varepsilon_{t}\varepsilon _{t-1}+\theta ^{2}\varepsilon _{t-1}^{2})=(1+\theta^{2})\sigma ^{2}$

and first autocovariance

(5) $\gamma_{1}=Cov(z_{t},z_{t-1})=E(z_{t}-Ez_{t})(z_{t-1}-Ez_{t-1})=E(\varepsilon_{t}+\theta \varepsilon _{t-1})(\varepsilon _{t-1}+\theta \varepsilon_{t-2})=\theta E\varepsilon _{t-1}^{2}=\theta \sigma ^{2}.$

Second and higher autocovariances are zero because the subscripts of epsilons don't overlap.

Autocorrelation function: $\rho _{0}=\frac{Cov(z_{t},z_{t})}{\sqrt{Var(z_{t})Var(z_{t})}}=1$ (this is always true),

$\rho _{1}=\frac{Cov(z_{t},z_{t-1})}{\sqrt{Var(z_{t})Var(z_{t-1})}}=\frac{\theta \sigma ^{2}}{(1+\theta ^{2})\sigma ^{2}}=\frac{\theta }{1+\theta ^{2}},$ $\rho _{j}=0$ for $j>1.$

This is characteristic of MA processes: their autocorrelations are zero starting from some point.
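The cutoff is easy to see numerically. Here is a minimal sketch in Python: the autocovariance of a moving average $\sum_{j}w_{j}\varepsilon _{t-j}$ at a given lag is $\sigma ^{2}$ times the sum of products of weights that share an epsilon, so it vanishes as soon as the shifted weight lists stop overlapping. The value of `THETA` is hypothetical and $\sigma ^{2}$ is set to 1.

```python
def ma_autocovariance(weights, lag, sigma2=1.0):
    """Autocovariance at the given lag of sum_j weights[j] * eps_{t-j}."""
    # Only products of weights attached to the same epsilon survive
    return sigma2 * sum(a * b for a, b in zip(weights, weights[lag:]))

THETA = 0.6             # hypothetical value of theta
weights = [1.0, THETA]  # z_t - gamma = eps_t + theta * eps_{t-1}

autocov = [ma_autocovariance(weights, k) for k in range(4)]
rho = [g / autocov[0] for g in autocov]
print(rho)
```

The first two entries of `rho` are $1$ and $\theta /(1+\theta ^{2}),$ and the remaining ones are zero.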

### Part (c)

If we replace all expectations in the definition of variance by conditional expectations, we obtain the definition of conditional variance. From (1) and (4)

$Var_{t-1}(z_{t})=E_{t-1}(z_{t}-E_{t-1}z_{t})^{2}=E_{t-1}\varepsilon_{t}^{2}=\sigma ^{2}.$

By the law of total variance

(6) $Var(z_{t})=EVar_{t-1}(z_{t})+Var(E_{t-1}z_{t})=\sigma ^{2}+Var(\gamma+\alpha z_{t-1}+\theta \varepsilon _{t-1})=$

(an additive constant does not affect variance)

$=\sigma ^{2}+Var(\alpha z_{t-1}+\theta \varepsilon _{t-1})=\sigma^{2}+\alpha ^{2}Var(z_{t})+2\alpha \theta Cov(z_{t-1},\varepsilon_{t-1})+\theta ^{2}Var(\varepsilon _{t-1}).$

By the LIE and (3)

$Cov(z_{t-1},\varepsilon _{t-1})=Cov(\gamma +\alpha z_{t-2}+\varepsilon _{t-1}+\theta \varepsilon _{t-2},\varepsilon _{t-1})=\alpha Cov(z_{t-2},\varepsilon _{t-1})+E\varepsilon _{t-1}^{2}+\theta EE_{t-2}\varepsilon _{t-2}\varepsilon _{t-1}=\sigma ^{2}+\theta E(\varepsilon _{t-2}E_{t-2}\varepsilon _{t-1}).$

Here $E_{t-2}\varepsilon _{t-1}=0,$ so

(7) $Cov(z_{t-1},\varepsilon _{t-1})=\sigma ^{2}.$

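Expanding the variance of (1) directly and using (3), (7) and the white noise properties gives the same equation as (6):

$Var(z_{t})=Var(\gamma +\alpha z_{t-1}+\varepsilon _{t}+\theta \varepsilon _{t-1})=\alpha ^{2}Var(z_{t-1})+Var(\varepsilon _{t})+\theta ^{2}Var(\varepsilon _{t-1})+$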

$+2\alpha Cov(z_{t-1},\varepsilon _{t})+2\alpha \theta Cov(z_{t-1},\varepsilon _{t-1})+2\theta Cov(\varepsilon _{t},\varepsilon _{t-1})=\alpha ^{2}Var(z_{t})+\sigma ^{2}+\theta ^{2}\sigma ^{2}+2\alpha \theta \sigma ^{2}$

and, finally,

(8) $Var(z_{t})=\frac{(1+2\alpha \theta +\theta ^{2})\sigma ^{2}}{1-\alpha ^{2}}.$

### Part (d)

From (7)

(9) $Cov(z_{t-1},\varepsilon _{t-2})=Cov(\gamma +\alpha z_{t-2}+\varepsilon _{t-1}+\theta \varepsilon _{t-2},\varepsilon _{t-2})=\alpha Cov(z_{t-2},\varepsilon _{t-2})+\theta Var(\varepsilon _{t-2})=(\alpha +\theta )\sigma ^{2}.$

It follows that

$Cov(z_{t},z_{t-1})=Cov(\gamma +\alpha z_{t-1}+\varepsilon _{t}+\theta \varepsilon _{t-1},\gamma +\alpha z_{t-2}+\varepsilon _{t-1}+\theta \varepsilon _{t-2})=$

(a constant is not correlated with anything)

$=\alpha ^{2}Cov(z_{t-1},z_{t-2})+\alpha Cov(z_{t-1},\varepsilon _{t-1})+\alpha \theta Cov(z_{t-1},\varepsilon _{t-2})+$

$+\alpha Cov(\varepsilon _{t},z_{t-2})+Cov(\varepsilon _{t},\varepsilon _{t-1})+\theta Cov(\varepsilon _{t},\varepsilon _{t-2})+$

$+\theta \alpha Cov(\varepsilon _{t-1},z_{t-2})+\theta Var(\varepsilon _{t-1})+\theta ^{2}Cov(\varepsilon _{t-1},\varepsilon _{t-2}).$

From (7) $Cov(z_{t-2},\varepsilon _{t-2})=\sigma ^{2}$ and from (9) $Cov(z_{t-1},\varepsilon _{t-2})=(\alpha +\theta )\sigma ^{2}.$

From (3) $Cov(\varepsilon _{t},z_{t-2})=Cov(\varepsilon _{t-1},z_{t-2})=0.$

Using also the white noise properties and stationarity of $z_{t}$

$Cov(z_{t},z_{t-1})=Cov(z_{t-1},z_{t-2})=\gamma _{1},$

we are left with

$\gamma _{1}=\alpha ^{2}\gamma _{1}+\alpha \sigma ^{2}+\alpha \theta (\alpha +\theta )\sigma ^{2}+\theta \sigma ^{2}=\alpha ^{2}\gamma _{1}+(1+\alpha \theta )(\alpha +\theta )\sigma ^{2}.$

Hence,

$\gamma _{1}=\frac{(1+\alpha \theta )(\alpha +\theta )\sigma ^{2}}{1-\alpha ^{2}}$

and using (8)

$\rho _{0}=1,$ $\rho _{1}=\frac{(1+\alpha \theta )(\alpha +\theta )}{ 1+2\alpha \theta +\theta ^{2}}.$

The finish is close.

$Cov(z_{t},z_{t-2})=Cov(\gamma +\alpha z_{t-1}+\varepsilon _{t}+\theta \varepsilon _{t-1},\gamma +\alpha z_{t-3}+\varepsilon _{t-2}+\theta \varepsilon _{t-3})=$

$=\alpha ^{2}Cov(z_{t-1},z_{t-3})+\alpha Cov(z_{t-1},\varepsilon _{t-2})+\alpha \theta Cov(z_{t-1},\varepsilon _{t-3})+$

$+\alpha Cov(\varepsilon _{t},z_{t-3})+Cov(\varepsilon _{t},\varepsilon _{t-2})+\theta Cov(\varepsilon _{t},\varepsilon _{t-3})+$

$+\theta \alpha Cov(\varepsilon _{t-1},z_{t-3})+\theta Cov(\varepsilon _{t-1},\varepsilon _{t-2})+\theta ^{2}Cov(\varepsilon _{t-1},\varepsilon _{t-3}).$

This simplifies to

(10) $Cov(z_{t},z_{t-2})=\alpha ^{2}Cov(z_{t-1},z_{t-3})+\alpha (\alpha +\theta )\sigma ^{2}+\alpha \theta Cov(z_{t-1},\varepsilon _{t-3}).$

By (7) and the white noise properties

$Cov(z_{t-1},\varepsilon _{t-3})=Cov(\gamma +\alpha z_{t-2}+\varepsilon _{t-1}+\theta \varepsilon _{t-2},\varepsilon _{t-3})=\alpha Cov(z_{t-2},\varepsilon _{t-3})=$

$=\alpha Cov(\gamma +\alpha z_{t-3}+\varepsilon _{t-2}+\theta \varepsilon _{t-3},\varepsilon _{t-3})=\alpha \left( \alpha Cov(z_{t-3},\varepsilon _{t-3})+\theta Var(\varepsilon _{t-3})\right) =\alpha (\alpha +\theta )\sigma ^{2}.$

Finally, using (10) and stationarity ($Cov(z_{t},z_{t-2})=Cov(z_{t-1},z_{t-3})=\gamma _{2}$)

$\gamma _{2}=\alpha ^{2}\gamma _{2}+\alpha (\alpha +\theta )\sigma ^{2}+\alpha ^{2}\theta (\alpha +\theta )\sigma ^{2}=\alpha ^{2}\gamma _{2}+\alpha (1+\alpha \theta )(\alpha +\theta )\sigma ^{2},$

$\gamma _{2}=\frac{\alpha (1+\alpha \theta )(\alpha +\theta )\sigma ^{2}}{1-\alpha ^{2}}=\alpha \gamma _{1},$

$\rho _{2}=\frac{\alpha (1+\alpha \theta )(\alpha +\theta )}{1+2\alpha \theta +\theta ^{2}}=\alpha \rho _{1}.$
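A numerical cross-check is useful after this much algebra. The sketch below, in Python, does not follow the derivation above. It uses a different, standard route: the MA($\infty$) representation $z_{t}-Ez_{t}=\sum_{j\geq 0}\psi _{j}\varepsilon _{t-j}$ with $\psi _{0}=1$ and $\psi _{j}=\alpha ^{j-1}(\alpha +\theta )$ for $j\geq 1,$ truncated after 500 terms. The parameter values are hypothetical. It compares the resulting autocovariances with (8), with the formula for $\gamma _{1},$ and with $\alpha \gamma _{1}$ at lag 2.

```python
def psi_weights(alpha, theta, n=500):
    """First n weights of the MA(infinity) representation of the ARMA(1,1) process."""
    return [1.0] + [alpha ** (j - 1) * (alpha + theta) for j in range(1, n)]

def autocov(weights, lag, sigma2=1.0):
    """Autocovariance at the given lag of sum_j weights[j] * eps_{t-j}."""
    return sigma2 * sum(a * b for a, b in zip(weights, weights[lag:]))

ALPHA, THETA, SIGMA2 = 0.5, 0.3, 1.0  # hypothetical parameter values

psi = psi_weights(ALPHA, THETA)
g0, g1, g2 = (autocov(psi, k, SIGMA2) for k in range(3))

# Closed-form expressions: equation (8) and the formula for gamma_1
var_formula = (1 + 2 * ALPHA * THETA + THETA ** 2) * SIGMA2 / (1 - ALPHA ** 2)
g1_formula = (1 + ALPHA * THETA) * (ALPHA + THETA) * SIGMA2 / (1 - ALPHA ** 2)

print(g0, var_formula)
print(g1, g1_formula)
print(g2, ALPHA * g1)
```

Each printed pair should agree up to truncation and rounding error; the last pair reflects the fact that for an ARMA(1,1) process the autocovariances decay geometrically at rate $\alpha$ from lag 2 onward.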

A couple of errors have been corrected on June 22, 2021. Hope this is final.

20
Apr 21

This post parallels the one about the call debit spread. A combination of several options in one trade is called a strategy. Here we discuss a strategy called a put debit spread. The word "debit" in this name means that a trader has to pay for it. The rule of thumb is that if it is a debit (you pay for a strategy), then it is less risky than if it is a credit (you are paid). Let $p(K)$ denote the price of the put with the strike $K,$ suppressing all other variables that influence the put price.

Assumption. The market assigns a higher value to events of higher probability. This is true if investors are rational and the market correctly reconciles views of different investors.

We need the following property: if $K_{1}<K_{2}$ are two strike prices, then for the corresponding put prices (with the same expiration and underlying asset) one has $p(K_{1})<p(K_{2}).$

Proof. A put price is higher if the probability of it being in the money at expiration is higher. Let $S(T)$ be the stock price at expiration $T.$ Since $T$ is a moment in the future, $S(T)$ is a random variable. For a given strike $K,$ the put is said to be in the money at expiration if $S(T)<K.$ If $K_{1}<K_{2}$ and $S(T)<K_{1},$ then $S(T)<K_{2}.$ It follows that the set $\{ S(T)<K_{1}\}$ is a subset of the set $\{S(T)<K_{2}\} .$ Hence the probability of the event $\{S(T)<K_{2}\}$ is higher than that of the event $\{S(T)<K_{1}\}$ and $p(K_{2})>p(K_{1}).$

Put debit spread strategy. Select two strikes $K_{1}<K_{2},$ buy $p(K_{2})$ (take a long position) and sell $p(K_{1})$ (take a short position). You pay $p=p(K_{2})-p(K_{1})>0$ for this.

Our purpose is to derive the payoff for this strategy. We remember that if $S(T)\ge K,$ then the put $p(K)$ expires worthless.

Case $S(T)\ge K_{2}.$ In this case both options expire worthless and the payoff is minus the initial outlay: payoff $=-p.$

Case $K_{1}\leq S(T)<K_{2}.$ Exercising the put $p(K_{2})$, in comparison with selling the stock at the market price you gain $K_{2}-S(T).$ The second option expires worthless. The payoff is: payoff $=K_{2}-S(T)-p.$

Case $S(T)<K_{1}.$ Both options are exercised. The gain from $p(K_{2})$ is, as above, $K_{2}-S(T).$ The holder of the long put $p(K_{1})$ sells you stock at price $K_{1}.$ Since your position is short, you have nothing to do but comply. The alternative would be to buy at the market price, so your payoff from this leg is $S(T)-K_{1}<0.$ The payoff is: payoff $=\left(K_{2}-S(T)\right) +\left( S(T)-K_{1}\right) -p=K_{2}-K_{1}-p.$

Summarizing, we get:

payoff $=\left\{\begin{array}{ll} -p, & K_2\le S(T) \\ K_{2}-S(T)-p, & K_{1}\leq S(T)<K_{2} \\ K_{2}-K_{1}-p, & S(T)<K_{1} \end{array}\right.$

Normally, the strikes are chosen so that $K_{2}-K_{1}>p.$ From the payoff expression we see then that the maximum profit is $K_{2}-K_{1}-p>0,$ the maximum loss is $-p$ and the breakeven stock price is $S(T)=K_{2}-p.$ This is illustrated in Figure 1, where the stock price at expiration is on the horizontal axis.

Figure 1. Payoff from put debit spread. Source: https://www.optionsbro.com/

Conclusion. For the strategy to be profitable, the price at expiration should satisfy $S(T)< K_{2}-p.$ Buying a put debit spread is appropriate when the price is expected to stay in that range.
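The three cases can be checked with a few lines of code. This is a minimal sketch in Python that writes the payoff through $\max$ functions (long put minus short put minus the debit) instead of enumerating cases; the strikes and the debit `P` are hypothetical numbers chosen so that $K_{2}-K_{1}>p.$

```python
def put_debit_spread_payoff(s_t, k1, k2, p):
    """Payoff at expiration: long put at k2, short put at k1 (k1 < k2), net debit p."""
    long_put = max(k2 - s_t, 0.0)     # gain from the put you bought
    short_put = -max(k1 - s_t, 0.0)   # loss on the put you sold
    return long_put + short_put - p

K1, K2 = 380.0, 390.0  # hypothetical strikes with K1 < K2
P = 4.0                # hypothetical net debit

breakeven = K2 - P     # stock price at which the payoff is zero

for s_t in (370.0, 380.0, breakeven, 390.0, 400.0):
    print(s_t, put_debit_spread_payoff(s_t, K1, K2, P))
```

With these numbers the payoff is $K_{2}-K_{1}-p=6$ below $380,$ zero at the breakeven price $386$ and $-p=-4$ from $390$ upward.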

In comparison with the long put position $p(K_{2}),$ taking at the same time the short put position $-p(K_{1})$ allows one to reduce the initial outlay. This is especially important when the stock volatility is high, resulting in a high put price. In the difference $p(K_{2})-p(K_{1})$ that volatility component partially cancels out.

Remark. There is an important issue of choosing the strikes. Let $S$ denote the stock price now. The payoff expression allows us to rank the next choices in the order of increasing risk: 1) $S<K_1<K_2$ (both options are in the money, less risk), 2) $K_1<S<K_2$ and 3) $K_1<K_2<S$ (both options are out of the money, highest risk).  Also remember that a put debit spread is less expensive than buying $p(K_{2})$ and selling $p(K_{1})$ in two separate transactions.

Exercise. Analyze a put credit spread, in which you sell $p(K_{2})$ and buy $p(K_{1})$.

21
Mar 21

A combination of several options in one trade is called a strategy. Here we discuss a strategy called a call debit spread. The word "debit" in this name means that a trader has to pay for it. The rule of thumb is that if it is a debit (you pay for a strategy), then it is less risky than if it is a credit (you are paid). Let $c(K)$ denote the call price with the strike $K,$ suppressing all other variables that influence the call price.

Assumption. The market assigns a higher value to events of higher probability. This is true if investors are rational and the market correctly reconciles views of different investors.

We need the following property: if $K_{1}<K_{2}$ are two strike prices, then for the corresponding call prices (with the same expiration and underlying asset) one has $c(K_{1})>c(K_{2}).$

Proof.  A call price is higher if the probability of it being in the money at expiration is higher. Let $S(T)$ be the stock price at expiration $T.$ Since $T$ is a moment in the future, $S(T)$ is a random variable. For a given strike $K,$ the call is said to be in the money at expiration if $S(T)>K.$ If $K_{1}<K_{2}$ and $S(T)>K_{2},$ then $S(T)>K_{1}.$ It follows that the set $\{ S(T)>K_{2}\}$ is a subset of the set $\{S(T)>K_{1}\} .$ Hence the probability of the event $\{S(T)>K_{2}\}$ is lower than that of the event $\{S(T)>K_{1}\}$ and $c(K_{1})>c(K_{2}).$

Call debit spread strategy. Select two strikes $K_{1}<K_{2},$ buy $c(K_{1})$ (take a long position) and sell $c(K_{2})$ (take a short position). You pay $p=c(K_{1})-c(K_{2})>0$ for this.

Our purpose is to derive the payoff for this strategy. We remember that if $S(T)\leq K,$ then the call $c(K)$ expires worthless.

Case $S(T)\leq K_{1}.$ In this case both options expire worthless and the payoff is the initial outlay: payoff $=-p.$

Case $K_{1}<S(T)\leq K_{2}.$ Exercising the call $c(K_{1})$ and immediately selling the stock at the market price you gain $S(T)-K_{1}.$ The second option expires worthless. The payoff is: payoff $=S(T)-K_{1}-p.$ (In fact, you are assigned stock and selling it is up to you).

Case $K_{2}<S(T).$ Both options are exercised. The gain from $c(K_{1})$ is, as above, $S(T)-K_{1}.$ The holder of the long call $c(K_{2})$ buys from you at price $K_{2}.$ Since your position is short, you have nothing to do but comply. You buy at $S(T)$ and sell at $K_{2}.$ Thus the gain from $-c(K_{2})$ is $K_{2}-S(T)<0,$ that is, a loss of $S(T)-K_{2}.$ The payoff is: payoff $=\left(S(T)-K_{1}\right) +\left( K_{2}-S(T)\right) -p=K_{2}-K_{1}-p.$

Summarizing, we get:

payoff $=\left\{\begin{array}{ll} -p, & S(T)\leq K_{1} \\ S(T)-K_{1}-p, & K_{1}<S(T)\leq K_{2} \\ K_{2}-K_{1}-p, & K_{2}<S(T) \end{array}\right.$

Normally, the strikes are chosen so that $K_{2}-K_{1}>p.$ From the payoff expression we see then that the maximum profit is $K_{2}-K_{1}-p>0,$ the maximum loss is $-p$ and the breakeven stock price is $S(T)=K_{1}+p.$ This is illustrated in Figure 1, where the stock price at expiration is on the horizontal axis.

Figure 1. Payoff for call debit strategy. Source: https://www.optionsbro.com/

Conclusion. For the strategy to be profitable, the price at expiration should satisfy $S(T)> K_{1}+p.$ Buying a call debit spread is appropriate when the price is expected to stay in that range.
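The maximum profit, maximum loss and breakeven price can be confirmed by scanning a grid of expiration prices. This is a minimal Python sketch, assuming hypothetical strikes $K_1=100,$ $K_2=110$ and a hypothetical debit $p=3$; the function names are illustrative.

```python
def call_debit_spread_payoff(s_T, k1, k2, p):
    """Payoff at expiration of long call(k1) + short call(k2), net of the debit p."""
    return max(s_T - k1, 0.0) - max(s_T - k2, 0.0) - p

def spread_summary(k1, k2, p):
    """Maximum profit, maximum loss and breakeven price implied by the payoff table."""
    return {"max_profit": k2 - k1 - p, "max_loss": -p, "breakeven": k1 + p}

# Hypothetical strikes and debit, for illustration only
K1, K2, P = 100.0, 110.0, 3.0
summary = spread_summary(K1, K2, P)

# Scan a grid of expiration prices and compare with the closed-form summary
grid = [50.0 + 0.5 * i for i in range(201)]   # 50.0, 50.5, ..., 150.0
payoffs = [call_debit_spread_payoff(s, K1, K2, P) for s in grid]
print(summary)
print(max(payoffs), min(payoffs))
```

The largest and smallest payoffs on the grid coincide with $K_2-K_1-p$ and $-p,$ and the payoff changes sign at $K_1+p.$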

In comparison with the long call position $c(K_{1}),$ taking at the same time the short call position $-c(K_{2})$ allows one to reduce the initial outlay. This is especially important when the stock volatility is high, resulting in a high call price. In the difference $c(K_{1})-c(K_{2})$ that volatility component partially cancels out.

Remark. There is an important issue of choosing the strikes. Let $S$ denote the stock price now. The payoff expression allows us to rank the next choices in the order of increasing risk: 1) $K_1<K_2<S$ (both options are in the money, less risk), 2) $K_1<S<K_2$ and 3) $S<K_1<K_2$ (both options are out of the money, highest risk).  Also remember that a call debit spread is less expensive than buying $c(K_{1})$ and selling $c(K_{2})$ in two separate transactions.

Exercise. Analyze a call credit spread, in which you sell $c(K_{1})$ and buy $c(K_{2})$.

24
Jun 20

## Solution to Question 2 from UoL exam 2018, Zone B

There are three companies, called A, B, and C, and each has a 4% chance of going bankrupt. The event that one of the three companies will go bankrupt is independent of the event that any other company will go bankrupt.

Company A has outstanding bonds, and a bond will have a net return of $r = 0\%$ if the corporation does not go bankrupt, but it will have a net return of $r = -100\%$, i.e., losing everything invested, if it goes bankrupt. Suppose an investor buys $1000 worth of bonds of company A, which we will refer to as portfolio ${P_1}$. Suppose also that there exists a security whose payout depends on the bankruptcy of companies B and C in a joint fashion. In particular, if neither B nor C goes bankrupt, this derivative will have a net return of $r = 0\%$. If exactly one of B or C goes bankrupt, it will have a net return of $r = -50\%$, i.e., losing half of the investment. If both B and C go bankrupt, it will have a net return of $r = -100\%$, i.e., losing the whole investment. Suppose an investor buys $1000 worth of this derivative, which is then called portfolio ${P_2}$.

(a) Calculate the VaR at the $\alpha = 10\%$ critical level for portfolios $P_1$ and ${P_2}$. [30 marks]

Independence of events. Denote by $A,{A^c}$ the events that company A goes bankrupt and does not go bankrupt, resp. A similar notation will be used for the other two companies. The simple definition of independence of bankruptcy events $P(A \cap B) = P(A)P(B)$ would be too difficult to apply to prove independence of all events that we need. A general definition of independence of variables is that their sigma-fields are independent (it will not be explained here). This general definition implies that in all cases below we can use multiplicativity of probability such as

$P(B \cap C) = P(B)P(C) = {0.04^2} = 0.0016,\,\,P({B^c} \cap {C^c}) = {0.96^2} = 0.9216,$ $P((B \cap {C^c}) \cup ({B^c} \cap C)) = P(B \cap {C^c}) + P({B^c} \cap C) = 2 \times 0.04 \times 0.96 = 0.0768.$

The events here have a simple interpretation: the first is that “both B and C fail”, the second is that “neither B nor C fails”, and the third is that “either (B fails and C does not) or (B does not fail and C does)” (they do not intersect and additivity of probability applies).

Let ${r_A},{r_S}$ be returns on A and the security S, resp. From the problem statement it follows that these returns are described by the tables
Table 1

| ${r_A}$ | Prob |
|---|---|
| 0 | 0.96 |
| -100 | 0.04 |

Table 2

| ${r_S}$ | Prob |
|---|---|
| 0 | 0.9216 |
| -50 | 0.0768 |
| -100 | 0.0016 |

Everywhere we will be working with percentages, so the dollar values don’t matter.

From Table 1 we conclude that the distribution function of return on A looks as follows:

Figure 1. Distribution function of portfolio A

At $x=-100$ the function jumps up by 0.04, at $x=0$ by another 0.96. The dashed line at $y=0.1$ is used in the definition of the VaR using the generalized inverse:

$VaR_A^{0.1} = \inf \{ {x:{F_A}(x) \ge 0.1}\} = 0.$

From Table 2 we see that the distribution function of return on S looks like this:

The first jump is at $x=-100$, the second at $x=-50$ and the third one at $x=0$. As above, it follows that

$VaR_S^{0.1} = \inf\{ {x:{F_S}(x) \ge 0.1}\} = 0.$
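The generalized inverse used in the two VaR formulas is easy to code for a discrete distribution. Below is a minimal Python sketch applied to Tables 1 and 2; the function name `var_generalized_inverse` is my own label, not something from the exam or the Guide.

```python
def var_generalized_inverse(dist, alpha):
    """Return inf{x : F(x) >= alpha} for a discrete distribution.

    dist maps each possible return (in percent) to its probability.
    """
    cumulative = 0.0
    for x in sorted(dist):            # walk the support from the worst return up
        cumulative += dist[x]
        if cumulative >= alpha:
            return x
    raise ValueError("alpha exceeds total probability")

# Table 1 and Table 2 from the text
r_A = {0: 0.96, -100: 0.04}
r_S = {0: 0.96**2, -50: 2 * 0.04 * 0.96, -100: 0.04**2}

print(var_generalized_inverse(r_A, 0.10))
print(var_generalized_inverse(r_S, 0.10))
```

In both cases the cumulative probability of the negative returns stays below 0.1 (0.04 for A and 0.0784 for S), so the function only stops at $x=0.$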

(b) Calculate the VaR at the $\alpha=10\%$ critical level for the joint portfolio ${P_1} + {P_2}$. [20 marks]

To find the return distribution for $P_1 + P_2$, we have to consider all pairs of events from Tables 1 and 2 using independence.

1. $P({r_A}=0,{r_S}=0)=0.96\times 0.9216=0.884736$

2. $P({r_A}=-100,{r_S}=0)=0.04\times 0.9216=0.036864$

3. $P({r_A}=0,{r_S}=-50)=0.96\times 0.0768=0.073728$

4. $P({r_A}=-100,{r_S}=-50)=0.04\times 0.0768=0.003072$

5. $P({r_A}=0,{r_S}=-100)=0.96\times 0.0016=0.001536$

6. $P({r_A}=-100,{r_S}=-100)=0.04\times 0.0016=0.000064$

Since we deal with a joint portfolio, percentages for separate portfolios should be translated into ones for the whole portfolio. For example, the loss of 100% on one portfolio and 0% on the other means 50% on the joint portfolio (investments are equal). There are two such losses, in lines 2 and 5, so the probabilities should be added. Thus, we obtain the table for the return $r$ on the joint portfolio:

Table 3

| $r$ | Prob |
|---|---|
| 0 | 0.884736 |
| -25 | 0.073728 |
| -50 | 0.0384 |
| -75 | 0.003072 |
| -100 | 0.000064 |
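The six products and the merging of equal returns can be reproduced mechanically. A minimal Python sketch, assuming equal investments in the two portfolios (as in the problem statement), so that the joint return is the simple average of the two returns:

```python
from collections import defaultdict
from itertools import product

r_A = {0: 0.96, -100: 0.04}                      # Table 1
r_S = {0: 0.9216, -50: 0.0768, -100: 0.0016}     # Table 2

joint = defaultdict(float)
for (a, p_a), (s, p_s) in product(r_A.items(), r_S.items()):
    # equal investments, so the joint return is the simple average
    joint[(a + s) / 2] += p_a * p_s              # independence: probabilities multiply

for r in sorted(joint, reverse=True):
    print(r, round(joint[r], 6))
```

The two pairs that give a return of -50 are accumulated under the same key, which is the addition of probabilities described above.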

Here the relevant quantities are the cumulative probabilities, not the individual ones: ${F_r}(-100) = 0.000064,$ ${F_r}(-75) = 0.003136,$ ${F_r}(-50) = 0.041536$ and ${F_r}(-25) = 0.115264.$ The distribution function first reaches 0.1 at $x=-25,$ so the definition of the generalized inverse gives

$VaR_r^{0.1} = \inf \{ {x:{F_r}(x) \ge 0.1}\} = -25.$

(c) Is VaR sub-additive in this example? Explain why the absence of sub-additivity may be a concern for risk managers. [20 marks]

To check sub-additivity, we need to pass to positive numbers, as explained in other posts. The VaRs of the two parts are zero, while the VaR of the joint portfolio is 25, so the inequality $25 \le 0 + 0$ is false and sub-additivity fails in this example. Lack of sub-additivity is an undesirable property for risk managers, because for them keeping the VaR at low levels for portfolio parts doesn’t mean having low VaR for the whole portfolio.

(d) The expected shortfall $E{S^\alpha }$ at the $\alpha$ critical level can be defined as

$ES^\alpha= - E_t[R|R < - VaR_{t + 1}^\alpha],$

where $R$ is a return or dollar amount. Calculate the expected shortfall at the $\alpha = 10\%$ critical level for portfolio $P_2$. Is this risk measure sub-additive? [30 marks]

Using the definition of conditional expectation and Table 2 for portfolio $P_2$, we have (the time subscript can be omitted because the problem is static). Since $VaR_S^{0.1}=0,$ the conditioning event is $\{ {r_S}<0\}$:
$ES^{0.1}=-E[{r_S}|{r_S}<0]$
$=-\frac{-50\times 0.0768-100\times 0.0016}{0.0768+0.0016}=\frac{4}{0.0784}=51.0204.$

There is a theoretical property that the expected shortfall is sub-additive.

22
Jun 20

## Solution to Question 2 from UoL exam 2019, zone B

Suppose the parameters in a GARCH (1,1) model

$\sigma _{t + 1}^2 = \omega + \beta \sigma _t^2 + \alpha \varepsilon _t^2$   (1)

are $\omega = 0.000004,\ \alpha = 0.06,\ \beta = 0.93$, the index $t$ refers to days and ${\varepsilon _t}$ is zero-mean white noise with conditional variance $\sigma _t^2$.

(a) What are the requirements for this process to be covariance stationary, and are they satisfied here? [20 marks]

If the coefficients satisfy the condition for positivity, $\omega>0,\ \alpha,\beta\ge0$, then the condition for covariance-stationarity is $\alpha + \beta < 1$. Here $\alpha + \beta = 0.99,$ so the conditions are satisfied, but only barely.

(b) What is the long-run average volatility? [20 marks]

We use the facts that ${\sigma ^2} = E\sigma _{t + 1}^2 = E\left[ {E(\varepsilon _{t + 1}^2|{F_t})} \right]$ for all t. Applying the unconditional mean to regression (1) and using the LIE we get

${\sigma ^2}=E\sigma _{t+1}^2=E\left[{\omega+\beta\sigma _t^2+\alpha\varepsilon _t^2}\right]=\omega+\beta{\sigma^2}+\alpha{\sigma^2}$

and

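${\sigma^2}=\frac{\omega }{{1-\alpha-\beta}}=\frac{{0.000004}}{{1-0.06-0.93}}=0.0004$. This is the long-run variance; the long-run average volatility is its square root, $\sigma=\sqrt{0.0004}=0.02,$ that is, 2% per day.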

(c) If the current volatility is 2.5% per day, what is your estimate of the volatility in 20, 40, and 60 days? [20 marks]

On p.107 of the Guide there is the derivation of the equation

$\sigma _{t + h,t}^2 = \sigma _y^2 + {(\alpha + \beta )^{h - 1}}(\sigma _{t + 1,t}^2 - \sigma _y^2),\,\,h \ge 1.$    (2)

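I gave you a slightly easier derivation in my class, please use that one. Equation (2) is written for variances, while the question quotes the volatility (standard deviation), so the current variance is $\sigma_{t+1}^2=0.025^2=0.000625.$ If we interpret "current" as $t+1$ and "in twenty days" as $t+1+20$, then

$\sigma _{t+21}^2=\sigma^2+(\alpha + \beta )^{20}(\sigma _{t+1}^2-\sigma^2)$ $= 0.0004+\exp\left[ 20\ln(0.06+0.93)\right](0.000625-0.0004) = 0.000584,$

so the volatility forecast is $\sqrt{0.000584}=0.02417,$ or about 2.42% per day.

For $h=41,61$ use the same formula with exponents 40 and 60 to get variances 0.000551 and 0.000523, that is, volatilities of about 2.35% and 2.29% per day. I did it in Excel and don't envy you if you have to do it during an exam.

(d) Suppose that there is an event that decreases the current volatility by 1.5% to 1% per day. Estimate the effect on the volatility in 20, 40, and 60 days. [20 marks]

Calculations are the same, just replace $0.025^2$ by $0.01^2=0.0001$. Alternatively, one can see that the previous variance forecasts will go down by $\exp[(h-1)\ln(0.06+0.93)](0.000625-0.0001)$, which results in variances 0.000155, 0.000199 and 0.000236, that is, volatilities of about 1.24%, 1.41% and 1.54% per day.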
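The forecasts in parts (c) and (d) can be reproduced outside Excel. Below is a minimal Python sketch of equation (2), under the assumption that the quoted volatility is a daily standard deviation, so that its square is the variance entering the formula. The function names are illustrative.

```python
import math

OMEGA, ALPHA, BETA = 0.000004, 0.06, 0.93      # parameters from the question

def long_run_variance(omega, alpha, beta):
    """Unconditional variance omega / (1 - alpha - beta)."""
    return omega / (1.0 - alpha - beta)

def variance_forecast(current_var, days_ahead, omega, alpha, beta):
    """Equation (2) with h - 1 = days_ahead."""
    v_bar = long_run_variance(omega, alpha, beta)
    return v_bar + (alpha + beta) ** days_ahead * (current_var - v_bar)

for current_vol in (0.025, 0.01):               # parts (c) and (d)
    for days in (20, 40, 60):
        var = variance_forecast(current_vol ** 2, days, OMEGA, ALPHA, BETA)
        print(current_vol, days, round(var, 6), round(math.sqrt(var), 5))
```

In both scenarios the forecasts move toward the long-run level of 2% per day as the horizon grows: from above when the current volatility is 2.5% and from below when it is 1%.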

(e) Explain what volatility should be used to price 20-, 40- and 60-day options, and explain how you would calculate the values. [20 marks]

The only unobservable input to the Black-Scholes option pricing formula is the stock price volatility. In the derivation of the formula the volatility is assumed to be constant. The value of the constant should depend on the forecast horizon. If we, say, forecast 20 days ahead, we should use a constant value for all 20 days. This constant can be obtained as an average of daily forecasts obtained from the GARCH model.

If the GARCH is not used, a simpler approach is applied. If the average daily volatility is ${\sigma _d}$, then assuming independent returns, over a period of $n$ days volatility is ${\sigma _{nd}} = \sqrt n {\sigma _d}$.

In practice, traders go back from option prices to volatility. That is, they use observed option prices to solve the Black-Scholes formula for volatility (find the root of an equation with the price given). The resulting value is called implied volatility. If it is plugged back into the Black-Scholes formula, the observed option price will result.
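Backing out implied volatility is a one-dimensional root search. Below is a minimal Python sketch for a European call on a non-dividend-paying stock, using bisection (my choice for simplicity; any root finder would do). All inputs are hypothetical: the "observed" price is generated from the formula with $\sigma=0.30$ and then inverted.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s, k, r, sigma, t):
    """Black-Scholes price of a European call on a non-dividend-paying stock."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)

def implied_vol(price, s, k, r, t, lo=1e-6, hi=5.0, tol=1e-10):
    """Bisection on sigma; works because the call price is increasing in sigma."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(s, k, r, mid, t) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical inputs: price an option at sigma = 0.30, then recover sigma
S0, K, R, T = 100.0, 105.0, 0.02, 60 / 252
observed = bs_call(S0, K, R, 0.30, T)
print(implied_vol(observed, S0, K, R, T))
```

Plugging the recovered value back into `bs_call` returns the observed price, which is exactly the defining property of implied volatility stated above.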

21
Jun 20

## Solution to Question 3 from UoL exam 2019, zone A

(a) Define the concept of trade duration in financial markets and explain briefly why this concept is economically useful. What features do trade durations typically exhibit and how can we model these features? [25 marks]

High frequency traders (HFT) may trade every millisecond. Orders from traders arrive at random moments and therefore the trade times are not evenly spaced. It makes sense to model the differences

${x_j} = TIME_j - TIME_{j - 1}$

between transaction times. (The Guide talks about differences between times of returns but I don’t like this because on small time frames people are interested in prices, not returns.) Those differences are called durations. They are economically interesting because 1) they tell us something about liquidity: periods of intense trading are generally periods of greater market liquidity than periods of sparse trading (there is also after-hours trading between 16:00 and 20:30, New York time, when trading may be intense but liquidity is low) and 2) durations relate directly to news arrivals and the adjustment of prices to news, and so have some use in discussions of market efficiency.

The trading session in the USA is from 9:30 to 16:00, New York time. Durations exhibit diurnality (that is, intraday seasonality): transactions are more frequent (durations are shorter) in the first and last hour of the trading session and less frequent around lunch, see Figure 16.6 from the Guide.

Higher frequency in the first hour results from traders rebalancing their portfolios after overnight news and in the last hour – from anticipation of news during the next night.

The main decomposition of durations is

${x_j} = {s_j}x_j^*,$
so $\log {x_j} = \log {s_j} + \log x_j^*,$
$\log {s_j} = \sum\limits_{i = 1}^{13} {\beta _i}{D_{i,j}}$
$\log {s_j} = \gamma _0 + \gamma _1HR{S_j} + \gamma _2HRS_j^2.$

In the first equation ${s_j}$ is the diurnal component and $x_j^*$ is called a de-seasonalized duration (it has not been defined here). The second follows from the first.

I am not sure that you need the third equation. The fourth equation is used below. In the third equation $\log {s_j}$ is regressed on dummies of half-hour periods (there are 13 of them in the trading session; the constant is not included to avoid the dummy trap). In the fourth equation it is regressed on the first and second power of the time variable $HRS_j$, which measures time in hours starting from the previous midnight. This is called a polynomial regression. Both regressions can capture diurnality.
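To make the two specifications concrete, here is a minimal Python sketch of the regressors. It assumes the regular session from 9:30 to 16:00 and measures $HRS_j$ in hours since midnight; the coefficient values are hypothetical, chosen only to give longer durations around lunch, and are not estimates from any data.

```python
import math

SESSION_OPEN, SESSION_CLOSE = 9.5, 16.0      # 9:30 and 16:00 in hours since midnight

def half_hour_dummies(hrs):
    """Return the 13 dummies D_1..D_13 for a trade at time hrs within the session."""
    if not SESSION_OPEN <= hrs < SESSION_CLOSE:
        raise ValueError("trade time outside the regular session")
    index = int((hrs - SESSION_OPEN) / 0.5)  # 0 for 9:30-10:00, ..., 12 for 15:30-16:00
    return [1 if i == index else 0 for i in range(13)]

def quadratic_diurnal(hrs, g0, g1, g2):
    """Diurnal component s_j from log s_j = g0 + g1*HRS_j + g2*HRS_j^2."""
    return math.exp(g0 + g1 * hrs + g2 * hrs ** 2)

# Hypothetical coefficients giving an inverted-U pattern that peaks at 12:45
G0, G1, G2 = -15.0, 2.55, -0.1
for hrs in (9.75, 12.75, 15.75):
    period = half_hour_dummies(hrs).index(1) + 1
    print(hrs, period, round(quadratic_diurnal(hrs, G0, G1, G2), 3))
```

Exactly one dummy equals 1 for each trade, which is why the constant has to be dropped in the dummy regression. The quadratic version needs only three coefficients instead of thirteen.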

(b) Describe the Engle and Russell (1998) autoregressive conditional duration (ACD) model. [25 marks]

Instead of the duration model considered in part (a) Engle and Russell suggest the ACD model

${x_j} = {s_j}x_j^*,$
$x_j^* = {\psi _j}{\varepsilon _j},$
where $\log {s_j} = {\gamma _0} + {\gamma _1}HR{S_j} + {\gamma _2}HRS_j^2,$
$x_j^* = \frac{{x_j}}{{s_j}},\ {\varepsilon _j}|{F_{j-1}}\sim i.i.d.(1),$
$\psi _j = \omega + \beta \psi _{j - 1} + \alpha x_{j - 1}^*,\,\,\omega > 0,\,\,\alpha ,\beta \ge 0$   (1)

The first decomposition is the same as above. The second equation decomposes the de-seasonalized duration into a product of deterministic and stochastic components. To understand the idea, we can compare (1) with the GARCH(1,1) model:

$y_{t + 1} = {\mu _{t + 1}} + {\varepsilon _{t + 1}},$
${\varepsilon _{t + 1}} = {\sigma _{t + 1}}{v_{t + 1}},$
${v_{t + 1}}|{F_t}\sim F(0,1),$
$\sigma _{t + 1}^2 = \omega + \beta \sigma _t^2 + \alpha \varepsilon _t^2.$   (2)

Equations (1) and (2) are similar. The assumptions about the random components are different: in (1) we have $E{\varepsilon _j} = 1$, in (2) $E{v_j} = 0$. This is because in (2) the epsilons are deviations from the mean and may change sign; in (1) the epsilons come from durations and should be positive. To obtain the last equation in (1) from the GARCH(1,1) in (2) one has to make replacements

$\sigma _t^2\sim {\psi _j},\ \varepsilon _t^2 = {({y_t} - {\mu _t})^2}\sim x_{j - 1}^*$. (3)

This is important to know, to understand the comparison of the ML method for the two models below.
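A short simulation helps to see how (1) generates durations. This is a minimal Python sketch for the de-seasonalized part only, assuming that $\varepsilon_j$ is exponential with mean 1 (one admissible choice of a positive distribution with unit mean) and using hypothetical parameter values. The recursion is started at $\omega/(1-\alpha-\beta)$, the unconditional mean of $x_j^*$.

```python
import random

def simulate_acd(n, omega, alpha, beta, seed=0):
    """Simulate n de-seasonalized durations x*_j = psi_j * eps_j from equation (1).

    eps_j are i.i.d. Exponential with mean 1, so every duration is positive.
    """
    rng = random.Random(seed)
    psi = omega / (1.0 - alpha - beta)       # start at the unconditional mean of x*
    x_prev = psi
    durations = []
    for _ in range(n):
        psi = omega + beta * psi + alpha * x_prev
        x_prev = psi * rng.expovariate(1.0)  # expovariate(1.0) has mean 1
        durations.append(x_prev)
    return durations

# Hypothetical parameter values satisfying omega > 0, alpha, beta >= 0, alpha + beta < 1
x_star = simulate_acd(50_000, omega=0.1, alpha=0.1, beta=0.8)
print(min(x_star) > 0, sum(x_star) / len(x_star))
```

The sample mean should be close to $\omega/(1-\alpha-\beta)=1,$ in the same way as the sample variance of GARCH(1,1) residuals is close to the long-run variance.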

(c) Compare the conditions for covariance stationarity, identification and positivity of the duration series for the ACD(1,1) to those for the GARCH(1,1). [25 marks]

Those conditions for GARCH are

Condition 1: $\omega > 0,\ \alpha,\beta \ge 0,$ for positive variance,

Condition 2: $\beta = 0$ if $\alpha = 0,$ for identification,

Condition 3: $\alpha + \beta < 1,$ for covariance stationarity.

For ACD they are the same, because both are essentially ARMA models.

(d) Illustrate the relationship between the log-likelihood of the ACD(1,1) model and the estimation of a GARCH(1,1) model using the normal likelihood function. [25 marks]

Because of the assumption $E{\varepsilon _j} = 1$ in (1) we cannot use the normal distribution for (1). Instead the exponential random variable is used. It takes only positive values; its density is zero on the left half-axis and is an exponential function on the right half-axis:

$Z\sim Exponential(\gamma ),$

so $f(z|\gamma ) = \frac{1}{\gamma }\exp\left(-\frac{z}{\gamma } \right)$ and $EZ = \gamma .$

Here $\gamma$ is a positive number and $f$ is the density. We take $\gamma= 1$ as required by the ACD model. This implies $Ex_j^* = {\psi _j}E{\varepsilon _j} = {\psi _j}$ so $x_j^*$ is distributed as $Exponential({\psi _j}).$ Its density is

$f(x_j^*|\psi _j)=f(x_j^*|\psi _j(\omega,\beta,\alpha))=\frac{1}{\psi _j}\exp\left(-\frac{x_j^*}{\psi _j}\right).$

The rest is logical: plug $\psi_j$ from the ACD model (1), then take log and then add those logs to obtain the log-likelihood. A. Patton gives the log-likelihood for GARCH, whose derivation I could not find in the book. But from (3) we know that there should be similarity after replacement $\sigma _t^2\sim{\psi _j},\ x_{j - 1}^*\sim{({r_t} - {\mu _t})^2}$. To this Patton adds that the GARCH likelihood is simply a linear transformation of the ACD likelihood.
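The linear relationship between the two likelihoods can be verified numerically. Below is a minimal Python sketch in which squared GARCH residuals play the role of de-seasonalized durations. The start-up value for $\psi_1$ and $\sigma_1^2$ (the unconditional mean), the residuals and the parameter values are all my assumptions for illustration.

```python
import math

def acd_loglik(x_star, omega, alpha, beta):
    """Exponential log-likelihood of ACD(1,1): sum of -log(psi_j) - x*_j / psi_j."""
    psi = omega / (1.0 - alpha - beta)       # assumed start-up value for psi_1
    total = 0.0
    for x in x_star:
        total += -math.log(psi) - x / psi
        psi = omega + beta * psi + alpha * x # equation (1) for the next observation
    return total

def garch_normal_loglik(resid, omega, alpha, beta):
    """Normal log-likelihood of GARCH(1,1) for residuals eps_t = y_t - mu_t."""
    sigma2 = omega / (1.0 - alpha - beta)    # same start-up convention
    total = 0.0
    for e in resid:
        total += -0.5 * math.log(2 * math.pi) - 0.5 * math.log(sigma2) - 0.5 * e * e / sigma2
        sigma2 = omega + beta * sigma2 + alpha * e * e
    return total

# Hypothetical residuals; their squares play the role of durations
resid = [0.5, -1.2, 0.3, 2.0, -0.7]
params = (0.1, 0.1, 0.8)
lhs = garch_normal_loglik(resid, *params)
rhs = -0.5 * len(resid) * math.log(2 * math.pi) + 0.5 * acd_loglik([e * e for e in resid], *params)
print(lhs, rhs)
```

The two printed numbers agree: the normal GARCH log-likelihood equals a constant plus one half of the ACD log-likelihood evaluated at the squared residuals, so both are maximized by the same $(\omega,\alpha,\beta).$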