Jun 21

Solution to Question 1 from UoL exam 2020

The assessment was an open-book, take-home online exam with a 24-hour window. No attempt was made to prevent cheating, apart from a warning, which was a realistic policy. Before an exam it is a good idea to go through my checklist.

Question 1. Consider the following ARMA(1,1) process:

(1) z_{t}=\gamma +\alpha z_{t-1}+\varepsilon _{t}+\theta \varepsilon _{t-1}

where \varepsilon _{t} is a zero-mean white noise process with variance \sigma ^{2}, and assume |\alpha |,|\theta |<1 and \alpha+\theta \neq 0, which together make sure z_{t} is covariance stationary.

(a) [20 marks] Calculate the conditional and unconditional means of z_{t}, that is, E_{t-1}[z_{t}] and E[z_{t}].

(b) [20 marks] Set \alpha =0. Derive the autocovariance and autocorrelation function of this process for all lags as functions of the parameters \theta and \sigma .

(c) [30 marks] Assume now \alpha \neq 0. Calculate the conditional and unconditional variances of z_{t}, that is, Var_{t-1}[z_{t}] and Var[z_{t}].

Hint: for the unconditional variance, you might want to start by deriving the unconditional covariance between the variable and the innovation term, i.e., Cov[z_{t},\varepsilon _{t}].

(d) [30 marks] Derive the autocovariance and autocorrelation for lags of 1 and 2 as functions of the parameters of the model.

Hint: use the hint of part (c).


Part (a)

Reminder: The definition of a zero-mean white noise process is

(2) E\varepsilon _{t}=0, Var(\varepsilon _{t})=E\varepsilon_{t}^{2}=\sigma ^{2} for all t and Cov(\varepsilon _{j},\varepsilon_{i})=E\varepsilon _{j}\varepsilon _{i}=0 for all i\neq j.

A variable indexed t-1 is known at moment t-1 and at all later moments; for conditioning at such moments it behaves like a constant.

Moment t is in the future relative to t-1. The future is unpredictable, and the best guess about the future error is zero.

The recurrent relationship in (1) shows that

(3) z_{t-1}=\gamma +\alpha z_{t-2}+... does not depend on the information that arrives at time t and later.

Hence, using also linearity of conditional means,

(4) E_{t-1}z_{t}=E_{t-1}\gamma +\alpha E_{t-1}z_{t-1}+E_{t-1}\varepsilon _{t}+\theta E_{t-1}\varepsilon _{t-1}=\gamma +\alpha z_{t-1}+\theta\varepsilon _{t-1}.

The law of iterated expectations (LIE): application of E_{t-1}, based on information available at time t-1, and subsequent application of E, based on no information, gives the same result as application of E.

Ez_{t}=E[E_{t-1}z_{t}]=E\gamma +\alpha Ez_{t-1}+\theta E\varepsilon _{t-1}=\gamma +\alpha Ez_{t-1}.

Since z_{t} is covariance stationary, its mean is the same at all times, so Ez_{t}=\gamma +\alpha Ez_{t} and Ez_{t}=\frac{\gamma }{1-\alpha }.
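
As a quick numerical sanity check (a sketch with illustrative parameter values, not part of the exam solution), one can simulate (1) and compare the sample mean with \gamma /(1-\alpha ):

```python
import random
import statistics

# Illustrative parameter values (assumptions, not from the exam)
gamma, alpha, theta, sigma = 1.0, 0.5, 0.3, 1.0

random.seed(0)
n = 200_000
z = gamma / (1 - alpha)  # start at the unconditional mean
eps_prev = 0.0
sample = []
for _ in range(n):
    eps = random.gauss(0.0, sigma)
    z = gamma + alpha * z + eps + theta * eps_prev
    eps_prev = eps
    sample.append(z)

# The sample mean should be close to gamma / (1 - alpha) = 2.0
print(statistics.fmean(sample), gamma / (1 - alpha))
```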

Part (b)

With \alpha =0 we get z_{t}=\gamma +\varepsilon _{t}+\theta\varepsilon _{t-1} and from part (a) Ez_{t}=\gamma . Using (2), we find variance

Var(z_{t})=E(z_{t}-Ez_{t})^{2}=E(\varepsilon _{t}^{2}+2\theta \varepsilon_{t}\varepsilon _{t-1}+\theta ^{2}\varepsilon _{t-1}^{2})=(1+\theta^{2})\sigma ^{2}

and first autocovariance

(5) \gamma_{1}=Cov(z_{t},z_{t-1})=E(z_{t}-Ez_{t})(z_{t-1}-Ez_{t-1})=E(\varepsilon_{t}+\theta \varepsilon _{t-1})(\varepsilon _{t-1}+\theta \varepsilon_{t-2})=\theta E\varepsilon _{t-1}^{2}=\theta \sigma ^{2}.

Second and higher autocovariances are zero because for j\geq 2 the epsilon subscripts in z_{t} and z_{t-j} do not overlap.

Autocorrelation function: \rho _{0}=\frac{Cov(z_{t},z_{t})}{\sqrt{Var(z_{t})Var(z_{t})}}=1 (this is always true),

\rho _{1}=\frac{Cov(z_{t},z_{t-1})}{\sqrt{Var(z_{t})Var(z_{t-1})}}=\frac{\theta \sigma ^{2}}{(1+\theta ^{2})\sigma ^{2}}=\frac{\theta }{1+\theta ^{2}}, \rho _{j}=0 for j>1.

This is characteristic of MA processes: their autocorrelations are exactly zero beyond the MA order (here, for lags greater than 1).
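
To see these formulas in action, here is a small simulation (parameter values are illustrative) of the MA(1) case, comparing sample autocorrelations with \theta /(1+\theta ^{2}) at lag 1 and with zero at lag 2:

```python
import random

random.seed(1)
theta, sigma, n = 0.6, 1.0, 200_000  # illustrative values

# Simulate z_t = eps_t + theta * eps_{t-1}; the constant gamma is dropped
# since it does not affect autocorrelations.
eps = [random.gauss(0.0, sigma) for _ in range(n + 1)]
z = [eps[t] + theta * eps[t - 1] for t in range(1, n + 1)]

def acf(x, lag):
    """Sample autocorrelation at a given lag (the process mean is zero here)."""
    m = len(x)
    num = sum(x[t] * x[t - lag] for t in range(lag, m)) / (m - lag)
    den = sum(v * v for v in x) / m
    return num / den

print(acf(z, 1), theta / (1 + theta**2))  # both close to 0.6 / 1.36
print(acf(z, 2))                          # close to 0
```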

Part (c)

If we replace all expectations in the definition of variance by conditional expectations, we obtain the definition of conditional variance. From (1) and (4)

Var_{t-1}(z_{t})=E_{t-1}(z_{t}-E_{t-1}z_{t})^{2}=E_{t-1}\varepsilon_{t}^{2}=\sigma ^{2}.

By the law of total variance

(6) Var(z_{t})=EVar_{t-1}(z_{t})+Var(E_{t-1}z_{t})=\sigma ^{2}+Var(\gamma+\alpha z_{t-1}+\theta \varepsilon _{t-1})=

(an additive constant does not affect variance)

=\sigma ^{2}+Var(\alpha z_{t-1}+\theta \varepsilon _{t-1})=\sigma^{2}+\alpha ^{2}Var(z_{t})+2\alpha \theta Cov(z_{t-1},\varepsilon_{t-1})+\theta ^{2}Var(\varepsilon _{t-1}).

By the LIE and (3)

Cov(z_{t-1},\varepsilon _{t-1})=Cov(\gamma +\alpha z_{t-2}+\varepsilon _{t-1}+\theta \varepsilon _{t-2},\varepsilon _{t-1})=\alpha Cov(z_{t-2},\varepsilon _{t-1})+E\varepsilon _{t-1}^{2}+\theta E\varepsilon _{t-2}\varepsilon _{t-1}=\sigma ^{2}+\theta E(\varepsilon _{t-2}E_{t-2}\varepsilon _{t-1}),

where Cov(z_{t-2},\varepsilon _{t-1})=0 by (3) and the last equality uses the LIE.

Here E_{t-2}\varepsilon _{t-1}=0, so

(7) Cov(z_{t-1},\varepsilon _{t-1})=\sigma ^{2}.

This equation leads to

Var(z_{t})=Var(\gamma +\alpha z_{t-1}+\varepsilon _{t}+\theta \varepsilon _{t-1})=\alpha ^{2}Var(z_{t-1})+Var(\varepsilon _{t})+\theta ^{2}Var(\varepsilon _{t-1})+

+2\alpha Cov(z_{t-1},\varepsilon _{t})+2\alpha \theta Cov(z_{t-1},\varepsilon _{t-1})+2\theta Cov(\varepsilon _{t},\varepsilon _{t-1})=\alpha ^{2}Var(z_{t})+\sigma ^{2}+\theta ^{2}\sigma ^{2}+2\alpha \theta \sigma ^{2}

and, finally,

(8) Var(z_{t})=\frac{(1+2\alpha \theta +\theta ^{2})\sigma ^{2}}{1-\alpha ^{2}}.
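
A simulation check of (7) and (8) under illustrative parameter values (this sketch is not part of the solution):

```python
import random

random.seed(2)
alpha, theta, sigma, n = 0.5, 0.3, 1.0, 300_000  # illustrative values

# gamma = 0 here: an additive constant affects neither variances nor covariances.
z_prev, eps_prev = 0.0, 0.0
zs, es = [], []
for _ in range(n):
    eps = random.gauss(0.0, sigma)
    z_prev = alpha * z_prev + eps + theta * eps_prev
    eps_prev = eps
    zs.append(z_prev)
    es.append(eps)

var_z = sum(v * v for v in zs) / n                  # sample Var(z_t), mean is 0
cov_z_eps = sum(z * e for z, e in zip(zs, es)) / n  # sample Cov(z_t, eps_t)

print(var_z, (1 + 2 * alpha * theta + theta**2) * sigma**2 / (1 - alpha**2))
print(cov_z_eps, sigma**2)  # equation (7)
```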

Part (d)

From (7)

(9) Cov(z_{t-1},\varepsilon _{t-2})=Cov(\gamma +\alpha z_{t-2}+\varepsilon _{t-1}+\theta \varepsilon _{t-2},\varepsilon _{t-2})=\alpha Cov(z_{t-2},\varepsilon _{t-2})+\theta Var(\varepsilon _{t-2})=(\alpha +\theta )\sigma ^{2}.

It follows that

Cov(z_{t},z_{t-1})=Cov(\gamma +\alpha z_{t-1}+\varepsilon _{t}+\theta \varepsilon _{t-1},\gamma +\alpha z_{t-2}+\varepsilon _{t-1}+\theta \varepsilon _{t-2})=

(a constant is not correlated with anything)

=\alpha ^{2}Cov(z_{t-1},z_{t-2})+\alpha Cov(z_{t-1},\varepsilon _{t-1})+\alpha \theta Cov(z_{t-1},\varepsilon _{t-2})+

+\alpha Cov(\varepsilon _{t},z_{t-2})+Cov(\varepsilon _{t},\varepsilon _{t-1})+\theta Cov(\varepsilon _{t},\varepsilon _{t-2})+

+\theta \alpha Cov(\varepsilon _{t-1},z_{t-2})+\theta Var(\varepsilon _{t-1})+\theta ^{2}Cov(\varepsilon _{t-1},\varepsilon _{t-2}).

From (7) Cov(z_{t-2},\varepsilon _{t-2})=\sigma ^{2} and from (9) Cov(z_{t-1},\varepsilon _{t-2})=(\alpha +\theta )\sigma ^{2}.

From (3) Cov(\varepsilon _{t},z_{t-2})=Cov(\varepsilon _{t-1},z_{t-2})=0.

Using also the white noise properties and stationarity of z_{t}

Cov(z_{t},z_{t-1})=Cov(z_{t-1},z_{t-2})=\gamma _{1},

we are left with

\gamma _{1}=\alpha ^{2}\gamma _{1}+\alpha \sigma ^{2}+\alpha \theta (\alpha +\theta )\sigma ^{2}+\theta \sigma ^{2}=\alpha ^{2}\gamma _{1}+(1+\alpha \theta )(\alpha +\theta )\sigma ^{2},

\gamma _{1}=\frac{(1+\alpha \theta )(\alpha +\theta )\sigma ^{2}}{1-\alpha ^{2}}

and using (8)

\rho _{0}=1, \rho _{1}=\frac{(1+\alpha \theta )(\alpha +\theta )}{1+2\alpha \theta +\theta ^{2}}.

The finish line is close.

Cov(z_{t},z_{t-2})=Cov(\gamma +\alpha z_{t-1}+\varepsilon _{t}+\theta \varepsilon _{t-1},\gamma +\alpha z_{t-3}+\varepsilon _{t-2}+\theta \varepsilon _{t-3})=

=\alpha ^{2}Cov(z_{t-1},z_{t-3})+\alpha Cov(z_{t-1},\varepsilon _{t-2})+\alpha \theta Cov(z_{t-1},\varepsilon _{t-3})+

+\alpha Cov(\varepsilon _{t},z_{t-3})+Cov(\varepsilon _{t},\varepsilon _{t-2})+\theta Cov(\varepsilon _{t},\varepsilon _{t-3})+

+\theta \alpha Cov(\varepsilon _{t-1},z_{t-3})+\theta Cov(\varepsilon _{t-1},\varepsilon _{t-2})+\theta ^{2}Cov(\varepsilon _{t-1},\varepsilon _{t-3}).

This simplifies to

(10) Cov(z_{t},z_{t-2})=\alpha ^{2}Cov(z_{t-1},z_{t-3})+\alpha (\alpha +\theta )\sigma ^{2}+\alpha \theta Cov(z_{t-1},\varepsilon _{t-3}).

By (7),

Cov(z_{t-1},\varepsilon _{t-3})=Cov(\gamma +\alpha z_{t-2}+\varepsilon _{t-1}+\theta \varepsilon _{t-2},\varepsilon _{t-3})=\alpha Cov(z_{t-2},\varepsilon _{t-3})=

=\alpha Cov(\gamma +\alpha z_{t-3}+\varepsilon _{t-2}+\theta \varepsilon _{t-3},\varepsilon _{t-3})=\alpha (\alpha \sigma ^{2}+\theta \sigma ^{2})=\alpha (\alpha +\theta )\sigma ^{2}.

Finally, using (10) and stationarity,

\gamma _{2}=\alpha ^{2}\gamma _{2}+\alpha (\alpha +\theta )\sigma ^{2}+\alpha ^{2}\theta (\alpha +\theta )\sigma ^{2}=\alpha ^{2}\gamma _{2}+\alpha (1+\alpha \theta )(\alpha +\theta )\sigma ^{2},

\gamma _{2}=\frac{\alpha (1+\alpha \theta )(\alpha +\theta )\sigma ^{2}}{1-\alpha ^{2}}=\alpha \gamma _{1},

\rho _{2}=\frac{\alpha (1+\alpha \theta )(\alpha +\theta )}{1+2\alpha \theta +\theta ^{2}}=\alpha \rho _{1}.
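
For an ARMA(1,1), autocovariances satisfy \gamma _{k}=\alpha \gamma _{k-1} for k\geq 2, so \rho _{2}=\alpha \rho _{1}. A quick simulation with illustrative parameters (a sketch, not part of the solution) confirms the lag-1 and lag-2 formulas:

```python
import random

random.seed(3)
alpha, theta, sigma, n = 0.5, 0.3, 1.0, 300_000  # illustrative values

z_prev, eps_prev = 0.0, 0.0  # gamma = 0: correlations are unaffected by it
zs = []
for _ in range(n):
    eps = random.gauss(0.0, sigma)
    z_prev = alpha * z_prev + eps + theta * eps_prev
    eps_prev = eps
    zs.append(z_prev)

def rho(x, lag):
    """Sample autocorrelation at a given lag (process mean is zero here)."""
    m = len(x)
    num = sum(x[t] * x[t - lag] for t in range(lag, m)) / (m - lag)
    return num / (sum(v * v for v in x) / m)

rho1 = (1 + alpha * theta) * (alpha + theta) / (1 + 2 * alpha * theta + theta**2)
print(rho(zs, 1), rho1)          # lag 1: simulation vs theory
print(rho(zs, 2), alpha * rho1)  # lag 2: rho_2 = alpha * rho_1
```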

A couple of errors have been corrected on June 22, 2021. Hope this is final.

May 17

Stationary processes 1

Along with examples of nonstationary processes, it is necessary to know a couple of examples of stationary processes.

Example 1. In the model with a time trend, suppose that there is no time trend, that is, b=0. The result is white noise shifted by a constant a, and it is seen to be stationary.

Example 2. Let us change the random walk slightly, by introducing a coefficient \beta for the first lag:

(1) y_t=\beta y_{t-1}+u_t

where u_t is, as before, white noise:

(2) Eu_t=0, Eu_t^2=\sigma^2 for all t and Eu_tu_s=0 for all t\ne s.

This is an autoregressive process of order 1, denoted AR(1).

Stability condition: |\beta|<1.

By now you should be familiar with recurrent substitution. (1) for the previous period looks like this:

(3) y_{t-1}=\beta y_{t-2}+u_{t-1}.

Plugging (3) in (1) we get y_t=\beta^2y_{t-2}+\beta u_{t-1}+u_t. After doing this k times we obtain

(4) y_t=\beta^ky_{t-k}+\beta^{k-1}u_{t-k+1}+...+\beta u_{t-1}+u_t.

To avoid errors in calculations like this, note that in the product \beta^{k-1}u_{t-k+1} the sum of the power of \beta and the subscript of u is always t.

In Example 1 the range of time moments didn't matter because the model wasn't dynamic. In the random walk example we had to assume that t takes all positive integer values. In the current situation we have to assume that t takes all integer values or, to put it differently, that the process y_t extends infinitely to plus and minus infinity. Then we can take advantage of the stability condition. Letting k\rightarrow\infty (and therefore t-k\rightarrow-\infty) we see that the first term on the right-hand side of (4) tends to zero and the sum becomes infinite:

(5) y_t=...+\beta^{k-1}u_{t-k+1}+...+\beta u_{t-1}+u_t=\sum_{j=0}^\infty\beta^ju_{t-j}.

We have shown that this representation follows from (1). Conversely, one can show that (5) implies (1). (5) is an infinite moving average, denoted MA(\infty).

It can be used to check that (1) is stationary. Obviously, the first condition of a stationary process is satisfied: Ey_t=0. For the second one we have (use (2)):

(6) Var(y_t)=Ey_t^2=E(...+\beta^{k-1}u_{t-k+1}+...+\beta u_{t-1}+u_t)(...+\beta^{k-1}u_{t-k+1}+...+\beta u_{t-1}+u_t)=\sum_{j=0}^\infty\beta^{2j}Eu_{t-j}^2=\frac{\sigma^2}{1-\beta^2}

(by (2), the cross terms have zero expectation), which doesn't depend on t.

Exercise. To make sure that you understand (6), similarly prove that

(7) Cov(y_t,y_s)=\beta^{|t-s|}\frac{\sigma^2}{1-\beta^2}.

Without loss of generality, you can assume that t>s. (7) is a function of the distance in time between t,s, as required.
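
A simulation sketch of (7), with an illustrative \beta, comparing sample autocovariances of an AR(1) with \beta^{|t-s|}\frac{\sigma^2}{1-\beta^2}:

```python
import random

random.seed(4)
beta, sigma, n = 0.7, 1.0, 300_000  # illustrative values, |beta| < 1

y, ys = 0.0, []
for _ in range(n):
    y = beta * y + random.gauss(0.0, sigma)  # the AR(1) recursion (1)
    ys.append(y)

def autocov(x, lag):
    """Sample autocovariance at a given lag (process mean is zero)."""
    m = len(x)
    return sum(x[t] * x[t - lag] for t in range(lag, m)) / (m - lag)

for lag in range(4):
    print(lag, autocov(ys, lag), beta**lag * sigma**2 / (1 - beta**2))
```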

May 16

What is a stationary process?

What is a stationary process? More precisely, this post is about a discrete weakly stationary process. This topic is not exactly beginner Stats; I am posting it to help those who study Econometrics using Introduction to Econometrics by Christopher Dougherty, published by Oxford University Press, UK, in 2016.

Point of view. At discrete moments in time t, we observe some random variables X_t. X_t can be, for example, periodic temperature measurements in a certain location. You can imagine a straight line with moments t labeled on it and, for each t, some variable X_t attached to it. In general, X_t may have different distributions, and in theory the time moments may extend infinitely to the left and right.

Definition. We say that the collection \{X_t\} is (weakly) stationary if it satisfies three conditions:

  1. The means EX_t are constant (that is, do not depend on t),
  2. The variances Var(X_t) are also constant (same thing, they do not depend on t), and
  3. The covariances Cov(X_t,X_s)=f(|t-s|) depend only on the distance in time between two moments t,s.

Regarding the last condition, recall the visualization of the process, with random variables sticking out of points in time, and the fact that the distance between two moments t,s is given by the absolute value |t-s|. The condition Cov(X_t,X_s)=f(|t-s|) says that the covariance between X_t,X_s is some (unspecified) function of this distance. It should not depend on any of the moments t,s themselves.

If you want a complex definition to stay in your memory, you have to chew and digest it. The best thing to do is to prove a couple of properties.

Main property. A sum of two independent stationary processes is also stationary.

Proof. The assumption is that each variable in the collection \{X_t\} is independent of each variable in the collection \{Y_t\}. We need to check that \{X_t+Y_t\} satisfies the definition of a stationary process.

Obviously, E(X_t+Y_t)=EX_t+EY_t is constant.

Similarly, by independence we have Var(X_t+Y_t)=Var(X_t)+Var(Y_t), so variance of the sum is constant.

Finally, using properties of covariance,

Cov(X_t+Y_t,X_s+Y_s)=Cov(X_t,X_s)+Cov(X_t,Y_s)+Cov(Y_t,X_s)+Cov(Y_t,Y_s)=

(two terms disappear by independence)

=Cov(X_t,X_s)+Cov(Y_t,Y_s)

(each covariance depends only on |t-s|, so their sum depends only on |t-s|).

Conclusion. You certainly know that 0+0=0. The above property is similar to this:

stationary process + stationary process = stationary process

(under independence). Now you can understand the role of stationary processes in the set of all processes: they play the role of zero. That is to say, the process \{X_t\} is not very different from the process \{Y_t\} if their difference is stationary.

Generalization. Any linear combination of independent stationary processes is stationary.
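
The main property can be illustrated numerically. The sketch below (with illustrative AR(1) and MA(1) components) checks that each sample autocovariance of the sum is close to the sum of the components' autocovariances:

```python
import random

random.seed(5)
n = 200_000

# Two independent stationary processes (illustrative choices): AR(1) and MA(1)
x, eps_prev = 0.0, 0.0
xs, ms = [], []
for _ in range(n):
    x = 0.5 * x + random.gauss(0.0, 1.0)  # AR(1), stationary since |0.5| < 1
    xs.append(x)
    eps = random.gauss(0.0, 1.0)          # independent of the AR(1) shocks
    ms.append(eps + 0.4 * eps_prev)       # MA(1), always stationary
    eps_prev = eps

zs = [a + b for a, b in zip(xs, ms)]      # the sum process

def autocov(v, lag):
    """Sample autocovariance at a given lag."""
    mean = sum(v) / len(v)
    return sum((v[t] - mean) * (v[t - lag] - mean)
               for t in range(lag, len(v))) / (len(v) - lag)

for lag in (0, 1, 2):
    print(lag, autocov(zs, lag), autocov(xs, lag) + autocov(ms, lag))
```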