8
May 16

What is cointegration?

What is cointegration? The discussions here and here  are bad because they link the definition to differencing a time series. In fact, to understand cointegration, you need two notions: stationary processes  (please read before continuing) and linear dependence.

Definition. We say that vectors X_1,...,X_n are linearly dependent if there exist numbers a_1,...,a_n, not all of which are zero, such that the linear combination a_1X_1+...+a_nX_n is a zero vector.

Recall from this post that stationary processes play the role of zero in the set of all processes. Replace in the above definition "vectors" with "processes" and "a zero vector" with "a stationary process" and - voilà - you have the definition of cointegration:

Definition. We say that processes X_1,...,X_n are cointegrated if there exist numbers a_1,...,a_n, not all of which are zero, such that the linear combination a_1X_1+...+a_nX_n is a stationary process. Remembering that each process is a collection of random variables indexed with time moments t, we obtain a definition that explicitly involves time: processes \{X_{1,t}\},...,\{X_{n,t}\} are cointegrated if there exist numbers a_1,...,a_n, not all of which are zero, such that a_1X_{1,t}+...+a_nX_{n,t}=u_t where \{u_t\} is a stationary process.

To fully understand the implications, you need to know all the intricacies of linear dependence. I do not want to plunge into this lengthy discussion here. Instead, I want to explain how this definition leads to a regression in case of two processes.

If \{X_{1,t}\},\{X_{2,t}\} are cointegrated, then there exist numbers a_1,a_2, at least one of which is not zero, such that a_1X_{1,t}+a_2X_{2,t}=u_t where \{u_t\} is a stationary process. If a_1\ne 0, we can solve for X_{1,t} obtaining X_{1,t}=\beta X_{2,t}+v_t with \beta=-a_2/a_1 and v_t=1/a_1u_t. This is almost a regression, except that the mean of v_t may not be zero. We can represent v_t=(v_t-Ev_t)+Ev_t=w_t+\alpha, where \alpha=Ev_t, w_t=v_t-Ev_t. Then the above equation becomes X_{1,t}=\alpha+\beta X_{2,t}+w_t, which is simple regression. The case a_2\ne 0 leads to a similar result.

Practical recommendation. To see if \{X_{1,t}\},\{X_{2,t}\} are cointegrated, regress one of them on the other and test the residuals for stationarity.