Cointegration and Error Correction Models
The Cointegration Testing topic discusses how to test for cointegration. Here, we describe how to impose cointegration on a VAR model. The standard VAR model
\begin{equation} {\bf{y}}_t = \sum\limits_{s = 1}^L {\Phi _s {\bf{y}}_{t - s} } + {\bf{u}}_t \label{eq:var_lagrep} \end{equation}
can always be rewritten in the form
\begin{equation} \Delta {\bf{y}}_t = \sum\limits_{s = 1}^{L - 1} {\Phi _s^* \Delta {\bf{y}}_{t - s} } + \Pi {\bf{y}}_{t - 1} + {\bf{u}}_t \label{eq:var_cointvar} \end{equation}
where \(\Delta {\bf{y}}_t \) is the first difference of the \(\bf{y}\) vector (that is, \({\bf{y}}_t - {\bf{y}}_{t - 1} \)). If \(\Pi\) is zero, then the VAR can be modeled adequately in first differences. If \(\Pi\) is not zero, then, even if each component of \(\bf{y}\) has a unit root, a VAR in first differences is misspecified.
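In terms of the coefficients of \eqref{eq:var_lagrep}, the reparameterization in \eqref{eq:var_cointvar} is obtained by matching coefficients on each lag of \({\bf{y}}\):
\[ \Pi = \sum\limits_{s = 1}^L {\Phi _s } - I\quad {\text{and}}\quad \Phi _s^* = - \sum\limits_{j = s + 1}^L {\Phi _j } ,\;\;s = 1, \ldots ,L - 1 \]
so \(\Pi\) is zero precisely when the lag coefficients in \eqref{eq:var_lagrep} sum to the identity matrix.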
If \(\Pi\) is full-rank, there is nothing to be gained by writing the system in the form \eqref{eq:var_cointvar} rather than \eqref{eq:var_lagrep}: the two are equivalent. The interesting case is where \(\Pi\) is non-zero but less than full rank. In that case, we can write the matrix as
\begin{equation} \Pi = \alpha \beta ' \label{eq:var_cointpidef} \end{equation}
where \(\alpha\) and \(\beta\) are \(N \times r\) matrices (\(N\) is the number of components in \(\bf{y}\) and \(r\) is the rank of \(\Pi\)). Note that the decomposition in \eqref{eq:var_cointpidef} isn't unique: for any \(r \times r\) non-singular matrix \(\bf{G}\), it is also true that
\begin{equation} \Pi = \left( {\alpha {\bf{G}}} \right)\left( {\beta \left( {{\bf{G}}'} \right)^{ - 1} } \right)^\prime \label{eq:var_cointbetaunid} \end{equation}
When \(r\) is one, however, \({\bf{G}}\) is just a non-zero scalar, so the decomposition \eqref{eq:var_cointpidef} is unique up to a scale factor in the two parts.
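As a concrete illustration of this with \(N = 2\) and \(r = 1\), the rank-one matrix
\[ \Pi = \left[ {\begin{array}{*{20}c} { - 0.4} & {0.4} \\ {0.2} & { - 0.2} \\ \end{array}} \right] = \left[ {\begin{array}{*{20}c} { - 0.4} \\ {0.2} \\ \end{array}} \right]\left[ {\begin{array}{*{20}c} 1 & { - 1} \\ \end{array}} \right] \]
can equally well be written with \(\alpha = ( - 0.2,\,0.1)'\) and \(\beta = (2,\, - 2)'\): only the product \(\alpha \beta '\) is identified, so the stationary combination \(\beta '{\bf{y}}\) is determined only up to scale.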
The instruction ECT (added to the SYSTEM definition) allows you to estimate the VAR in the error correction form \eqref{eq:var_cointvar}, imposing a reduced rank assumption on the \(\Pi\) term. The \(\beta '{\bf{y}}\) represent stationary linear combinations of the \({\bf{y}}\) variables. If the \({\bf{y}}\)’s themselves are non-stationary, they are said to be cointegrated. The instruction name (which is short for Error Correction Terms) comes from the fact that \eqref{eq:var_cointvar} is sometimes called the error correction form of the VAR. See, for instance, Chapter 6 of Enders (2014) or Chapter 19 of Hamilton (1994).
Setting up the VAR
To estimate a VAR in error correction form, start out as if you were doing the standard form \eqref{eq:var_lagrep}. That is, your variables aren't differenced, and you count the number of lags you want on the undifferenced variables. Create the error correction equations, and use ECT to add them to the system. You don't have to fill in the coefficients when setting up the VAR, but you at least need to have created the equations.
The VAR system works somewhat differently in error correction form. The regression output, coefficient vectors, and covariance matrices will be based upon the restricted form. When the MODEL is used in a forecasting instruction such as IMPULSE or FORECAST, it is substituted back into form \eqref{eq:var_lagrep}, so that it analyzes the original variables and not the differenced ones.
In general, you will use ESTIMATE to compute the coefficients for the model. You can use KALMAN to estimate the parameters sequentially, but the Kalman filter does not apply to the cointegrating vector \(\beta\)—it takes \(\beta\) as given by the coefficients in the error correction equation(s), and estimates only the \(\Phi _s^* \) and \(\alpha\). If you are estimating these coefficients yourself (other than the least squares conditional on \(\beta\) that ESTIMATE does), use the function %MODELSETCOEFFS(model,coeffs). The coefficient matrix coeffs is a RECTANGULAR matrix. Each column gives the coefficients for an equation. In a given equation, the first coefficients are the \(\Phi _s^* \). Note that there are \(L-1\) lags for each endogenous variable, not \(L\), since these equations are in differenced form. Next are the variables listed on the DETERMINISTIC instruction, if any. The final coefficients are the loadings \(\alpha\) on the error correction terms.
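As a purely hypothetical sketch of that layout (the model name MYMODEL, the dimensions, and all of the numbers below are for illustration only, not taken from the example files): with two endogenous variables, three lags, a CONSTANT and one error correction term, each equation has \(2 \times (3 - 1) + 1 + 1 = 6\) coefficients, so coeffs is a 6 by 2 RECTANGULAR with one column per equation (the four \(\Phi _s^* \) terms, then the constant, then the single \(\alpha\) loading).
* Hypothetical model name and placeholder values, for illustration only
compute coeffs=||0.30,0.10|0.05,0.20|0.10,0.00|0.00,0.15|0.02,0.01|-0.20,0.10||
compute %modelsetcoeffs(mymodel,coeffs)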
Setting the Cointegrating Vectors
The cointegrating vectors \(\beta\) can either be set from theory or estimated. If you want to set them from theory, use the EQUATION instruction to create the equation, and use the COEFFS option on EQUATION or the %EQNSETCOEFFS function to set the coefficients. For instance, in the COINTTST.RPF example, we test whether Purchasing Power Parity is a cointegrating relationship. If (despite the results to the contrary in this dataset) we were to proceed with imposing this during estimation, we would do the following:
equation(coeffs=||1.0,-1.0,-1.0||) rexeq rexrate
# pusa s pita
system(model=pppmodel)
variables pusa s pita
lags 1 to 12
det constant
ect rexeq
end(system)
estimate
In this case, the error correction equation shows the desired linear combination of the endogenous variables. If, instead, the coefficients of the equilibrium equation are estimated by an auxiliary regression, the error correction equation will have one of the endogenous variables as its dependent variable. For the example above, you would just replace the EQUATION instruction (and supplementary card) with:
linreg(define=rexeq) pusa
# s pita
If you’re estimating the cointegrating vector using a procedure which doesn't normalize a coefficient to one (such as @JOHMLE), define the equation without a dependent variable:
equation(coeffs=cvector) rexeq *
# pusa s pita
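However the cointegrating equation is defined, once the system has been estimated the MODEL can be used directly in instructions such as IMPULSE, which (as noted earlier) substitute the error correction form back into the levels form \eqref{eq:var_lagrep}. A sketch, with an illustrative horizon and result array name:
* Illustrative only: 24-step responses from the estimated PPPMODEL
impulse(model=pppmodel,steps=24,results=impresp)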
Error Correction Model (Example ECT.RPF)
The example file ECT.RPF analyzes a set of three interest rate variables, first testing for cointegration, then imposing it. It demonstrates several ways to estimate the cointegrating vector.
The simplest is to run an “Engle-Granger” regression (one of the yields on the other two). While the estimates from that are “super-consistent” if the series are cointegrated, there are other ways to get better small-sample performance. The example uses the @JOHMLE procedure, which does maximum likelihood estimation. Other procedures for estimating a cointegrating vector or vectors are @FM (fully-modified least squares) and @SWDOLS (dynamic OLS). These are all included in the Time Series—Cointegration Estimation wizard.
Because the series don’t have a trend, the appropriate choice for the deterministic component on @JOHMLE is DET=RC, which doesn’t include a constant in the individual equations (where it would cause the series to drift because of the unit roots), but restricts it to the cointegrating vector. The cointegrating vector produced by @JOHMLE with this choice will have four components: the three variables plus the constant. If we had used the default DET=CONSTANT, it would only be a 3-vector, and we would also need to include DET CONSTANT in the SYSTEM definition. Be careful with your choice for the DET option, as it’s a common error to select the wrong one. Note that there is no reason that the error correction term(s) will have mean zero; the requirement is that they be stationary, which includes processes with non-zero means.
@johmle(lags=6,det=rc,cv=cvector)
# ftbs3 ftb12 fcm7
equation(coeffs=cvector) ecteq *
# ftbs3 ftb12 fcm7 constant
This sets up the model with the error correction term. Note that, as described above, there is no DET CONSTANT in the system, because the constant is included in the cointegrating vector.
system(model=ectmodel)
variables ftbs3 ftb12 fcm7
lags 1 to 6
ect ecteq
end(system)
This estimates the model and computes the decomposition of variance:
estimate
compute sigma=%sigma
errors(model=ectmodel,steps=36)
ESTIMATE applied to a model with ECT elements defines the matrices %VECMPI and %VECMALPHA. %VECMPI is the estimated value of \(\Pi\) from \eqref{eq:var_cointvar} and %VECMALPHA contains the \(\alpha\) loadings from \eqref{eq:var_cointpidef}. Note well that the lag sums (used in the Blanchard-Quah and the short- and long-run restrictions) are much more complicated for a VECM than for a regular VAR, as the long-run response matrix isn't full rank. (If x and y are cointegrated and x has a zero long-run response, then y must as well.) See the SHORTANDLONGVECM.RPF example for the proper way to handle this.
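For comparison, if we had used the default DET=CONSTANT on @JOHMLE, as discussed above, the cointegrating vector would have only the three variables, and the constant would instead enter the system through DET CONSTANT. A sketch of that alternative setup (the names CVECTORC, ECTEQC and ECTMODELC are just labels for this illustration):
* Alternative (illustrative) setup: constant in the system, not in the vector
@johmle(lags=6,det=constant,cv=cvectorc)
# ftbs3 ftb12 fcm7
equation(coeffs=cvectorc) ecteqc *
# ftbs3 ftb12 fcm7
system(model=ectmodelc)
variables ftbs3 ftb12 fcm7
lags 1 to 6
det constant
ect ecteqc
end(system)
estimate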
If we needed two cointegrating vectors rather than one (in this case, we really don’t), we would need to use the VECTORS option on @JOHMLE to get the full set of eigenvectors, then %XCOL to pull out the coefficients for each of the two equations that we would need to define. For instance,
@johmle(lags=6,det=rc,vectors=cvectors)
# ftbs3 ftb12 fcm7
equation(coeffs=%xcol(cvectors,1)) ect1 *
# ftbs3 ftb12 fcm7 constant
equation(coeffs=%xcol(cvectors,2)) ect2 *
# ftbs3 ftb12 fcm7 constant
*
system(model=ect2model)
variables ftbs3 ftb12 fcm7
lags 1 to 6
ect ect1 ect2
end(system)
estimate
errors(model=ect2model,steps=36)
Copyright © 2025 Thomas A. Doan