VAR: Setting Up

Formally, a vector autoregression may be written

\begin{equation} {\bf{y}}_t = {\bf{X}}_t \beta + \sum\limits_{s = 1}^p {\Phi _s {\bf{y}}_{t - s} } + {\bf{u}}_t \,\,\,\,\,\,\,\,\,\,\,E\left( {{\bf{u}}_t {\bf{u'}}_t } \right) = \Sigma \label{eq:VARBasic} \end{equation}

where \({\bf{y}}\) is an \(N\)-vector of variables and each \({\Phi _s }\) is an \(N \times N\) matrix. There are a total of \(N^2p\) free coefficients on the lags, plus whatever are included in \(\beta\). The maintained assumption is that \({\bf{u}}_t\) is uncorrelated with the regressors in \eqref{eq:VARBasic} and with all other lagged values of \({\bf{y}}\); in effect, we assume that the model includes enough lags to explain the dynamic behavior of \({\bf{y}}\).

You can set up a standard VAR using the Time Series—VAR (Setup/Estimate) wizard, or by using the following instructions directly:

system(model=modelname)

variables list of endogenous variables

lags list of lags

deterministic list of deterministic/additional variables in regression format

end(system)

The lags listed on LAGS are usually consecutive, for instance, 1 TO 12, but you can skip lags (for instance, 1 2 3 6 12). The list of deterministic variables is usually just CONSTANT and possibly seasonal or other dummies, but they can be any variables other than the lagged endogenous variables.

system(model=canusa)

variables usam1 usatbill canm1 cantbill canusxr

lags 1 to 13

det constant

end(system)

defines a five-equation, 13–lag VAR model. Note that these instructions simply define the VAR system. The model then needs to be estimated, which is usually done using the ESTIMATE instruction.
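
As a rough check on the size of such a model, the coefficient count can be tallied outside RATS. The sketch below is plain Python rather than RATS code, and the function name var_coefficient_counts is hypothetical. It assumes every equation has the same regressors: each listed lag of each endogenous variable plus the deterministic terms.

```python
def var_coefficient_counts(n_vars, lags, n_deterministic):
    """Count coefficients in a VAR whose equations all share the same regressors."""
    n_lags = len(list(lags))                           # lags may be non-consecutive
    per_equation = n_vars * n_lags + n_deterministic   # regressors in one equation
    lag_total = n_vars * n_vars * n_lags               # N^2 coefficients per lag
    return {
        "per_equation": per_equation,
        "lag_total": lag_total,
        "total": n_vars * per_equation,
    }


# The CANUSA system: 5 variables, lags 1 to 13, CONSTANT only
canusa = var_coefficient_counts(n_vars=5, lags=range(1, 14), n_deterministic=1)
print(canusa)

# The same variables with skipped lags 1 2 3 6 12
skipped = var_coefficient_counts(n_vars=5, lags=[1, 2, 3, 6, 12], n_deterministic=1)
print(skipped)
```

For the CANUSA system this gives 66 regressors per equation and 325 lag coefficients in total, which is why the sample needs to be reasonably long before a 13-lag model can be estimated by least squares.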

Preliminary Transformations

You should choose the transformation for each series (log, level or other) that you would pick if you were looking at the series individually. Thus, you would transform exponentially growing series, such as the price level, money stock, GNP, etc., to logs. You will usually leave non-trending series, such as interest or unemployment rates, in levels. This is especially important for interest rates in a VAR including prices or exchange rates (which should be in logs) because real interest rate and parity conditions can then be expressed as linear relationships among the variables.

While preserving interesting linear relationships is desirable, you should not shy away from obvious transformation choices to achieve them. For instance, in Doan, Litterman and Sims (1984), two of the variables in the system were government receipts and expenditures. These are obvious candidates to be run in logs, which makes the deficit a non-linear function of the transformed variables. If we had used levels instead of logs, it would have made studying the budget deficit easier as we would not have needed to linearize. However, the predictions would have been unrealistic since the growth relationship between receipts and expenditures would have been distorted.

Very commonly, the transformation to logs includes a multiplier of 100:

set cons = 100*log(gc82)

set inc = 100*log(gyd82)

This makes it easier to interpret impulse response functions, as you can read results directly as (approximate) percentages (that is, a 7 is roughly 7%) rather than the .07 that would come out of a simple log transformation.
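
The percentage reading is an approximation that is very good for small changes and slightly understates larger ones. The following sketch is plain Python, not RATS, and the level of 2500 is a made-up value used only for illustration. It compares a change in the 100*log scale with the exact percentage change.

```python
import math


def scaled_log(x):
    """The 100*log(x) transformation used for the series above."""
    return 100.0 * math.log(x)


base = 2500.0          # hypothetical level of a series
after = base * 1.07    # the same series exactly 7% higher

# Change measured on the 100*log scale
change = scaled_log(after) - scaled_log(base)

# Converting that change back to an exact percentage
exact_pct = 100.0 * (math.exp(change / 100.0) - 1.0)

print(round(change, 3), round(exact_pct, 3))
```

An exact 7% increase shows up as a change of about 6.77 on the 100*log scale, so the two readings are close but not identical.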

Should I Difference?

Our advice is no, in general. In Box–Jenkins modeling for single series, appropriate differencing is important for several reasons:

• It is impossible to identify the stationary structure of the process using the sample autocorrelations of an integrated series.

• Most algorithms used for fitting ARIMA models will fail when confronted with integrated data.

Neither of these applies to VARs. In fact, the result in Fuller (1976, Theorem 8.5.1) shows that differencing produces no gain in asymptotic efficiency in an autoregression, even if it is appropriate. In a VAR, differencing throws information away (for instance, a simple VAR on differences cannot capture a co-integrating relationship), while it produces almost no gain.
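
To see what differencing discards, the levels VAR in \eqref{eq:VARBasic} can be rearranged algebraically into an equation for the differences. The symbols \(\Pi\) and \(\Gamma _s\) below are introduced here only for illustration:

\begin{equation} \Delta {\bf{y}}_t = {\bf{X}}_t \beta + \Pi {\bf{y}}_{t - 1} + \sum\limits_{s = 1}^{p - 1} {\Gamma _s \Delta {\bf{y}}_{t - s} } + {\bf{u}}_t \,\,\,\,\,\,\,\,\,\,\,\Pi = \sum\limits_{s = 1}^p {\Phi _s } - {\bf{I}},\,\,\,\,\,\Gamma _s = - \sum\limits_{j = s + 1}^p {\Phi _j } \end{equation}

A VAR on differences with \(p-1\) lags is this equation with \(\Pi = 0\) imposed. Under co-integration, \(\Pi\) is non-zero (with reduced rank), so the term \(\Pi {\bf{y}}_{t - 1}\) is exactly the part that the differenced model cannot represent, while the levels VAR leaves it unrestricted.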

Trend or No Trend?

In most economic time series, the best representation of a trend is a random walk with drift (Nelson and Plosser, 1982). Because of this, we would recommend against including a deterministic trend term in your VAR. In the regression

\begin{equation} {\bf{y}}_t = \alpha + \gamma {\kern 1pt} t + \beta _1 {\bf{y}}_{t - 1} + \ldots + \beta _p {\bf{y}}_{t - p} + {\bf{u}}_t \end{equation}

if we expect to see a unit root in the autoregressive part, \(\gamma\) becomes a coefficient on a quadratic trend, while \(\alpha\) picks up the linear trend. As you add variables or lags to the model, there is a tendency for OLS estimates of the VAR coefficients to “explain” too much of the long-term movement of the data with a combination of the deterministic components and initial conditions (Sims, 2000). While this may seem to “improve” the fit in-sample, the resulting model tends to show implausible out-of-sample behavior.
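
The first claim is easiest to see in a simplified scalar case, assumed here purely for illustration, with a single lag whose coefficient is exactly one:

\begin{equation} y_t = \alpha + \gamma {\kern 1pt} t + y_{t - 1} + u_t \,\,\,\, \Rightarrow \,\,\,\, y_t = y_0 + \alpha {\kern 1pt} t + \gamma \frac{{t\left( {t + 1} \right)}}{2} + \sum\limits_{s = 1}^t {u_s } \end{equation}

Summing the equation from period 1 to \(t\) accumulates the constant into a linear trend and the linear trend into a quadratic one. The starting value \(y_0\) enters as a permanent level shift, which is one way the initial conditions can end up acting as a deterministic component.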

How Many Lags?

If you want to, you can rely on information criteria, such as the Akaike Information Criterion (AIC) or the Schwarz or Bayesian Information Criterion (variously BIC, SBC, SIC), to select a lag length (see the VARLAG.RPF example and the @VARLagSelect procedure). However, these tend to be too conservative for most purposes; that is, they tend to select too few lags. Where we are including an identical number of lags on all variables in all equations, the number of parameters goes up very quickly: we add \(N^2\) parameters with each new lag. Beyond the first lag or two, most of these new additions are likely to be unimportant, causing the information criteria to reject the longer lags in favor of shorter ones. The use of priors is an alternative to relying on short lags or data-driven selection methods.
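
The size of this effect can be gauged from the penalty terms alone. The sketch below is plain Python, not RATS, and it assumes one common textbook form of the criteria, \(\log \left| \Sigma \right| + 2k/T\) for AIC and \(\log \left| \Sigma \right| + k\log (T)/T\) for BIC, where \(k\) is the number of coefficients and \(T\) the number of observations. That form is not necessarily what @VARLagSelect computes, and the function name and the sample size of 200 are hypothetical.

```python
import math


def penalty_per_lag(n_vars, n_obs):
    """Increase in the AIC and BIC penalty terms from adding one more lag.

    Assumes the textbook forms log|Sigma| + 2k/T and log|Sigma| + k*log(T)/T.
    """
    added = n_vars ** 2                      # N^2 new coefficients per lag
    aic = 2.0 * added / n_obs
    bic = math.log(n_obs) * added / n_obs
    return aic, bic


# Five variables and a hypothetical sample of 200 observations
aic_step, bic_step = penalty_per_lag(n_vars=5, n_obs=200)
print(round(aic_step, 4), round(bic_step, 4))
```

Under these assumptions, an extra lag is accepted only if it lowers \(\log \left| \Sigma \right|\) by more than 0.25 under AIC, or by roughly 0.66 under BIC. A lag in which only a few of the 25 new coefficients matter will often fail that test.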

If the data are adequate, we recommend including at least a year’s worth of lags. In fact, a year’s worth plus one extra lag can deal with seasonal effects, which can be present even in seasonally adjusted data (adjustment procedures sometimes “overadjust”).

Estimation

Once you have defined a VAR model, you need to estimate it. If you use the Time Series—VAR (Setup/Estimate) wizard, this is done automatically. Otherwise, you will normally use the ESTIMATE instruction, although you can also use LINREG or SUR for this. All three methods are described below.

Using ESTIMATE

Once you have set up a system, you can use ESTIMATE. This estimates all equations in the system independently using least squares. Usually, the syntax is quite simple. The instruction

ESTIMATE start end

• estimates each of the equations over the range start to end.

• prints the output for each, following each with a table of F-tests for exclusion of lags of each of the variables.

• computes and saves the covariance matrix of residuals (required for variance decompositions) in the array %SIGMA.

You can omit start and end if you want to use the maximum range. ESTIMATE is a quick, efficient instruction which makes the fullest use of the structure of the VAR. However, because it does an equation-by-equation estimate of the full system, you cannot do any hypothesis tests using the regression-based instructions like EXCLUDE and RESTRICT. This is not a serious problem, because there are very few interesting hypotheses which can be expressed as restrictions on an individual equation.
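
In matrix terms, the calculation can be sketched as follows. The notation is introduced here for illustration and is not part of the RATS output. \({\bf{Y}}\) stacks the \(T\) observations of \({\bf{y}}'_t\) by rows, and \({\bf{Z}}\) stacks the full set of regressors (the deterministic variables and all the lags), which is the same in every equation:

\begin{equation} {\bf{\hat B}} = \left( {{\bf{Z'Z}}} \right)^{ - 1} {\bf{Z'Y}} \,\,\,\,\,\,\,\,\,\,\,\hat \Sigma = \frac{1}{T}\left( {{\bf{Y}} - {\bf{Z\hat B}}} \right)^\prime \left( {{\bf{Y}} - {\bf{Z\hat B}}} \right) \end{equation}

Each column of \({\bf{\hat B}}\) holds the least squares coefficients for one equation, so a single \(\left( {{\bf{Z'Z}}} \right)^{ - 1}\) serves all \(N\) equations. The divisor \(T\) in \(\hat \Sigma\) is an assumption made for this sketch; a degrees-of-freedom correction would use a smaller divisor.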

Using LINREG

You can use a set of LINREG instructions if you really want to be able to focus on an individual equation. If you want to compute or print the variance-covariance matrix of the system, you will need to save the residuals and use the VCV instruction to compute and print it.

Using SUR

SUR is more often used for a “near-VAR” (where the equations don’t all have the same explanatory variables). However, even in a full VAR, it can be used if you need a full-system covariance matrix for the coefficient estimates and the standard \(\Sigma \otimes \left( {{\bf{X'X}}} \right)^{ - 1} \) isn't correct. For instance, the following setup (used in implementing Jordà (2005) local projections) skips lags 1 to h-1 in the VAR, which will create serially correlated residuals. SUR is used to estimate the model with a “Newey-West” correction to the covariance matrix.

system(model=linvar)

variables ygap infl rate

lags h to h+nlags-1

det constant

end(system)

sur(model=linvar,robust,lwindow=newey,lags=h-1,noprint)
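
Written out, the system above corresponds to the regression below, sketched with notation introduced only for illustration (\(n\) stands for nlags, and the superscript marks the horizon):

\begin{equation} {\bf{y}}_t = {\bf{c}}^{(h)} + \sum\limits_{s = h}^{h + n - 1} {\Phi _s^{(h)} {\bf{y}}_{t - s} } + {\bf{u}}_t^{(h)} \end{equation}

Because the most recent information used is dated \(t-h\), the residual \({\bf{u}}_t^{(h)}\) is in effect an \(h\)-step forecast error. It absorbs the shocks dated \(t-h+1\) through \(t\), so residuals up to \(h-1\) periods apart overlap and are correlated. That is the reason for the LAGS=h-1 window in the SUR instruction.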