VAR: Hypothesis Testing

There are relatively few interesting hypotheses that you can test using just the estimates of a single equation from a VAR. Even the block F-tests produced by ESTIMATE, which indicate whether variable z helps to forecast variable x one step ahead, are not, individually, especially important: z can, after all, still affect x through the other equations in the system.

Thus, most hypotheses of interest will involve more than one equation. The testing procedure to use is the likelihood ratio test. The test statistic we recommend is

\begin{equation} \left( {T - c} \right)\left( {\log \left| {\Sigma _r } \right| - \log \left| {\Sigma _u } \right|} \right) \label{eq:vartestingformula} \end{equation}

where \(\Sigma _r\) and \(\Sigma _u\) are the restricted and unrestricted covariance matrices and \(T\) is the number of observations. Under certain conditions, this is asymptotically distributed as a \(\chi ^2\) with degrees of freedom equal to the number of restrictions. \(c\) is a correction to improve small sample properties: Sims (1980, p.17) suggests using a correction equal to the number of variables in each unrestricted equation in the system. This is a slight (asymptotically negligible) rescaling of the standard likelihood ratio test statistic by \((T-c)/T\) to be more conservative given that the VAR might be using a high percentage of the degrees of freedom. To help make this correction, ESTIMATE sets the variable %NREG equal to the number of regressors per equation, and %NREGSYSTEM to the number in the whole VAR. You would take as \(c\) the value of %NREG from the biggest VAR that you estimate.
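As an illustration of the arithmetic only, the statistic in \eqref{eq:vartestingformula} can be sketched in Python. This is not RATS code: the function names (log_det_spd, lr_from_logdets, lr_statistic) and the example matrices are hypothetical, and the sketch assumes both covariance matrices are symmetric positive definite and were estimated over the same sample.

```python
import math


def log_det_spd(matrix):
    """Log determinant of a symmetric positive definite matrix via Cholesky."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    log_det = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(s)
                # det(A) is the squared product of the Cholesky diagonal
                log_det += 2.0 * math.log(lower[i][j])
            else:
                lower[i][j] = s / lower[j][j]
    return log_det


def lr_from_logdets(logdet_r, logdet_u, nobs, c):
    """(T - c) * (log|Sigma_r| - log|Sigma_u|) from saved log determinants."""
    return (nobs - c) * (logdet_r - logdet_u)


def lr_statistic(sigma_r, sigma_u, nobs, c):
    """Same statistic, starting from the two covariance matrices."""
    return lr_from_logdets(log_det_spd(sigma_r), log_det_spd(sigma_u), nobs, c)


# Made-up covariance matrices, purely for illustration
sigma_u = [[1.0, 0.5], [0.5, 2.0]]
sigma_r = [[1.2, 0.5], [0.5, 2.0]]
print(lr_statistic(sigma_r, sigma_u, nobs=100, c=9))
```

The result would then be compared with a \(\chi ^2\) distribution whose degrees of freedom equal the number of restrictions. Setting c to zero gives the uncorrected statistic, which is never smaller in absolute value than the corrected one.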

Note, by the way, that some hypotheses might have a non-standard distribution in the presence of unit roots (Sims, Stock and Watson, 1990). Their result won’t affect the lag length test (VARLAG.RPF example), but could affect the exogeneity test (VARCAUSE.RPF).

You can compute \eqref{eq:vartestingformula} yourself using the variable %LOGDET defined by ESTIMATE, or you can use the special instruction RATIO. To use RATIO, you need to save the residual series from both the restricted and the unrestricted estimation. To compute the statistic directly, you need to save the %LOGDET values into variables with different names after each of the ESTIMATE instructions. The examples use both techniques, but in practice, you only need to do one or the other.

VARLAG.RPF formally tests one overall lag length against another. This has to be done carefully: the two VARs need to be estimated over the same range, while the one with the shorter lag length would naturally use the extra observations available at the start of the sample. This is an alternative to using the @VARLAGSELECT procedure to help pick the lag length. (The example also shows that.)
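The bookkeeping for a lag length test can be sketched in Python as follows. This is an illustration, not the contents of VARLAG.RPF: the function name lag_test_setup and its arguments are hypothetical, and it assumes every equation contains the same lags of all variables plus n_deterministic deterministic terms (a constant by default), with the data starting at entry first_obs.

```python
def lag_test_setup(n_vars, lags_short, lags_long, n_deterministic=1, first_obs=1):
    """Common sample start, restriction count, and correction c for a lag test."""
    if not 0 <= lags_short < lags_long:
        raise ValueError("need 0 <= lags_short < lags_long")
    return {
        # Both VARs must start where the longer one can first be estimated
        "start": first_obs + lags_long,
        # Each of the n equations drops n coefficients per excluded lag
        "restrictions": n_vars * n_vars * (lags_long - lags_short),
        # Regressors per equation in the larger (unrestricted) VAR
        "c": n_vars * lags_long + n_deterministic,
    }


# A three-variable VAR, testing 4 lags against 8
print(lag_test_setup(n_vars=3, lags_short=4, lags_long=8))
```

The "restrictions" entry is the degrees of freedom for the \(\chi ^2\) comparison, and "c" plays the role of the regressor count per equation from the larger VAR.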

VARCAUSE.RPF does a block exogeneity test. This has as its null hypothesis that the lags of one set of variables do not enter the equations for the remaining variables. This is the multivariate generalization of Granger-Sims causality tests. Note that only block exclusions have the type of properties that one would want in a causality test. A test excluding just Z from the X equation in an (X,Y,Z) VAR doesn't really tell you about the overall dynamic relationship between Z and X, because Z could "cause" Y, which would then likely affect X at longer horizons. Instead, the useful hypotheses are Z being excluded from both the X and Y equations, or Y and Z being excluded from the X equation.
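The distinction between a block exclusion and a single-equation exclusion can be made concrete with a small Python sketch. The function name describe_exclusion is hypothetical, and the sketch assumes every equation carries the same number of lags of each variable. An exclusion counts as a block here only when the excluded variables and the equations they are removed from together make up the whole system, with no overlap.

```python
def describe_exclusion(variables, excluded, equations, lags):
    """Count restrictions and flag whether the exclusion is a block exclusion."""
    variables, excluded, equations = set(variables), set(excluded), set(equations)
    if not excluded or not equations:
        raise ValueError("excluded and equations must be non-empty")
    if not (excluded <= variables and equations <= variables):
        raise ValueError("excluded and equations must be drawn from variables")
    # Block exogeneity: lags of one set are dropped from ALL remaining equations
    is_block = excluded.isdisjoint(equations) and (excluded | equations) == variables
    return {
        "restrictions": len(excluded) * len(equations) * lags,
        "is_block": is_block,
    }


system = ["X", "Y", "Z"]
# Z excluded from both the X and Y equations: a block exclusion
print(describe_exclusion(system, ["Z"], ["X", "Y"], lags=4))
# Y and Z excluded from the X equation: also a block exclusion
print(describe_exclusion(system, ["Y", "Z"], ["X"], lags=4))
# Z excluded from the X equation only: not a block exclusion
print(describe_exclusion(system, ["Z"], ["X"], lags=4))
```

In the last case Z can still reach X indirectly through the Y equation, which is why it is flagged as not being a block exclusion.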