Long-Run Variance/Robust Covariance Calculations

A common need in modern statistics and econometrics is to calculate the covariance matrix in a limit result that can be written (informally; we make no attempt to state this as a theorem) as:

\begin{equation} \sum\limits_j {{{\bf{z}}_j}} { \approx _d}N\left( {0,{\mathop{\rm var}} \left( {\sum\limits_j {{{\bf{z}}_j}} } \right)} \right) \label{eq:mcov_general} \end{equation}

under various assumptions about the behavior of \({\bf{z}}\). This is used in linear regressions with \({\rm{z = Xu}}\) to correct the covariance matrix for serial correlation or heteroscedasticity, in GMM with \({\rm{z = Zu}}\) or \({\bf{z}} = {\bf{u}} \otimes {\bf{Z}}\) for weighting moment conditions, and in maximum likelihood with \({\rm{z}}\) equal to the partial derivatives to correct for misspecification. The instruction MCOV calculates this covariance matrix directly, while the same calculation is done internally as part of the robust error or weight matrix calculations by instructions such as LINREG or MAXIMIZE.

Eicker-White/Heteroscedasticity/Misspecification Consistent

This is the simplest case. The terms in \eqref{eq:mcov_general} are independent (or close to it), but not identically distributed. The covariance matrix of \(z\) is approximated as

\begin{equation} {\mathop{\rm var}} \left( {\sum\limits_j {{{\bf{z}}_j}} } \right) \approx \sum\limits_j {{{\bf{z}}_j}^\prime {{\bf{z}}_j}} \label{eq:mcov_eickerwhite} \end{equation}

This is the calculation done by MCOV when it is used with no other options, by the ROBUSTERRORS option (for correcting covariance matrices) when none of the other options described here is used, and by ZUDEP (for GMM weight matrices) when none of the other options is used.
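The sum of outer products above can be sketched in a few lines of Python. This is only an illustration of the formula, not the code used by MCOV; the function name `outer_sum` and the data are made up for the example, and each \({\bf{z}}_j\) is treated as a row vector, as in the equation.

```python
def outer_sum(rows):
    """Return sum_j z_j' z_j, where each z_j is a row vector (list of k floats)."""
    k = len(rows[0])
    total = [[0.0] * k for _ in range(k)]
    for z in rows:
        for a in range(k):
            for b in range(k):
                total[a][b] += z[a] * z[b]
    return total

# Hypothetical data: regressors x_j (constant plus one variable) and residuals u_j.
X = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0], [1.0, 0.0]]
u = [0.2, -0.4, 1.0, -0.8]

# z_j = x_j * u_j, the linear regression case z = Xu
Z = [[x * uj for x in xj] for xj, uj in zip(X, u)]
V = outer_sum(Z)
```

The result `V` is symmetric and positive semi-definite by construction, since it is a sum of outer products.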

Serial Correlation (LAGS and LWINDOW options)

To allow for general (moving average) correlation with \(L\) lags for terms in \(z\), we can use the general calculation

\begin{equation} \sum\limits_{l = - L}^L {\sum\limits_t {{\kern 1pt} {w_l}\left( {{{\bf{z}}_t}^\prime {{\bf{z}}_{t - l}}} \right)} } \label{eq:mcov_lwindow} \end{equation}

where \({{\kern 1pt} {w_l}}\) is a set of window weights. The number of lags \(L\) is set using the LAGS option on any of the instructions that allow for robust calculations. If the \(w\)’s are all one (LWINDOW=FLAT), the matrix in \eqref{eq:mcov_lwindow} may fail to be positive semi-definite, which can produce invalid standard errors. Various window types are provided using the LWINDOW option to avoid this. The formulas for the windows are most easily described by defining

\begin{equation} v = \frac{{\left| l \right|}}{{L + 1}} \end{equation}

Except for the quadratic window, all the window weights are zero when \(v > 1\). Note that the term "bandwidth" for these windows usually means \(L+1\), not \(L\) itself. BARTLETT and NEWEYWEST are identical (Bartlett is the historical name for the window in spectral analysis). For more information on lag windows, see Hamilton (1994), pages 281-284.

LWINDOW=FLAT	\(w(v) = 1\)
LWINDOW=NEWEYWEST, LWINDOW=BARTLETT	\(w(v) = 1 - v\)
LWINDOW=DAMPED	\(w(v) = {\left( {1 - v} \right)^\gamma }\), where \(\gamma\) is the value of the DAMP option
LWINDOW=PARZEN	\(w(v) = \left\{ {\begin{array}{*{20}{c}} {1 - 6{v^2} + 6{v^3}} & {0 \le v \le .5} \\ {2{{\left( {1 - v} \right)}^3}} & {.5 \le v \le 1} \\ \end{array}} \right.\)
LWINDOW=QUADRATIC	\(w(v) = \frac{3}{{{{\left( {6\pi v/5} \right)}^2}}}\left[ {\frac{{\sin \left( {6\pi v/5} \right)}}{{6\pi v/5}} - \cos \left( {6\pi v/5} \right)} \right]\)
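As a rough illustration of the window formulas and of the weighted sum in \eqref{eq:mcov_lwindow}, here is a minimal Python sketch. The function names (`window_weight`, `lag_window_cov`) are hypothetical and this is not the code used by RATS. Two choices in the sketch are assumptions: the quadratic window is set to its limiting value of 1 at \(v = 0\), and the upper branch of the Parzen window uses the cubic form \(2(1-v)^3\), which is the form that agrees with the lower branch at \(v = .5\).

```python
import math

def window_weight(name, l, L, damp=1.0):
    """Weight w(v) with v = |l| / (L + 1) for the named lag window."""
    v = abs(l) / (L + 1)
    if name == "quadratic":
        if v == 0.0:
            return 1.0  # limiting value as v -> 0
        x = 6.0 * math.pi * v / 5.0
        return 3.0 / x ** 2 * (math.sin(x) / x - math.cos(x))
    if v > 1.0:
        return 0.0  # all other windows vanish beyond v = 1
    if name == "flat":
        return 1.0
    if name in ("bartlett", "neweywest"):
        return 1.0 - v
    if name == "damped":
        return (1.0 - v) ** damp
    if name == "parzen":
        if v <= 0.5:
            return 1.0 - 6.0 * v ** 2 + 6.0 * v ** 3
        return 2.0 * (1.0 - v) ** 3
    raise ValueError("unknown window: " + name)

def lag_window_cov(rows, L, name="bartlett", damp=1.0):
    """Sum over l = -L..L and t of w_l * z_t' z_{t-l}; rows are the z_t."""
    T, k = len(rows), len(rows[0])
    total = [[0.0] * k for _ in range(k)]
    for l in range(-L, L + 1):
        w = window_weight(name, l, L, damp)
        for t in range(T):
            s = t - l
            if 0 <= s < T:  # skip terms that fall outside the sample
                for a in range(k):
                    for b in range(k):
                        total[a][b] += w * rows[t][a] * rows[s][b]
    return total

# A scalar series with strong negative first-order correlation (made-up data).
z = [[1.0], [-1.0], [1.0], [-1.0]]
flat = lag_window_cov(z, 1, "flat")[0][0]          # 4 + 2*(-3) = -2
bartlett = lag_window_cov(z, 1, "bartlett")[0][0]  # 4 + 2*0.5*(-3) = 1
```

With one lag, the flat window gives a negative "variance" for this series, while the Bartlett weights (1/2 on the first lag) keep it positive. This is the failure of positive semi-definiteness that the tapered windows are designed to avoid.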

LWINDOW=PANEL, CLUSTER options

When \(L\) is equal to the number of time series observations (thus allowing for arbitrary patterns of serial correlation), \eqref{eq:mcov_lwindow} with unit weights is the same as

\begin{equation} {\left( {\sum\limits_t {{{\bf{z}}_t}} } \right)^\prime }\left( {\sum\limits_t {{{\bf{z}}_t}} } \right) \end{equation}

This is positive semi-definite, because it is the outer product of a vector with itself. For a single time series, however, it is not very useful, because it has rank one. If you add these up across a large number of individuals or categories, though, you get a full rank, consistent estimator (big N, small T). See, for instance, Greene (2012), section 11.3.2. If this is used in computing a covariance matrix, you get clustered standard errors. LWINDOW=PANEL will do this calculation (clustering on individuals) for a panel data set. If you want to cluster on some other variable, use the option CLUSTER=series with clustering categories. This series should have a unique value for each category. It should have at least as many categories as you have variables (regressors) if you want a positive definite result, and many more than that if you want consistency.
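The clustered calculation can be sketched as follows: sum the \({\bf{z}}_t\) within each category, then add up the outer products of those sums across categories. This is a minimal Python illustration with a hypothetical function name and made-up data, not the code behind LWINDOW=PANEL or CLUSTER.

```python
def cluster_cov(rows, cluster_ids):
    """Sum over clusters g of (sum_{t in g} z_t)' (sum_{t in g} z_t)."""
    k = len(rows[0])
    sums = {}
    for z, g in zip(rows, cluster_ids):
        acc = sums.setdefault(g, [0.0] * k)  # running sum for cluster g
        for a in range(k):
            acc[a] += z[a]
    total = [[0.0] * k for _ in range(k)]
    for s in sums.values():
        for a in range(k):
            for b in range(k):
                total[a][b] += s[a] * s[b]
    return total

# Four observations in two clusters (hypothetical values).
rows = [[1.0, 2.0], [3.0, -1.0], [0.0, 1.0], [2.0, 2.0]]
ids = ["a", "a", "b", "b"]
V = cluster_cov(rows, ids)  # cluster sums are [4, 1] and [2, 3]
```

Each cluster contributes a rank-one matrix, so the result can have rank at most equal to the number of clusters. That is why at least as many categories as variables are needed for a positive definite result.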

ZUMEAN and CENTER options

Use the option ZUMEAN=VECTOR of assumed means when the model assumes that the moment conditions have non-zero mean. It replaces \(\bf{z}\) with \({\bf{z}} - {\mu _{\bf{z}}}\) in any of the previous calculations, where \({\mu _{\bf{z}}}\) is the VECTOR input by the ZUMEAN option. The CENTER option makes a similar adjustment, but subtracts the sample mean rather than a hypothesized mean. This is recommended by Hall (2000) for computing the weight matrix to be used in testing overidentifying restrictions.
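The two adjustments differ only in which mean is subtracted before the covariance calculation. A minimal Python sketch, with a hypothetical helper `center_rows` (passing `mean` mimics a ZUMEAN-style hypothesized mean, omitting it mimics a CENTER-style sample mean):

```python
def center_rows(rows, mean=None):
    """Return rows with `mean` subtracted; use the sample mean if mean is None."""
    k = len(rows[0])
    if mean is None:
        n = len(rows)
        mean = [sum(z[a] for z in rows) / n for a in range(k)]
    return [[z[a] - mean[a] for a in range(k)] for z in rows]

z = [[1.0, 2.0], [3.0, 4.0]]            # made-up moment terms
centered = center_rows(z)               # subtracts the sample mean [2, 3]
shifted = center_rows(z, [1.0, 1.0])    # subtracts an assumed mean [1, 1]
```

The adjusted rows would then be used in place of \({\bf{z}}\) in whichever of the earlier formulas applies.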

Multivariate Residuals

If you have more than one series of residuals, the calculation in \eqref{eq:mcov_lwindow} is done as

\begin{equation} \sum\limits_{l = - L}^L {\sum\limits_t {{\kern 1pt} {w_l}\left( {{{\left( {{{\bf{u}}_t} \otimes {{\bf{Z}}_t}} \right)}^\prime }\left( {{{\bf{u}}_{t - l}} \otimes {{\bf{Z}}_{t - l}}} \right)} \right)} } \end{equation}

with the appropriate changes for other options. This arrangement (blocking the matrix by equation) matches the one expected for a weight matrix by the instructions NLSYSTEM and SUR. Note that this matrix cannot be full rank if the size of \(\bf{Z}\) times the size of \(\bf{u}\) is greater than the number of data points, since its rank is at most the number of data points.
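To make the "blocked by equation" ordering concrete, here is a small Python sketch of \({\bf{u}}_t \otimes {\bf{Z}}_t\) for a single observation. The helper name `kron_row` and the numbers are made up for the illustration.

```python
def kron_row(u, z):
    """Kronecker product of two row vectors: (u1*z, u2*z, ...)."""
    return [ui * zj for ui in u for zj in z]

u_t = [2.0, 3.0]            # residuals from two equations at time t
Z_t = [1.0, 10.0, 100.0]    # three instruments at time t
row = kron_row(u_t, Z_t)    # [2, 20, 200, 3, 30, 300]
```

The first three elements belong to the first equation and the last three to the second, so the resulting covariance matrix is made up of blocks indexed by pairs of equations. Here each observation contributes a vector of length 6, which is why at least 6 data points are needed before the sum of outer products can have full rank.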

The NOZUDEP option can be used only with LAGS=0 and NOCENTER. It computes the special case of

\begin{equation} \Sigma \otimes {\bf{Z'Z}} \end{equation}
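The Kronecker structure of this special case can be illustrated with a short Python sketch. Here \(\Sigma\) is simply taken as a given matrix (the sketch makes no claim about how it is estimated), and the helper names and data are hypothetical.

```python
def cross_product(Z):
    """Return Z'Z for a T x k matrix Z stored as a list of rows."""
    k = len(Z[0])
    return [[sum(z[a] * z[b] for z in Z) for b in range(k)] for a in range(k)]

def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

Z = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # three observations, two instruments
sigma = [[2.0, 0.5], [0.5, 1.0]]           # assumed 2 x 2 matrix for two equations
W = kron(sigma, cross_product(Z))          # 4 x 4, blocked by equation
```

Block \((i, j)\) of `W` is \(\Sigma_{ij}\) times \({\bf{Z'Z}}\), which matches the equation-by-equation blocking used for the general multivariate calculation.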