RATS 11

We concern ourselves with estimating \(\beta\) in the model

\begin{equation} y_t = X_t \beta + u_t \label{eq:linreg_basereg} \end{equation}

While least squares gives consistent and asymptotically Normal estimates of \(\beta\) under a fairly broad range of conditions, it is also well known to be sensitive to outliers or a fat-tailed \(u\) distribution; see, for instance, the discussion in Greene (2012), Chapter 7. One response is simply to drop “outliers” from the data set. For instance, the following is from example file ROBUST.RPF. It estimates the regression by least squares, computes the standardized (more precisely, the internally studentized) residuals, and reruns the regression dropping from the sample any data point whose standardized residual exceeds 2.5 in absolute value.
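
For reference, the standardization used in the code below is the usual internally studentized residual (the notation here is ours, not from ROBUST.RPF):

\begin{equation} \tilde u_t = \frac{\hat u_t}{\sqrt{s^2 \left( 1 - h_t \right)}}, \qquad h_t = X_t \left( X'X \right)^{-1} X_t' \end{equation}

where \(s^2\) is the least squares residual variance (%SEESQ) and the leverage \(h_t\) is what PRJ(XVX=px) computes.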
 

linreg logy / resids
# constant logk logl
prj(xvx=px)
set stdresids = resids/sqrt(%seesq*(1-px))
*
linreg(smpl=abs(stdresids)<=2.5) logy
# constant logk logl

 

However, if those observations can reasonably be assumed to be valid within the model \eqref{eq:linreg_basereg}, and not just recording errors or data points where the model simply fails, it might make more sense to choose an alternative estimator which won’t be as sensitive as least squares to tail values.

 

One such estimator is LAD (Least Absolute Deviations), which is

\begin{equation} \hat \beta = \mathop {{\rm{argmin}}}\limits_\beta \sum\limits_t \left| y_t - X_t \beta \right| \end{equation}

This is provided by the RATS instruction RREG (Robust Regression). LAD is consistent and asymptotically Normal under broader conditions than are required for least squares, but it is less efficient when the \(u\)'s are better behaved, with asymptotic efficiency of about 64% (\(2/\pi\)) relative to least squares for Normal \(u\)'s. Its main drawback relative to least squares is that it is much more difficult to compute: the minimand isn't differentiable, and a specialized variant of linear programming is required to compute the estimator. There is also some question about the best way to estimate the covariance matrix; for details, see the description of RREG. RREG (for doing LAD) has a similar form to LINREG. In ROBUST.RPF, the command is simply:

 

rreg logy
# constant logk logl
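
For intuition behind the efficiency figure above (a standard textbook calculation, not part of ROBUST.RPF): under suitable regularity conditions,

\begin{equation} \sqrt T \left( \hat \beta _{LAD} - \beta \right) \mathop {\to}\limits^{d} N\left( 0, \frac{1}{4 f_u (0)^2} Q^{-1} \right), \qquad Q = \mathop {\rm plim} \frac{1}{T}\sum\limits_t X_t' X_t \end{equation}

where \(f_u\) is the density of \(u\). For Normal \(u\) with variance \(\sigma^2\), \(f_u(0) = 1/(\sigma\sqrt{2\pi})\), so the asymptotic covariance is \((\pi/2)\sigma^2 Q^{-1}\), compared with \(\sigma^2 Q^{-1}\) for least squares, which gives the \(2/\pi \approx 0.64\) relative efficiency.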

 

Alternative robust estimators can be constructed which behave like LAD in the tails but like least squares near zero. For instance, the following objective function:

\begin{equation} \sum\limits_t \frac{u_t^2}{\left( c^2 + u_t^2 \right)^{1/2}} \end{equation}

behaves like the absolute value in the tails: when \(\left| u_t \right|\) is much larger than the constant \(c\), the summand is approximately \(\left| u_t \right|\), while near zero, where \(c\) dominates the denominator, it is approximately \(u_t^2 / c\), a rescaled square. Because the minimand remains differentiable, the estimator can be computed more simply: either by iterated weighted least squares or by a standard nonlinear estimation routine.
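
As a sketch of the nonlinear route (an illustration, not the ROBUST.RPF code: the parameter names and the choice of \(c\) are assumptions), least squares can supply starting values, and since MAXIMIZE maximizes a sum over observations, minimizing the objective amounts to maximizing its negative:

* A minimal sketch: parameter names and c are illustrative choices
linreg logy / resids
# constant logk logl
compute b0=%beta(1), bk=%beta(2), bl=%beta(3)
compute c=sqrt(%seesq)
nonlin b0 bk bl
frml eps  = logy-b0-bk*logk-bl*logl
* negate so that maximizing this sum minimizes the robust objective
frml robj = -eps(t)^2/sqrt(c^2+eps(t)^2)
maximize robj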

 

Because iterated weighted least squares (IWLS) takes the previous iteration's residuals as given when computing the SPREAD series for the next iteration, the weighted least squares covariance matrix is an inconsistent estimate of the covariance matrix of the estimates: it does not take into account the dependence of the SPREAD series on \(\beta\). The procedure described in “Recomputing a Covariance Matrix” needs to be used instead. The full procedure for doing IWLS, including the recomputed covariance matrix, is shown in the ROBUST.RPF example.
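
As a minimal sketch of the IWLS loop itself (the choice of \(c\) and the fixed iteration count are illustrative assumptions, and the covariance correction is omitted): weighted least squares with SPREAD equal to \(u/\rho '(u)\), evaluated at the previous residuals, reproduces the first order conditions \(\sum_t \rho '(u_t) X_t = 0\) of the robust objective at convergence. For \(\rho (u) = u^2 /(c^2 + u^2 )^{1/2}\), that gives \(u/\rho '(u) = (c^2 + u^2 )^{3/2} /(2c^2 + u^2 )\).

linreg logy / resids
# constant logk logl
* c is an illustrative tuning constant tied to the least squares scale
compute c=sqrt(%seesq)
do iter=1,50
   * SPREAD = u/rho'(u), computed from the previous residuals
   set spreads = (c^2+resids^2)^1.5/(2*c^2+resids^2)
   linreg(noprint,spread=spreads) logy / resids
   # constant logk logl
end do iter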

 


Copyright © 2025 Thomas A. Doan