Systems Estimation (Non-linear)
NLSYSTEM vs. SUR
The instruction NLSYSTEM estimates a system of equations by non-linear least squares or (for instrumental variables) by the generalized method of moments. The analogous instruction for linear systems is SUR. NLSYSTEM uses formulas (FRMLs) rather than the equations used by SUR. For models that can be estimated with either instruction, NLSYSTEM is slower than SUR for two reasons:
•SUR does not have to compute derivatives with each iteration (the derivatives of the linear functions are just the explanatory variables).
•SUR uses every possible means to eliminate duplicate calculations. The cross-products of regressors, dependent variables, and instruments can be computed just once.
For a linear model without complicated restrictions, use SUR. For a linear model with restrictions, NLSYSTEM may be simpler to set up and the speed difference is likely to matter only with a large model or very large data set.
NLSYSTEM vs. NLLS
Like NLLS, NLSYSTEM can do either instrumental variables or (multivariate) least squares. In most ways, NLSYSTEM is just a multivariate extension of NLLS. There are three important differences, though:
•When you use NLLS, you specify the dependent variable on the NLLS instruction itself. With NLSYSTEM, you must include the dependent variable when you define the FRML.
•NLSYSTEM with INST will automatically compute the optimal weighting scheme described in the last section.
•Primarily because of the preceding point, you will rarely use ROBUSTERRORS with NLSYSTEM. ROBUSTERRORS, in fact, is a request to compute a sub-optimal estimator and correct its covariance matrix.
Technical Details
Suppose
\begin{equation} {\bf{u}}_t = \left( {u_{1t} , \ldots ,u_{nt} } \right)^\prime \end{equation}
is the vector of residuals at time \(t\) (\(u\) depends upon \(\beta\)), and
\begin{equation} \Sigma = E{\kern 1pt} {\bf{u}}_t {\bf{u'}}_t \end{equation}
Multivariate non-linear least squares solves
\begin{equation} \mathop {\min }\limits_\beta \sum\limits_t {{\bf{u'}}_t {\kern 1pt} {\kern 1pt} \Sigma ^{ - 1} {\kern 1pt} {\bf{u}}_t } \label{eq:systemnonlin_mindist} \end{equation}
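The criterion above can be evaluated directly. The following is a minimal numpy sketch (not RATS code) using a hypothetical two-equation model; all names and data are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 200, 2                      # observations, equations

# hypothetical data for a two-equation non-linear model
x = rng.normal(size=T)
beta_true = np.array([1.0, 0.5])
y1 = beta_true[0] * x + rng.normal(scale=0.3, size=T)
y2 = np.exp(beta_true[1] * x) + rng.normal(scale=0.3, size=T)

def residuals(beta):
    """Stack u_t = (u_1t, u_2t)' for all t as a T x n array."""
    return np.column_stack([y1 - beta[0] * x,
                            y2 - np.exp(beta[1] * x)])

def objective(beta, sigma_inv):
    """sum_t u_t' Sigma^{-1} u_t, the multivariate NLS criterion."""
    u = residuals(beta)
    return np.einsum('ti,ij,tj->', u, sigma_inv, u)

u0 = residuals(beta_true)
sigma = u0.T @ u0 / T              # estimate of Sigma = E u_t u_t'
val = objective(beta_true, np.linalg.inv(sigma))
```

Note that when \(\Sigma\) is estimated from the same residuals, the criterion at those residuals reduces to \(T \times n\), which is one reason the function value itself is not very informative when the weight matrix is recomputed.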
For instrumental variables, further define
\begin{equation} {\bf{Z}}_t = \left( {z_{1t} ,{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} \ldots {\kern 1pt} {\kern 1pt} {\kern 1pt} ,z_{rt} } \right)^\prime \end{equation}
as the vector of instruments at \(t\), and
\begin{equation} {\bf{G}}\left( \beta \right) = \sum\limits_t {{\bf{u}}_t \otimes {\bf{Z}}_t } \end{equation}
then Generalized Method of Moments solves
\begin{equation} \mathop {\min }\limits_\beta \;{\bf{G}}\left( \beta \right)^\prime \left[ {{\bf{SW}}} \right]\;{\bf{G}}\left( \beta \right) \label{eq:systemnonlin_gmmminimand} \end{equation}
where \({\bf{SW}}\) is the system weighting matrix for the orthogonality conditions. By default, it is just \(\Sigma ^{ - 1} \otimes \left( {{\bf{Z'Z}}} \right)^{ - 1} \), where \(\bf{Z}\) is the \(T \times r\) matrix of instruments for the entire sample.
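The pieces of the GMM criterion can be sketched numerically. The block below is a hedged numpy illustration (not RATS code): the residuals and instruments are simulated stand-ins, and the default weight \(\Sigma^{-1} \otimes ({\bf{Z'Z}})^{-1}\) is built with a Kronecker product:

```python
import numpy as np

rng = np.random.default_rng(1)
T, n, r = 150, 2, 3                # sample size, equations, instruments

# simulated stand-ins: residuals u_t and instruments z_t in rows
U = rng.normal(size=(T, n))        # in practice u_t depends on beta
Z = rng.normal(size=(T, r))

# G(beta) = sum_t u_t (kron) z_t, stacked as an (n*r)-vector
G = np.einsum('ti,tj->ij', U, Z).reshape(-1)   # entry i*r+j = sum_t u_it z_jt

# default weight matrix: SW = Sigma^{-1} (kron) (Z'Z)^{-1}
Sigma = U.T @ U / T
SW = np.kron(np.linalg.inv(Sigma), np.linalg.inv(Z.T @ Z))

gmm_criterion = G @ SW @ G         # G' [SW] G
```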
In either case, the estimation process depends upon certain “nuisance parameters”: the estimates will change with \(\Sigma\) and \({\bf{SW}}\). Unless you use options to feed in values for these, NLSYSTEM recomputes them after each iteration. The objective function is thus changing from one iteration to the next. As a result, the function values printed when you use the TRACE option will often seem to move in the wrong direction, or at least not to change. Within any single iteration, however, the new value always improves on the old one.
The continuous recalculation of weight matrices for instrumental variables can sometimes create problems if you use the ZUDEP option. This allows for general dependence between the instruments (“Z”) and the residuals (“u”). You may not be able to freely estimate the covariance matrix of the \(n \times r\) moment conditions with the available data. Even if \(n \times r\) is less than the number of data points (so the matrix is at least invertible), the weight matrices may change so much from iteration to iteration that the estimates never settle down. If this happens, you will need to switch to some type of “sub-optimal” weight matrix, such as the one obtained with NOZUDEP, and use the ROBUSTERRORS option to correct the covariance matrix.
The first order necessary conditions for minimizing \eqref{eq:systemnonlin_mindist} are
\begin{equation} \sum\limits_t {\frac{{\partial {\bf{u'}}_t }}{{\partial \beta }}{\kern 1pt} {\kern 1pt} \Sigma ^{ - 1} {\kern 1pt} {\bf{u}}_t } = 0 \label{eq:systemnonlin_mindistfonc} \end{equation}
The sum in this is the gradient \({\bf{g}}\). Ignoring the second derivatives of \(u\), a first order expansion of the gradient at \(\beta _k \) is
\begin{equation} \sum\limits_t {\frac{{\partial {\bf{u'}}_t }}{{\partial \beta }}{\kern 1pt} {\kern 1pt} \Sigma ^{ - 1} {\kern 1pt} {\bf{u}}_t } + \left( {\sum\limits_t {\frac{{\partial {\bf{u'}}_t }}{{\partial \beta }}{\kern 1pt} {\kern 1pt} \Sigma ^{ - 1} {\kern 1pt} \frac{{\partial {\bf{u}}_t }}{{\partial \beta }}} } \right){\kern 1pt} {\kern 1pt} \left( {\beta - \beta _k } \right) \end{equation}
Setting this to zero and solving for \(\beta\) puts the problem into the general “hill-climbing” framework (if minimization is converted to maximization) with
\begin{equation} {\bf{G}} = \left( {\sum\limits_t {\frac{{\partial {\bf{u'}}_t }}{{\partial \beta }}\;\Sigma ^{ - 1} {\kern 1pt} \frac{{\partial {\bf{u}}_t }}{{\partial \beta }}} } \right)^{ - 1} \end{equation}
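The update implied by setting the expansion to zero can be sketched as a Gauss-Newton loop, recomputing \(\Sigma\) on each iteration as NLSYSTEM does by default. This is a minimal numpy sketch with hypothetical data (not RATS code):

```python
import numpy as np

rng = np.random.default_rng(2)
T = 300
x = rng.normal(size=T)
beta_true = np.array([1.0, 0.5])
# two equations: y1 = a*x + e1, y2 = exp(b*x) + e2 (illustrative model)
Y = np.column_stack([beta_true[0] * x, np.exp(beta_true[1] * x)]) \
    + rng.normal(scale=0.2, size=(T, 2))

def residuals(beta):
    return Y - np.column_stack([beta[0] * x, np.exp(beta[1] * x)])

def jacobian(beta):
    """J[t] = du_t/dbeta', a T x n x k array (here n = k = 2)."""
    J = np.zeros((T, 2, 2))
    J[:, 0, 0] = -x                          # du_1t/da
    J[:, 1, 1] = -x * np.exp(beta[1] * x)    # du_2t/db
    return J

beta = np.array([0.2, 0.2])                  # starting values
for _ in range(20):
    U = residuals(beta)
    Sigma_inv = np.linalg.inv(U.T @ U / T)   # recompute Sigma each iteration
    J = jacobian(beta)
    g = np.einsum('tik,ij,tj->k', J, Sigma_inv, U)     # gradient summands
    H = np.einsum('tik,ij,tjl->kl', J, Sigma_inv, J)   # Gauss-Newton "Hessian"
    beta = beta - np.linalg.solve(H, g)      # solve the linearized FONC

cov_estimate = np.linalg.inv(H)              # the G of the text
```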
\(\bf{G}\) ends up being the estimate of the covariance matrix of the estimates. If you use ROBUSTERRORS, the recomputed covariance matrix is
\begin{equation} {\bf{G}}\,\,{\rm{mcov(}}{\bf{v}},1)\,\,{\bf{G}} \end{equation}
where
\begin{equation} {\bf{v}}_t = \frac{{\partial {\bf{u'}}_t }}{{\partial \beta }}{\kern 1pt} {\kern 1pt} \Sigma ^{ - 1} {\kern 1pt} {\bf{u}}_t \end{equation}
which are the summands from \eqref{eq:systemnonlin_mindistfonc}.
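The sandwich correction can be sketched directly. In the numpy illustration below (not RATS code), random arrays stand in for the converged residuals and derivatives, and mcov with no lags reduces to a sum of outer products:

```python
import numpy as np

rng = np.random.default_rng(3)
T, n, k = 250, 2, 3                 # observations, equations, parameters

# stand-ins for quantities evaluated at the converged beta
U = rng.normal(size=(T, n))         # residuals u_t
J = rng.normal(size=(T, n, k))      # du_t/dbeta'
Sigma_inv = np.linalg.inv(U.T @ U / T)

# v_t = (du_t'/dbeta) Sigma^{-1} u_t  -- the summands of the FONC
v = np.einsum('tik,ij,tj->tk', J, Sigma_inv, U)

# G = (sum_t du'/dbeta Sigma^{-1} du/dbeta)^{-1}
G = np.linalg.inv(np.einsum('tik,ij,tjl->kl', J, Sigma_inv, J))

# mcov(v,1) with no lags is sum_t v_t v_t'
B = v.T @ v
robust_cov = G @ B @ G              # the corrected covariance matrix
```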
The example CONSUMER.RPF uses NLSYSTEM to estimate equations from an expenditure system. While the system is actually linear, NLSYSTEM makes it much easier to handle the various cross-equation restrictions.
GMM estimation uses similar methods—the first order conditions of the optimization problem \eqref{eq:systemnonlin_gmmminimand} are expanded, ignoring the second derivative of the residuals. The estimated covariance matrix of coefficients is
\begin{equation} {\bf{A}} = \left( {\frac{{\partial {\bf{G'}}}}{{\partial \beta }}\left[ {{\bf{SW}}} \right]\frac{{\partial {\bf{G}}}}{{\partial \beta }}} \right)^{ - 1} \end{equation}
If you use the ROBUSTERRORS option (with a sub-optimal weight matrix), the covariance matrix becomes \(\bf{A}\bf{B}\bf{A}\), with \(\bf{B}\) dependent on the options chosen:
with NOZUDEP, ROBUSTERRORS and LAGS=0:
\begin{equation} {\bf{B}} = \left( {\frac{{\partial {\bf{G}}}}{{\partial \beta }}} \right)^\prime \left[ {{\bf{SW}}} \right]{\kern 1pt} {\kern 1pt} {\kern 1pt} \left( {\Sigma \otimes {\bf{Z'Z}}} \right){\kern 1pt} {\kern 1pt} {\kern 1pt} \left[ {{\bf{SW}}} \right]{\kern 1pt} {\kern 1pt} {\kern 1pt} \left( {\frac{{\partial {\bf{G}}}}{{\partial \beta }}} \right) \end{equation}
with ZUDEP, ROBUSTERRORS and LAGS=0, or NOZUDEP, ROBUSTERRORS and LAGS>0:
\begin{equation} {\bf{B}} = \left( {\frac{{\partial {\bf{G}}}}{{\partial \beta }}} \right)^\prime \left[ {{\bf{SW}}} \right]{\kern 1pt} {\kern 1pt} {\kern 1pt} {\rm{mcov}}\left( {{\bf{Z}} \otimes {\bf{u}},1} \right){\kern 1pt} {\kern 1pt} {\kern 1pt} \left[ {{\bf{SW}}} \right]{\kern 1pt} {\kern 1pt} {\kern 1pt} \left( {\frac{{\partial {\bf{G}}}}{{\partial \beta }}} \right) \end{equation}
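Both forms of \(\bf{B}\) can be sketched with simulated stand-ins for the residuals, instruments, and \(\partial {\bf{G}}/\partial \beta\) (a numpy illustration, not RATS code; the moment ordering follows \({\bf{u}}_t \otimes {\bf{Z}}_t\) as in the definition of \({\bf{G}}\)):

```python
import numpy as np

rng = np.random.default_rng(4)
T, n, r, k = 200, 2, 3, 4           # obs, equations, instruments, parameters

U = rng.normal(size=(T, n))         # stand-in residuals u_t
Z = rng.normal(size=(T, r))         # stand-in instruments z_t
D = rng.normal(size=(n * r, k))     # stand-in for dG/dbeta'

Sigma = U.T @ U / T
SW = np.kron(np.linalg.inv(Sigma), np.linalg.inv(Z.T @ Z))

# NOZUDEP, LAGS=0: middle term is Sigma (kron) Z'Z
B_nozudep = D.T @ SW @ np.kron(Sigma, Z.T @ Z) @ SW @ D

# ZUDEP (or NOZUDEP with lags): middle term is the mcov of the moments;
# with no lags that is sum_t (u_t kron z_t)(u_t kron z_t)'
m = np.einsum('ti,tj->tij', U, Z).reshape(T, -1)   # rows are u_t kron z_t
B_zudep = D.T @ SW @ (m.T @ m) @ SW @ D
```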
The example file CHANKAROLYI.RPF estimates the model for interest rates from Chan et al. (1992).
Copyright © 2025 Thomas A. Doan