More General Maximization

MAXIMIZE is designed for many of the estimation problems that specialized instructions like LINREG and NLLS cannot handle. Its primary purpose is to estimate models by maximum likelihood, but it is more general than that.

MAXIMIZE is perhaps most often used for estimating the less-standard variations of ARCH and GARCH models that the built-in GARCH instruction can’t handle.

The problems that MAXIMIZE can solve are those of the form

\begin{equation} \max_{\beta} \sum_{t=1}^{T} f\left( y_t, X_t, \beta \right) \end{equation}

where \(f\) is a RATS formula (FRML). Things to note about this:

•MAXIMIZE only maximizes directly, but you can minimize a function by putting a negative sign in front of the formula when you define it.

•MAXIMIZE does not check differentiability and will behave unpredictably if you use METHOD=BFGS or METHOD=BHHH with a function that is not twice-differentiable. Differentiability is not an issue with the derivative-free methods, such as SIMPLEX and GENETIC.

If even this is too narrow for your application, you will need to try the instruction FIND.
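As an illustration of the two notes above, here is a minimal sketch that minimizes a sum of absolute deviations by maximizing its negative. The series Y and the names M and NEGABSDEV are hypothetical, chosen only for this example. Because the absolute value is not differentiable everywhere, the sketch uses a derivative-free method.

```rats
* Hypothetical setup: Y is a data series already in memory.
* M minimizes the sum of |Y-M|, so we maximize the negative of that sum.
nonlin m
frml negabsdev = -abs(y-m)
compute m=0.0
* The objective is not differentiable where M equals an observation,
* so use a derivative-free method rather than BFGS or BHHH.
maximize(method=simplex) negabsdev
```

The minimizer of a sum of absolute deviations is a sample median, which gives an easy way to check the result.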

Setting Up for MAXIMIZE

The steps in preparing to use MAXIMIZE are the same as for NLLS:

•Set the parameter list (PARMSET) using NONLIN.

•Set up the function using FRML.

•Set initial values for the parameters, using COMPUTE or INPUT. In some cases, such as recursive ARCH/GARCH models, you will also need to initialize one or more series used to hold values such as residuals or variances.
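The three steps above can be sketched for a simple case: a linear regression with Normal errors estimated by maximum likelihood. The series Y and X and the parameter names B0, B1 and SIGSQ are assumptions made for this illustration, not part of any shipped example.

```rats
* Hypothetical series Y and X; B0, B1 and SIGSQ are illustrative names.
* Step 1: set the parameter list
nonlin b0 b1 sigsq
* Step 2: set up the log likelihood element for entry T
frml logl = %logdensity(sigsq,y-b0-b1*x)
* Step 3: set initial values (SIGSQ is a variance, so it must start positive)
compute b0=0.0,b1=0.0,sigsq=1.0
maximize(method=bfgs) logl
```

In practice, starting values taken from a preliminary LINREG are usually a better choice than the arbitrary ones used here.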

The final FRML for MAXIMIZE is often built by combining several submodels. For instance, a GARCH model has separate models for the mean of the process and for its variance, and the log likelihood is a function of the two. With such a model, it is usually a good idea to create the pieces separately, as far as possible, and combine them only at the end. You can do this by having the final FRML call "sub-FRMLs" and by "adding" PARMSETs.
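A minimal sketch of this approach for a GARCH(1,1) model with a constant mean might look like the following. The series Y and every parameter, series and FRML name (MU, C, A, B, U, UU, H, RESID, VARF, LOGL) are hypothetical, and the starting values are only one plausible choice.

```rats
* Hypothetical series Y; all other names are illustrative.
* Initialize the series that the variance recursion reads at lags.
statistics(noprint) y
set u  = y-%mean
set uu = %variance
set h  = %variance
*
* Separate parameter sets for the mean model and the variance model
nonlin(parmset=meanparms) mu
nonlin(parmset=garchparms) c a b
compute mu=%mean,c=%variance,a=0.05,b=0.05
*
* Sub-FRMLs: one for the residual, one for the variance
frml resid = y-mu
frml varf  = c+a*uu{1}+b*h{1}
*
* Combine the pieces only in the final FRML
frml logl = (h(t)=varf(t)),(u(t)=resid(t)),(uu(t)=u(t)**2),%logdensity(h(t),u(t))
maximize(parmset=meanparms+garchparms) logl 2 *
```

Keeping the mean and variance pieces separate means that either one can be changed (for example, adding regressors to the mean) without rewriting the log likelihood FRML.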

The example MAXIMIZE.RPF estimates a stochastic frontier model. This is basically a log-linear production model, except that the residuals have two components, one of which is not permitted to be positive: you can fall short of, but can’t exceed, the (unobservable) production function, so the error term is asymmetric.

MAXIMIZE is also heavily used for various non-standard GARCH models and for virtually all Markov switching models.

Multivariate Likelihoods

You can use MAXIMIZE to estimate models with multivariate Normal likelihoods using the %LOGDENSITY function, which handles both univariate and multivariate Normals. The log likelihood element for an \(n\)-vector at time \(t\) is, in general,

\begin{equation} -\frac{n}{2}\log 2\pi - \frac{1}{2}\log \left| \Sigma_t \right| - \frac{1}{2}\mathbf{u}_t' \Sigma_t^{-1} \mathbf{u}_t \end{equation}

This is computed by the function %LOGDENSITY(SIGMA,U) where SIGMA is the covariance matrix and U the vector of deviations from the means of the components.
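As a quick check on the general formula, consider the univariate case \(n = 1\), writing \(\Sigma_t = \sigma_t^2\) for the scalar variance and \(u_t\) for the scalar deviation (notation assumed here for illustration). The expression then reduces to the familiar univariate Normal log density:

\begin{equation} -\frac{1}{2}\log 2\pi - \frac{1}{2}\log \sigma_t^2 - \frac{u_t^2}{2\sigma_t^2} \end{equation}

In the univariate case, therefore, the first argument of %LOGDENSITY is the variance rather than the standard deviation.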

Although %LOGDENSITY is by far the most commonly used, RATS offers more than a dozen other density functions for other distributions, including %LOGTDENSITY and %LOGGEDDENSITY for \(t\) and GED distributions, respectively. All of these density functions include the integrating constants, so models estimated using different distributions will give comparable likelihood values.

The hardest part is usually creating the FRML that lets you compute this as the final step. It is often a good idea to create a FRML[VECTOR] to compute U and a FRML[SYMMETRIC] for SIGMA. Of course, if \(\Sigma\) doesn’t depend upon time, you don’t need to create a FRML just to compute it. If all parameters of \(\Sigma\) are freely estimated, you can even DECLARE and DIMENSION it and add it to the nonlinear parameter set as a SYMMETRIC array.

CONSUMER.RPF (which is mainly an example of non-linear systems estimation) also includes estimation by maximum likelihood. Because \(\Sigma\) is unconstrained, this should give identical answers to an unrestricted NLSYSTEM. As just mentioned, the covariance matrix is put into the parameter set SIGMAPARMS as an unrestricted \(3 \times 3\) matrix.

dec frml[vect] ufrml

frml ufrml = ||wqfood-ffood,wqvice-fvice,wqdura-fdura||

dec symm sigma(3,3)

nonlin(parmset=sigmaparms) sigma

frml mvlikely = %logdensity(sigma,ufrml)

compute sigma=%identity(3)

maximize(parmset=base+sigmaparms,iters=400) mvlikely