AR1 Instruction

AR1( options ) depvar start end residuals
# list of regressors (in Regression Format)

AR1 estimates a regression, correcting for first-order serially correlated errors. With the INSTRUMENTS option, it does two-stage least squares.

Wizard

AR1 estimation is available using the Statistics>Linear Regressions Wizard. Select either “AR(1)–Regression” or “AR(1)–Instrumental Variables” from the “Method” list.

Parameters

depvar	dependent variable
start, end	If you haven’t set a SMPL, this defaults to the largest range for which all variables involved are defined. If you choose a method which does not retain the initial observation (Cochrane–Orcutt or Hildreth–Lu), RATS will actually run the regressions beginning at start+1, using entry start to provide the lag for start+1.
residuals	(optional) series for residuals. Note: residuals are automatically saved in %RESIDS

Options

Standard Regression Options

METHOD=CORC/[HILU]/MAXL/SEARCH/PW

This chooses the method of estimation. See "Technical Details" for more.

CORC	is (iterated) Cochrane–Orcutt, or, for instrumental variables, Fair’s (1970) procedure.
HILU	(the default) is Hildreth–Lu, a grid search procedure.
MAXL	is Beach and MacKinnon’s (1978) maximum likelihood procedure.
SEARCH	is a maximum likelihood grid search procedure.
PW	is Prais-Winsten, which is similar to SEARCH, but doesn’t use the log variance terms from the likelihood.

AR1 may not be able to honor your choice. The methods which retain the initial observation (MAXL, SEARCH and PW) cannot be used for instrumental variables and the iterated methods (MAXL and CORC) cannot be used if there are missing observations. AR1 will pick the closest permitted choice when it must switch.

RHO=input value for $\rho$ [not used]

Use this if you want to input the value of $\rho$ rather than having it estimated.

CVCRIT=convergence criterion for rho [.0001]

The goal in each of the methods described above is to reach a point where the estimate of $\rho$ changes by less than the convergence criterion.

INSTRUMENTS/[NOINSTRUMENTS]

WMATRIX=weighting matrix [not used]

Use the INSTRUMENTS option to do two-stage least squares. You must set your instruments list first using the instruction INSTRUMENTS.

DEFINE=equation to define

FRML=formula to define

These define an equation and formula, respectively, for forecasting purposes. The equation (or formula) created incorporates the serial correlation within it, so that an estimated model of

$y_t = 20.0 + 2.5 x_t + u_t , \quad u_t = .8 u_{t-1} + \varepsilon_t$

produces the equivalent equation

$y_t = .8 y_{t-1} + 4.0 + 2.5 x_t - 2.0 x_{t-1} + \varepsilon_t$
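The mapping from the estimated AR1 model to the equivalent equation can be checked by hand. The following Python sketch reproduces the coefficients in the example above. The function name `ar1_to_lagged_form` is hypothetical and the sketch handles only a constant plus one regressor; it illustrates the algebra and is not a description of RATS's internal code.

```python
def ar1_to_lagged_form(intercept, slope, rho):
    """Rewrite  y_t = a + b*x_t + u_t,  u_t = rho*u_{t-1} + e_t
    as        y_t = rho*y_{t-1} + a*(1-rho) + b*x_t - rho*b*x_{t-1} + e_t.
    """
    return {
        "y_lag": rho,                       # coefficient on y_{t-1}
        "constant": intercept * (1.0 - rho),
        "x": slope,                         # coefficient on x_t
        "x_lag": -rho * slope,              # coefficient on x_{t-1}
    }

# Values taken from the example in the text: a=20.0, b=2.5, rho=.8
coeffs = ar1_to_lagged_form(20.0, 2.5, 0.8)
print(coeffs)
```

The constant shrinks by the factor $(1-\rho)$ and the lagged regressor picks up $-\rho$ times the slope, which is where 4.0 and 2.0 come from.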

HETEROGENOUS/[NOHETEROGENOUS]

PRHOS=series of estimated rho values

These only apply to panel data sets. If you use HETEROGENOUS, AR1 estimates a separate $\rho$ for each cross-sectional unit. This is a simple two-step estimation procedure, with no iteration—none of the METHOD choices apply. With HETEROGENOUS, you can use PRHOS to save the series of estimated $\rho$ values for each individual.

SPREAD=Standard SPREAD option [unused]

Variables

Regression Variables

%RHO	estimated (or input) $\rho$ coefficient (REAL)
%BETA	coefficient VECTOR for the base regression only (not including $\rho$)
%XX	covariance matrix estimator for the base regression only (not including $\rho$) (SYMMETRIC)
%BETASYS	coefficient VECTOR including $\rho$
%XXSYS	covariance matrix estimator including $\rho$ (SYMMETRIC)

Examples

This estimates a simple regression with AR1 errors using an input value of $\rho$, Prais-Winsten and (iterated) maximum likelihood.

ar1(rho=.792) c
# constant y

ar1(method=pw) c
# constant y

ar1(method=maxl) c
# constant y

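This estimates a multiple regression using Prais–Winsten, Cochrane–Orcutt and maximum likelihood grid search. The final example uses iterated maximum likelihood on a sample ending at 1988:10 and saves the estimated equation as FIRSTEQ.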

ar1(method=pw) loggpop
# constant logpg logypop logpnc logpuc

ar1(method=corc) loggpop
# constant logpg logypop logpnc logpuc

ar1(method=search) loggpop
# constant logpg logypop logpnc logpuc

ar1(method=maxl,define=firsteq) housing * 1988:10
# constant construct{0 to 6} rates{0 to 9}

This estimates an equation using AR1 with instrumental variables. The first estimation uses Cochrane–Orcutt (or, more correctly, Fair's algorithm) and the second uses a grid search.

instruments constant cons{1} dy{1} gnp{1} govt dm rsum{1} rate{4} invest{1} dy{2} rate{5}

ar1(method=corc,inst) invest
# constant dy{1} gnp rate{4}

ar1(method=hilu,inst) invest
# constant dy{1} gnp rate{4}

Technical Details

For the following model with first-order serially correlated errors

(1) $y_t = X_t \beta + u_t , \quad u_t = \rho u_{t-1} + \varepsilon_t$

the (log) likelihood function, assuming Normality, is

(2) $-\frac{T}{2}\log\left( 2\pi \right) - \frac{T}{2}\log\left( \sigma^2 \right) + \frac{1}{2}\log\left( 1 - \rho^2 \right) - \frac{1}{2\sigma^2}\left( 1 - \rho^2 \right)\left( y_1 - X_1\beta \right)^2 - \frac{1}{2\sigma^2}\left\{ \sum\limits_{t = 2}^T \left( y_t - \rho y_{t-1} - \left( X_t - \rho X_{t-1} \right)\beta \right)^2 \right\}$

• METHOD=MAXL and METHOD=SEARCH maximize this function. MAXL does this by an iterative procedure while SEARCH uses an efficient grid search.

• METHOD=CORC and METHOD=HILU minimize the part in braces, with CORC using an iterative procedure and HILU doing a grid search.

• METHOD=PW is a GLS procedure which includes terms for the first observation. It minimizes the part in braces plus $\left( 1 - \rho^2 \right)\left( y_1 - X_1\beta \right)^2$.
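To make the three objectives concrete, here is a minimal Python sketch that evaluates them for given values of $\beta$, $\rho$ and $\sigma^2$. The function name `ar1_objectives` and its arguments are hypothetical; `fitted` stands for the values $X_t\beta$. It only evaluates the criteria from (2) and does no optimization.

```python
import math

def ar1_objectives(y, fitted, rho, sigma2):
    """Evaluate the criteria built from the log likelihood (2).

    y      : observed values y_1..y_T
    fitted : X_t*beta for the same entries
    """
    u = [yt - ft for yt, ft in zip(y, fitted)]        # u_t = y_t - X_t*beta
    # Part in braces: sum over t=2..T of (u_t - rho*u_{t-1})^2,
    # which equals (y_t - rho*y_{t-1} - (X_t - rho*X_{t-1})*beta)^2
    braces = sum((u[t] - rho * u[t - 1]) ** 2 for t in range(1, len(u)))
    # First-observation term, used by PW and inside the likelihood
    first = (1.0 - rho ** 2) * u[0] ** 2
    T = len(u)
    loglik = (-0.5 * T * math.log(2.0 * math.pi)
              - 0.5 * T * math.log(sigma2)
              + 0.5 * math.log(1.0 - rho ** 2)
              - (first + braces) / (2.0 * sigma2))
    return {"corc_hilu": braces, "pw": braces + first, "loglik": loglik}

# Tiny made-up example: three observations, fitted values all equal to 2
result = ar1_objectives([1.0, 2.0, 3.0], [2.0, 2.0, 2.0], rho=0.5, sigma2=1.0)
print(result)
```

CORC and HILU would minimize `corc_hilu`, PW would minimize `pw`, and MAXL and SEARCH would maximize `loglik`, which adds the two log terms to the PW sum of squares.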

The goal in every method is to reach a point where the estimate of $\rho$ changes by less than the convergence criterion (set by the CVCRIT option). This usually takes more trials with the search procedures. However, the search procedures guarantee that you have found the global optimum.
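As an illustration of why a search needs more trials, the sketch below minimizes a criterion over $\rho$ by repeatedly refining an evenly spaced grid until the grid spacing falls below the convergence criterion. This is an assumed strategy for illustration only: the grid that RATS actually uses is not described here, the name `grid_search_rho` and the bounds of plus or minus 0.99 are hypothetical, and the narrowing step in this sketch relies on the criterion having a single minimum inside each bracket.

```python
def grid_search_rho(objective, lo=-0.99, hi=0.99, points=21, cvcrit=1e-4):
    """Minimize objective(rho) on [lo, hi] by repeatedly refining a grid."""
    while True:
        step = (hi - lo) / (points - 1)
        grid = [lo + i * step for i in range(points)]
        best = min(grid, key=objective)     # one trial per grid point
        if step < cvcrit:                   # grid finer than the criterion
            return best
        # Narrow the bracket to one grid step either side of the best point
        lo, hi = max(lo, best - step), min(hi, best + step)

# Stand-in criterion with a known minimum at rho = 0.3
rho_hat = grid_search_rho(lambda r: (r - 0.3) ** 2)
print(rho_hat)
```

In practice `objective` would be a function that, for a trial $\rho$, estimates $\beta$ and returns the criterion being minimized (or the negative of the log likelihood for a maximum likelihood search).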

The objective function for two-stage least squares is

(3) $\sum \left( u_t - \rho u_{t-1} \right) Z_t \left( \mathbf{Z'Z} \right)^{-1} Z'_t \left( u_t - \rho u_{t-1} \right)$

where Z is the vector of instruments. Given $\rho$, $\beta$ is estimated by two-stage least squares of ${y_t} - \rho {y_{t - 1}}$ on ${X_t} - \rho {X_{t - 1}}$. If you choose METHOD=HILU, AR1 uses a search procedure to minimize (3) over $\rho$. If you use METHOD=CORC, given $u = y - X\beta $, $\rho$ is estimated by

(4) $\frac{{\sum {{u_t}{u_{t - 1}}} }}{{\sum {u_{t - 1}^2} }}$
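The alternation between estimating $\beta$ given $\rho$ and updating $\rho$ with (4) can be sketched as follows. This is a simplified illustration for the ordinary least squares case with a constant and a single regressor; with instruments, the $\beta$ step would be two-stage least squares instead. All function names, the starting value of zero for $\rho$, the iteration cap and the data are assumptions made for the example.

```python
def ols_simple(x, y):
    """Intercept and slope from least squares of y on a constant and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def quasi_diff_ssr(a, b, rho, x, y):
    """Sum over t=2..T of (u_t - rho*u_{t-1})^2 with u_t = y_t - a - b*x_t."""
    u = [yi - a - b * xi for xi, yi in zip(x, y)]
    return sum((u[t] - rho * u[t - 1]) ** 2 for t in range(1, len(u)))

def cochrane_orcutt(x, y, cvcrit=1e-4, max_iter=100):
    """Iterate: regress quasi-differenced data, then update rho with (4)."""
    rho = 0.0
    for _ in range(max_iter):
        # Quasi-difference; the first observation only supplies the lag
        xs = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
        ys = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
        a_star, b = ols_simple(xs, ys)
        a = a_star / (1.0 - rho)            # undo the (1 - rho) scaling
        u = [yi - a - b * xi for xi, yi in zip(x, y)]
        # Equation (4): sum(u_t * u_{t-1}) / sum(u_{t-1}^2)
        new_rho = (sum(u[t] * u[t - 1] for t in range(1, len(u)))
                   / sum(u[t - 1] ** 2 for t in range(1, len(u))))
        converged = abs(new_rho - rho) < cvcrit
        rho = new_rho
        if converged:
            break
    return a, b, rho

# Made-up data for illustration only
x = [1.0, 2.0, 4.0, 3.0, 5.0, 7.0, 6.0, 8.0, 9.0, 11.0]
y = [3.1, 4.9, 9.4, 7.2, 10.8, 15.5, 13.1, 16.4, 18.9, 22.6]
a_hat, b_hat, rho_hat = cochrane_orcutt(x, y)
print(a_hat, b_hat, rho_hat)
```

Each half-step minimizes the same sum of squares over one block of parameters, so the criterion cannot increase from one pass to the next. The gap in (4) when an entry of $u$ is missing is also visible here: both sums need every adjacent pair of residuals.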

Missing Values

If there are any missing values within the data range, the simple iterative process described above for the Cochrane–Orcutt (CORC) and Beach–MacKinnon (MAXL) estimators can’t be used, since there will be terms missing in (4). If you have requested one of these, the most similar search procedure will be used instead.

If there is a gap of s periods in the data before period t, the likelihood function will include the extra term

(5) $\frac{1}{2}\log \left( {\frac{{1 - {\rho ^2}}}{{1 - {\rho ^{2s}}}}} \right)$

and the term for t in the sum in (2) is replaced by

(6) ${\left( {{y_t} - {\rho ^s}{y_{t - s}} - \left( {{X_t} - {\rho ^s}{X_{t - s}}} \right)\beta } \right)^2}\left( {\frac{{1 - {\rho ^2}}}{{1 - {\rho ^{2s}}}}} \right)$

Both of these are to adjust for the fact that

(7) ${u_t}|{u_{t - s}}\sim N\left( {{\rho ^s}{u_{t - s}},{\sigma ^2}\left( {1 + {\rho ^2} + {\rho ^4} + \ldots + {\rho ^{2(s - 1)}}} \right)} \right)$
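For reference, (7) follows from substituting the AR(1) recursion in (1) into itself $s$ times. A short derivation, assuming the $\varepsilon_t$ are independent with common variance $\sigma^2$:

$u_t = \rho^s u_{t-s} + \sum\limits_{j=0}^{s-1} \rho^j \varepsilon_{t-j}$

$\mathrm{Var}\left( u_t \mid u_{t-s} \right) = \sigma^2 \sum\limits_{j=0}^{s-1} \rho^{2j} = \sigma^2 \, \frac{1 - \rho^{2s}}{1 - \rho^2}$

The conditional variance is therefore $\sigma^2$ times $\left( 1 - \rho^{2s} \right) / \left( 1 - \rho^2 \right)$. Dividing the squared deviation by that ratio gives the factor in (6), and minus one half of its log gives the extra term (5). With $s = 1$ the ratio is one, so both adjustments vanish and the ordinary term in (2) is recovered.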

Hypothesis Tests

You can use any of the hypothesis testing instructions after AR1, but you can’t test RHO using EXCLUDE or SUMMARIZE.

Fitted Values and Forecasting

You can use the PRJ instruction to get fitted values after AR1. PRJ can also compute forecasts of AR1 models. Or, if you use DEFINE to save the estimated equation, you can use UFORECAST or FORECAST to get forecasts.
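The arithmetic behind an AR1 forecast is simple: the regression part is projected from the regressors and the last residual decays by a factor of $\rho$ per step. The Python sketch below shows this for a constant and one regressor, using the coefficients from the DEFINE example above. The function name `ar1_forecasts`, the sample values and the assumption that future values of the regressor are known are all illustrative; this is not a description of how PRJ, UFORECAST or FORECAST are implemented.

```python
def ar1_forecasts(intercept, slope, rho, last_x, last_y, future_x):
    """Forecasts y_{T+h} = a + b*x_{T+h} + rho^h * u_T for h = 1, 2, ..."""
    u = last_y - (intercept + slope * last_x)   # last in-sample residual u_T
    forecasts = []
    for x_next in future_x:
        u = rho * u                             # expected residual decays
        forecasts.append(intercept + slope * x_next + u)
    return forecasts

# a=20.0, b=2.5, rho=.8 as in the DEFINE example; data values are made up
fcst = ar1_forecasts(20.0, 2.5, 0.8, last_x=2.0, last_y=30.0,
                     future_x=[2.0, 2.0])
print(fcst)
```

The first forecast agrees with the equivalent equation shown under DEFINE: .8(30) + 4.0 + 2.5(2) - 2.0(2) = 29.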

Sample Output

Regression with AR1 - Estimation by Hildreth-Lu Search
Dependent Variable RATE
Monthly Data From 1959:04 To 1996:02
Usable Observations                       443
Degrees of Freedom                        438
Centered R^2                        0.9679799
R-Bar^2                             0.9676875
Uncentered R^2                      0.9945027
Mean of Dependent Variable       6.0806546275
Std Error of Dependent Variable  2.7714419161
Standard Error of Estimate       0.4981857000
Sum of Squared Residuals         108.70677837
Regression F(4,438)                 3310.2261
Significance Level of F             0.0000000
Log Likelihood                      -317.4010
Durbin-Watson Statistic                1.6508
Q(36-1)                              144.4029
Significance Level of Q             0.0000000

    Variable              Coeff        Std Error      T-Stat      Signif
*********************************************************************************
1.  Constant          -53.04561425   30.83708607     -1.72019   0.08610443
2.  IP                  0.28415004    0.05371758      5.28970   0.00000019
3.  GRM2              -65.27152869    9.73912096     -6.70199   0.00000000
4.  GRPPI{1}            6.45196308    2.95281378      2.18502   0.02941644
*********************************************************************************
5.  RHO                 0.99916389    0.00522081    191.38118   0.00000000

The $R^2$, other summary statistics and the residuals are based upon the complete model—so they use the $\varepsilon$’s, not the $u$’s. The estimate of %RHO is separated from the other regressors in the regression output. AR1 computes the standard errors and covariance matrix from a linearization of the objective function.

If you use the HETEROGENOUS option with panel data, AR1 omits the output for %RHO, and computes the standard errors from the second stage regression on the quasi-differenced data.

Higher-order Autocorrelation Corrections

To estimate a model with corrections for higher-order serial correlation, the simplest choice is BOXJENK with the GLS option. For instance,

boxjenk(gls,ar=2) y
# constant x1 x2

You can also use NLLS. For instance, the following includes an AR(2) correction:

nonlin rho1 rho2
linreg y
# constant x1 x2
frml(lastreg,names="B",addparms) regfrml
frml auto1 = rho1*y{1} + rho2*y{2} + regfrml(t) - $
   rho1*regfrml(t-1) - rho2*regfrml(t-2)
compute rho1=rho2=0.1
nlls(frml=auto1) y
