AR1 Instruction
AR1( options ) depvar start end residuals
# list of regressors (in Regression Format)
AR1 estimates a regression, correcting for first order serially correlated errors. With the INSTRUMENTS option, it does two-stage least squares.
Wizard
AR1 estimation is available using the Statistics—Linear Regressions Wizard. Select either “AR(1)–Regression” or “AR(1)–Instrumental Variables” from the “Method” list.
Parameters
depvar | dependent variable
start, end | If you haven’t set a SMPL, this defaults to the largest range for which all variables involved are defined. If you choose a method which does not retain the initial observation (Cochrane–Orcutt or Hildreth–Lu), RATS will actually run the regressions beginning at start+1, using entry start to provide the lag for start+1.
residuals | (optional) series for residuals. Note: residuals are automatically saved in %RESIDS.
Options
Standard Regression Options
METHOD=CORC/[HILU]/MAXL/SEARCH/PW
This chooses the method of estimation. See "Technical Details" for more.
CORC | is (iterated) Cochrane–Orcutt, or, for instrumental variables, Fair’s (1970) procedure.
HILU | (the default) is Hildreth–Lu, a grid search procedure.
MAXL | is Beach and MacKinnon’s (1978) maximum likelihood procedure.
SEARCH | is a maximum likelihood grid search procedure.
PW | is Prais–Winsten, which is similar to SEARCH, but doesn’t use the log variance terms from the likelihood.
AR1 may not be able to honor your choice. The methods which retain the initial observation (MAXL, SEARCH and PW) cannot be used for instrumental variables and the iterated methods (MAXL and CORC) cannot be used if there are missing observations. AR1 will pick the closest permitted choice when it must switch.
RHO=input value for \(\rho\) [.0001]
Use this if you want to input the value of \(\rho\) rather than having it estimated.
CVCRIT=convergence criterion for rho [.0001]
The goal in each of the methods described above is to reach a point where the estimate of \(\rho\) changes by less than the convergence criterion.
INSTRUMENTS/[NOINSTRUMENTS]
WMATRIX=weighting matrix [not used]
Use the INSTRUMENTS option to do two-stage least squares. You must set your instruments list first using the instruction INSTRUMENTS.
DEFINE=equation to define
FRML=formula to define
These define an equation and formula, respectively, for forecasting purposes. The equation (or formula) created incorporates the serial correlation within it, so that an estimated model of
\(y_t = 20.0 + 2.5 x_t + u_t, \quad u_t = 0.8 u_{t-1} + \varepsilon_t\)
produces the equivalent equation
\(y_t = 0.8 y_{t-1} + 4.0 + 2.5 x_t - 2.0 x_{t-1} + \varepsilon_t\)
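The arithmetic behind that conversion can be checked with a short Python sketch (an illustration only, not RATS code). The function name `ar1_to_dynamic` and its dictionary keys are hypothetical, chosen for this example. The sketch multiplies the lagged regression by \(\rho\) and subtracts it, so the constant becomes \(20.0 \times (1 - 0.8)\) and each regressor gains a lagged term with coefficient \(-\rho\) times its own coefficient.

```python
def ar1_to_dynamic(intercept, slopes, rho):
    """Rewrite y = intercept + sum(slopes*x) + u, u_t = rho*u_{t-1} + eps_t
    as an equation in y{1}, x and x{1} with white-noise error eps_t.
    (Hypothetical helper for illustration; not part of RATS.)"""
    return {
        "y_lag": rho,                            # coefficient on y{1}
        "constant": intercept * (1.0 - rho),     # 20.0 * (1 - 0.8) in the example
        "x": list(slopes),                       # current regressors are unchanged
        "x_lag": [-rho * b for b in slopes],     # lagged regressors get -rho * beta
    }

coefs = ar1_to_dynamic(20.0, [2.5], 0.8)
print(coefs)
```

Up to floating-point rounding, the printed coefficients match the equation above: 0.8 on the lagged dependent variable, a constant of 4.0, 2.5 on \(x_t\) and -2.0 on \(x_{t-1}\).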
HETEROGENOUS/[NOHETEROGENOUS]
PRHOS=series of estimated rho values
These only apply to panel data sets. If you use HETEROGENOUS, AR1 estimates a separate \(\rho\) for each cross-sectional unit. This is a simple two-step estimation procedure, with no iteration—none of the METHOD choices apply. With HETEROGENOUS, you can use PRHOS to save the series of estimated \(\rho\) values for each individual.
SPREAD=Standard SPREAD option [unused]
Variables
%RHO |
estimated (or input) \(\rho\) coefficient (REAL) |
%BETA |
coefficient VECTOR for the base regression only (not including \(\rho\)) |
%XX |
covariance matrix estimator for the base regression only (not including \(\rho\)) (SYMMETRIC) |
%BETASYS |
coefficient VECTOR including \(\rho\) |
%XXSYS |
covariance matrix estimator including \(\rho\) (SYMMETRIC) |
Examples
These estimate a simple regression with AR1 errors three ways: with an input value of \(\rho\), by Prais–Winsten, and by (iterated) maximum likelihood.
ar1(rho=.792) c
# constant y
ar1(method=pw) c
# constant y
ar1(method=maxl) c
# constant y
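These estimate a multiple regression using Prais–Winsten, Cochrane–Orcutt and maximum likelihood grid search. The final example uses iterated maximum likelihood on a distributed lag model, ending the sample at 1988:10 and saving the estimated equation as FIRSTEQ.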
ar1(method=pw) loggpop
# constant logpg logypop logpnc logpuc
ar1(method=corc) loggpop
# constant logpg logypop logpnc logpuc
*
ar1(method=search) loggpop
# constant logpg logypop logpnc logpuc
*
ar1(method=maxl,define=firsteq) housing * 1988:10
# constant construct{0 to 6} rates{0 to 9}
These estimate an equation using AR1 with instrumental variables. The first uses Cochrane–Orcutt (more correctly, Fair's algorithm) and the second uses a grid search.
instruments constant cons{1} dy{1} gnp{1} govt dm rsum{1} rate{4} invest{1} dy{2} rate{5}
ar1(method=corc,inst) invest
# constant dy{1} gnp rate{4}
*
ar1(method=hilu,inst) invest
# constant dy{1} gnp rate{4}
Technical Details
For the following model with first-order serially correlated errors
(1) \(y_t = X_t\beta + u_t, \quad u_t = \rho u_{t-1} + \varepsilon_t\)
the (log) likelihood function, assuming Normality, is
(2) \(-\frac{T}{2}\log\left(2\pi\right) - \frac{T}{2}\log\left(\sigma^2\right) + \frac{1}{2}\log\left(1 - \rho^2\right) - \frac{1}{2\sigma^2}\left(1 - \rho^2\right)\left(y_1 - X_1\beta\right)^2 - \frac{1}{2\sigma^2}\left\{\sum\limits_{t=2}^{T}\left(y_t - \rho y_{t-1} - \left(X_t - \rho X_{t-1}\right)\beta\right)^2\right\}\)
• METHOD=MAXL and METHOD=SEARCH maximize this function. MAXL does this by an iterative procedure while SEARCH uses an efficient grid search.
• METHOD=CORC and METHOD=HILU minimize the part in braces, with CORC using an iterative procedure and HILU doing a grid search.
• METHOD=PW is a GLS procedure which includes terms for the first observation. It minimizes the part in braces plus \(\left(1 - \rho^2\right)\left(y_1 - X_1\beta\right)^2\).
The goal in every method is to reach a point where the estimate of \(\rho\) changes by less than the convergence criterion (set by the CVCRIT option). This usually takes more trials with the search procedures. However, the search procedures guarantee that you have found the global optimum.
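The three search criteria can be compared in a minimal Python sketch (not RATS code, and not a description of how RATS refines its search). It assumes a model with a constant and one regressor, a coarse grid of step 0.01 on (-1, 1), and simulated data; the names `ols2`, `criterion`, `grid_search` and `simulate` are hypothetical. For each trial \(\rho\), \(\beta\) is concentrated out by least squares on the transformed data. For "search", \(\sigma^2\) is replaced by the sum of squares divided by T and the constants in (2) are dropped.

```python
import math
import random

def ols2(c1, c2, y):
    """Least squares of y on two columns (no added intercept).
    Returns (b1, b2, sum of squared residuals)."""
    s11 = sum(a * a for a in c1)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s22 = sum(b * b for b in c2)
    s1y = sum(a * v for a, v in zip(c1, y))
    s2y = sum(b * v for b, v in zip(c2, y))
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    ssr = sum((v - b1 * a - b2 * b) ** 2 for a, b, v in zip(c1, c2, y))
    return b1, b2, ssr

def criterion(y, x, rho, method):
    """Value to MINIMIZE for a trial rho ("hilu", "pw" or "search")."""
    n = len(y)
    # Quasi-differenced rows t = 2..T: the sum in braces in (2).
    const = [1.0 - rho] * (n - 1)
    xs = [x[t] - rho * x[t - 1] for t in range(1, n)]
    ys = [y[t] - rho * y[t - 1] for t in range(1, n)]
    if method in ("pw", "search"):
        # Retain the first observation, scaled by sqrt(1 - rho^2).
        w = math.sqrt(1.0 - rho * rho)
        const.insert(0, w)
        xs.insert(0, w * x[0])
        ys.insert(0, w * y[0])
    ssr = ols2(const, xs, ys)[2]
    if method == "search":
        # Negative concentrated log likelihood, constants dropped.
        return 0.5 * n * math.log(ssr / n) - 0.5 * math.log(1.0 - rho * rho)
    return ssr  # "hilu" and "pw" minimize a sum of squares

def grid_search(y, x, method):
    """Return the grid value of rho with the smallest criterion."""
    grid = [i / 100.0 for i in range(-99, 100)]
    return min(grid, key=lambda r: criterion(y, x, r, method))

def simulate(n, b0, b1, rho, seed):
    """Artificial data: y = b0 + b1*x + u with AR(1) errors u."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    u = [rng.gauss(0.0, 1.0) / math.sqrt(1.0 - rho * rho)]  # stationary start
    for _ in range(n - 1):
        u.append(rho * u[-1] + rng.gauss(0.0, 1.0))
    y = [b0 + b1 * xt + ut for xt, ut in zip(x, u)]
    return y, x

y, x = simulate(200, 2.0, 3.0, 0.7, seed=1)
for method in ("hilu", "pw", "search"):
    print(method, grid_search(y, x, method))
```

Because every grid point is evaluated, the value returned is the best on the grid for that criterion, which is the sense in which a search cannot stop at a local optimum.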
The objective function for two-stage least squares is
(3) \(\sum \left(u_t - \rho u_{t-1}\right) Z_t \left(\mathbf{Z'Z}\right)^{-1} Z'_t \left(u_t - \rho u_{t-1}\right)\)
where Z is the vector of instruments. Given \(\rho\), \(\beta\) is estimated by two-stage least squares of \({y_t} - \rho {y_{t - 1}}\) on \({X_t} - \rho {X_{t - 1}}\). If you choose METHOD=HILU, AR1 uses a search procedure to minimize (3) over \(\rho\). If you use METHOD=CORC, given \(u = y - X\beta \), \(\rho\) is estimated by
(4) \(\frac{{\sum {{u_t}{u_{t - 1}}} }}{{\sum {u_{t - 1}^2} }}\)
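The alternation between (4) and a regression on quasi-differenced data can be sketched in Python (not RATS code). The sketch covers only the ordinary least squares case with a constant and one regressor, not the instrumental variables version, and it assumes simulated data. The names `simple_ols`, `rho_update`, `cochrane_orcutt` and `simulate` are hypothetical, and the default `cvcrit` of 1e-4 mirrors the CVCRIT default listed above.

```python
import math
import random

def simple_ols(x, y):
    """OLS of y on a constant and x; returns (intercept, slope)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def rho_update(u):
    """Formula (4): sum(u_t * u_{t-1}) / sum(u_{t-1}^2)."""
    num = sum(u[t] * u[t - 1] for t in range(1, len(u)))
    den = sum(u[t - 1] ** 2 for t in range(1, len(u)))
    return num / den

def cochrane_orcutt(y, x, cvcrit=1e-4, max_iter=100):
    """Iterate until rho changes by less than cvcrit; returns (b0, b1, rho)."""
    b0, b1 = simple_ols(x, y)  # starting values correspond to rho = 0
    rho = 0.0
    for _ in range(max_iter):
        u = [yt - b0 - b1 * xt for xt, yt in zip(x, y)]
        new_rho = rho_update(u)
        # Regress y_t - rho*y_{t-1} on a constant and x_t - rho*x_{t-1}, t = 2..T.
        ys = [y[t] - new_rho * y[t - 1] for t in range(1, len(y))]
        xs = [x[t] - new_rho * x[t - 1] for t in range(1, len(x))]
        a, b1 = simple_ols(xs, ys)
        b0 = a / (1.0 - new_rho)  # transformed intercept is b0 * (1 - rho)
        converged = abs(new_rho - rho) < cvcrit
        rho = new_rho
        if converged:
            break
    return b0, b1, rho

def simulate(n, b0, b1, rho, seed):
    """Artificial data: y = b0 + b1*x + u with AR(1) errors u."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    u = [rng.gauss(0.0, 1.0) / math.sqrt(1.0 - rho * rho)]
    for _ in range(n - 1):
        u.append(rho * u[-1] + rng.gauss(0.0, 1.0))
    return [b0 + b1 * xt + ut for xt, ut in zip(x, u)], x

y, x = simulate(300, 2.0, 3.0, 0.7, seed=0)
print(cochrane_orcutt(y, x))
```

Each pass needs the residual at every t and at t-1, which is why an iteration of this kind breaks down when observations are missing inside the range.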
Missing Values
If there are any missing values within the data range, the simple iterative process described above for the Cochrane–Orcutt (CORC) and Beach–MacKinnon (MAXL) estimators can’t be used, since there will be terms missing in (4). If you have requested one of these, the most similar search procedure will be used instead.
If there is a gap of s periods in the data before period t, the likelihood function will include the extra term
(5) \(\frac{1}{2}\log \left( {\frac{{1 - {\rho ^2}}}{{1 - {\rho ^{2s}}}}} \right)\)
and the term for t in the sum in (2) is replaced by
(6) \({\left( {{y_t} - {\rho ^s}{y_{t - s}} - \left( {{X_t} - {\rho ^s}{X_{t - s}}} \right)\beta } \right)^2}\left( {\frac{{1 - {\rho ^2}}}{{1 - {\rho ^{2s}}}}} \right)\)
Both of these are to adjust for the fact that
(7) \({u_t}|{u_{t - s}}\sim N\left( {{\rho ^s}{u_{t - s}},{\sigma ^2}\left( {1 + {\rho ^2} + {\rho ^4} + \ldots + {\rho ^{2(s - 1)}}} \right)} \right)\)
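As a check on (5) and (6), the variance in (7) can be derived directly. Substituting the AR(1) recursion s times gives
\(u_t = \rho^s u_{t-s} + \sum\limits_{j=0}^{s-1} \rho^j \varepsilon_{t-j}\)
so the conditional variance is a geometric sum:
\(\sigma^2 \sum\limits_{j=0}^{s-1} \rho^{2j} = \sigma^2 \frac{1 - \rho^{2s}}{1 - \rho^2}\)
The Normal log density for observation t is therefore
\(-\frac{1}{2}\log\left(2\pi\right) - \frac{1}{2}\log\left(\sigma^2\right) + \frac{1}{2}\log\left(\frac{1 - \rho^2}{1 - \rho^{2s}}\right) - \frac{1}{2\sigma^2}\left(\frac{1 - \rho^2}{1 - \rho^{2s}}\right)\left(u_t - \rho^s u_{t-s}\right)^2\)
The third term is (5) and, with \(u_t = y_t - X_t\beta\), the last term contains (6). When s = 1 the ratio equals 1 and the expression reduces to the usual term in (2).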
Hypothesis Tests
You can use any of the hypothesis testing instructions after AR1, but you can’t test RHO using EXCLUDE or SUMMARIZE.
Fitted Values and Forecasting
You can use the PRJ instruction to get fitted values after AR1. PRJ can also compute forecasts of AR1 models. Or, if you use DEFINE to save the estimated equation, you can use UFORECAST or FORECAST to get forecasts.
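What a forecast from a model with AR(1) errors involves can be shown in a small Python sketch (not RATS code, and not a statement about how PRJ or UFORECAST compute their results). It uses the standard result that the h-step forecast is the regression part plus \(\rho^h\) times the last observed error. The function name `forecast_ar1` and its arguments are hypothetical, and future values of the regressor are assumed to be known.

```python
def forecast_ar1(b0, b1, rho, last_y, last_x, future_x):
    """Forecasts for y = b0 + b1*x + u with u_t = rho*u_{t-1} + eps_t.
    (Hypothetical helper for illustration; not part of RATS.)"""
    u_last = last_y - b0 - b1 * last_x  # last observed error u_T
    return [b0 + b1 * xf + rho ** h * u_last
            for h, xf in enumerate(future_x, start=1)]

# Coefficients from the DEFINE example above; the data values are made up.
fc = forecast_ar1(20.0, 2.5, 0.8, last_y=30.0, last_x=3.0, future_x=[4.0, 5.0])
print(fc)
```

With these made-up values the one-step forecast agrees with the equivalent dynamic equation shown for DEFINE: \(0.8 \times 30 + 4.0 + 2.5 \times 4 - 2.0 \times 3 = 32\).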
Sample Output
Regression with AR1 - Estimation by Hildreth-Lu Search
Dependent Variable RATE
Monthly Data From 1959:04 To 1996:02
Usable Observations 443
Degrees of Freedom 438
Centered R^2 0.9679799
R-Bar^2 0.9676875
Uncentered R^2 0.9945027
Mean of Dependent Variable 6.0806546275
Std Error of Dependent Variable 2.7714419161
Standard Error of Estimate 0.4981857000
Sum of Squared Residuals 108.70677837
Regression F(4,438) 3310.2261
Significance Level of F 0.0000000
Log Likelihood -317.4010
Durbin-Watson Statistic 1.6508
Q(36-1) 144.4029
Significance Level of Q 0.0000000
    Variable                  Coeff       Std Error      T-Stat      Signif
*********************************************************************************
1.  Constant               -53.04561425   30.83708607     -1.72019  0.08610443
2.  IP                       0.28415004    0.05371758      5.28970  0.00000019
3.  GRM2                   -65.27152869    9.73912096     -6.70199  0.00000000
4.  GRPPI{1}                 6.45196308    2.95281378      2.18502  0.02941644
*********************************************************************************
5.  RHO                      0.99916389    0.00522081    191.38118  0.00000000
The \(R^2\), other summary statistics and the residuals are based upon the complete model—so they use the \(\varepsilon\)’s, not the \(u\)’s. The estimate of %RHO is separated from the other regressors in the regression output. AR1 computes the standard errors and covariance matrix from a linearization of the objective function.
If you use the HETEROGENOUS option with panel data, AR1 omits the output for %RHO, and computes the standard errors from the second stage regression on the quasi-differenced data.
Higher-order Autocorrelation Corrections
To estimate a model with corrections for higher-order serial correlation, the simplest choice is BOXJENK with the GLS option. For instance,
boxjenk(gls,ar=2) y
# constant x1 x2
You can also use NLLS. For instance, the following includes an AR(2) correction:
nonlin rho1 rho2
linreg y
# constant x1 x2
frml(lastreg,names="B",addparms) regfrml
frml auto1 = rho1*y{1} + rho2*y{2} + regfrml(t) - $
rho1*regfrml(t-1) - rho2*regfrml(t-2)
compute rho1=rho2=0.1
nlls(frml=auto1) y
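Written out, and assuming REGFRML evaluates the regression part \(X_t\beta\) at entry t, the AUTO1 formula corresponds to the model
\(y_t = X_t\beta + u_t, \quad u_t = \rho_1 u_{t-1} + \rho_2 u_{t-2} + \varepsilon_t\)
Substituting \(u_t = y_t - X_t\beta\) at each lag gives the form that NLLS fits:
\(y_t = \rho_1 y_{t-1} + \rho_2 y_{t-2} + X_t\beta - \rho_1 X_{t-1}\beta - \rho_2 X_{t-2}\beta + \varepsilon_t\)
Setting \(\rho_2 = 0\) gives the AR(1) case, which is the sum in braces in (2).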
Copyright © 2025 Thomas A. Doan