RATS 11

PREGRESS Instruction

PREGRESS(options)  depvar start end resids

# list of explanatory variables in Regression Format
 

Estimates a panel data regression, incorporating either fixed or random effects.

Wizard

You can use the Statistics—Panel Data Regressions Wizard to do panel regressions.

Parameters

depvar

dependent variable

start, end

range to estimate; defaults to the maximum range permitted by all variables involved in the regression, including instruments if required.

resids

residuals series

Description

This estimates \(\beta \) in the linear regression

(1)  \({y_{it}} = {X_{it}}\beta  + {u_{it}}\), where

(2)  \({u_{it}} = {\varepsilon _i} + {\lambda _t} + {\eta _{it}}\), unless METHOD=SUR
 

\(\varepsilon\) is the individual effect, \(\lambda\) is the time effect and \(\eta\) the purely random effect. If you use the option EFFECTS=INDIV, or METHOD=FD, the decomposition only includes the \(\varepsilon\) and \(\eta\) components. With EFFECTS=TIME, it only includes \(\lambda\) and \(\eta\).

 

METHOD=POOLED just estimates (1) by least squares, with no panel effects. METHOD=BETWEEN estimates (1) by least squares on individual averages. If you use METHOD=FIXED, \(\varepsilon_i\) and \(\lambda_t\) are treated as constants and are “swept” out. With METHOD=RANDOM, they are treated as part of the error term and \(\beta \) is estimated by GLS. If METHOD=FD (first difference), the data are differenced to eliminate \(\varepsilon_i\). METHOD=SUR assumes that the \({u}\)’s are serially uncorrelated, but are correlated across i at a given t.
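
For the individual-effects case, the transformations behind three of these methods can be written out as below. This is a standard textbook sketch rather than a description of the internal computations in PREGRESS; the bar notation for individual means and \(T_i\) for the number of observations on individual i are assumptions introduced here.

```latex
% Individual means over the T_i observations on individual i (notation assumed here)
\bar y_i = \frac{1}{T_i}\sum_t y_{it}, \qquad \bar X_i = \frac{1}{T_i}\sum_t X_{it}

% METHOD=BETWEEN: least squares on the individual averages
\bar y_i = \bar X_i \beta + \bar u_i

% METHOD=FIXED: deviations from individual means sweep out \varepsilon_i
y_{it} - \bar y_i = (X_{it} - \bar X_i)\beta + (\eta_{it} - \bar\eta_i)

% METHOD=FD: first differences also eliminate \varepsilon_i
y_{it} - y_{i,t-1} = (X_{it} - X_{i,t-1})\beta + (\eta_{it} - \eta_{i,t-1})
```

Any regressor that is constant over time within an individual transforms to zero in the last two equations, so its coefficient cannot be estimated by those methods.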

 

For random effects estimation, you can input the variances of the components yourself using the VRANDOM, VINDIV and VTIME options, or you can allow PREGRESS to estimate them. There are many ways to estimate these variances consistently; most of the commonly used choices can be implemented by a combination of the VCOMP and CORRECTION options.
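
To see why the component variances matter, consider a balanced panel with T periods per individual and individual effects only. In that case the GLS estimator is the same as least squares on "quasi-demeaned" data, as in this textbook sketch. The symbols \(\sigma_\eta^2\), \(\sigma_\varepsilon^2\), \(\theta\) and the bar notation for individual means are introduced here for illustration and are not PREGRESS names.

```latex
% Weight determined by the two component variances (balanced panel, T periods)
\theta = 1 - \sqrt{\frac{\sigma_\eta^2}{\sigma_\eta^2 + T\,\sigma_\varepsilon^2}}

% GLS is least squares on the partially demeaned data
y_{it} - \theta\,\bar y_i = (X_{it} - \theta\,\bar X_i)\beta + \text{error}
```

With \(\sigma_\varepsilon^2 = 0\) this gives \(\theta = 0\) (pooled least squares); as \(T\sigma_\varepsilon^2\) grows relative to \(\sigma_\eta^2\), \(\theta\) approaches 1 (the fixed effects transformation).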

Options

Standard Regression Options

 

EFFECTS=[INDIVIDUAL]/TIME/BOTH

This indicates whether to allow for INDIVIDUAL effects, TIME effects or BOTH.

 

METHOD=[FIXEDEFFECTS]/RANDOMEFFECTS/FD/SUR/BETWEEN/POOLED

This chooses between fixed effects, random effects, first-difference, cross-section SUR, "between" estimators, and pooled panel regressions.

 

GROUP=SERIES or FRML with values defining individuals

This is an alternative to setting up the data as a panel data set: the values of the SERIES or FRML define the individuals. If you use GROUP, you can only do EFFECTS=INDIV.
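
A minimal sketch of the GROUP option, assuming a data set that has not been set up as a panel. FIRM is a hypothetical series holding an identifier for each individual; the other series names are borrowed from the Examples section.

```rats
* FIRM is a hypothetical series whose values identify the individuals;
* substitute the series (or FRML) that does this in your own data.
preg(method=fixed,group=firm) invest
# constant value cap
```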

 

VRANDOM=variance of the random component [estimated]

VINDIV=variance of the individual component [estimated]

VTIME=variance of the time component [estimated]

VCOMP=[WK]/SA/WH/ML/GREENE/WOOLDRIDGE

CORRECTION=[FULL]/DEGREES/NONE

You can input the component variances for random effects using VRANDOM, VINDIV and (if necessary) VTIME. If you don’t, PREGRESS will estimate them, using the algorithm selected by the VCOMP and CORRECTION options. With the exception of VCOMP=ML (maximum likelihood), each choice solves a set of equations built from quadratic forms in residuals; they differ in which estimators supply the residuals, which quadratic forms are used, and how much detail goes into the multipliers in the equations. Any of these will give consistent estimators. VCOMP=WK (the default) is Wansbeek-Kapteyn, VCOMP=SA is Swamy-Arora, VCOMP=WH is Wallace-Hussain, and VCOMP=ML is maximum likelihood. VCOMP=GREENE and VCOMP=WOOLDRIDGE are (simpler) estimators proposed in the Greene (2012) and Wooldridge (2010) textbooks. The CORRECTION option controls the level of detail used in computing the coefficients in the quadratic forms.
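
The two ways of handling the component variances might look like the following sketch, which reuses the series from the Examples section. The numeric variances are placeholder values chosen only for illustration.

```rats
* Estimate the variances: Swamy-Arora quadratic forms with the
* degrees-of-freedom level of correction
preg(method=random,vcomp=sa,correction=degrees) invest
# constant value cap
* Supply the variances directly; 2750.0 and 6450.0 are placeholders
preg(method=random,vrandom=2750.0,vindiv=6450.0) invest
# constant value cap
```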

 

INDIV=(output) series of individual effects [not used]

TIME=(output) series of time effects [not used]

With METHOD=FIXED or METHOD=RANDOM, these allow you to retrieve the coefficients on the individual or time components; whichever ones are estimated based upon your choice for the EFFECTS option. These are produced to match up with the entries on the original data, so, for instance, the output values for the individual effects will be repeated for each time period within each individual’s block of entries. If you want to compress out the duplicates, you can use the PANEL instruction with the COMPRESS option.
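
A short sketch of retrieving the individual effects, using the series from the Examples section. FIRMFX is a hypothetical name for the output series.

```rats
* FIRMFX is a hypothetical name for the output series of individual effects
preg(method=fixed,indiv=firmfx) invest
# constant value cap
* List the effects; each value repeats across an individual's block of entries
print / firmfx
```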

 

INSTRUMENTS/[NOINSTRUMENTS]

Use the INSTRUMENTS option to do instrumental variables. You must set your instruments list first using the instruction INSTRUMENTS. This can be used with any of the METHOD options except SUR.
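
The order of operations can be sketched as follows. The instrument list is purely hypothetical (CAP treated as exogenous, with the first lag of VALUE as the additional instrument) and is there only to show the mechanics.

```rats
* Hypothetical instrument list; adapt it to your own model
instruments constant cap value{1}
* The INSTRUMENTS option picks up the list set above
preg(method=fixed,instruments) invest
# constant value cap
```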

 

ROBUSTERRORS/[NOROBUSTERRORS]

CLUSTER=SERIES with category values for clustered calculation

The combination of ROBUSTERRORS and CLUSTER allows the calculation of coefficient standard errors which are robust to arbitrary correlation within groups defined by the CLUSTER expression. This can be applied to any choice of METHOD except SUR. CLUSTER=%INDIV(T) (or LWINDOW=PANEL) would be used for standard errors clustered by individuals.
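
For example, standard errors clustered by individual could be requested as in this sketch, which reuses the series from the Examples section:

```rats
* Fixed effects with standard errors clustered by individual
preg(method=fixed,robusterrors,cluster=%indiv(t)) invest
# constant value cap
```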

 

ROBUSTERRORS/[NOROBUSTERRORS]

LAGS=correlated lags [0]

LWINDOW=NEWEYWEST/BARTLETT/DAMPED/PARZEN/QUADRATIC/[FLAT]/PANEL/WHITE

LWFORM=VECTOR with the window form [not used]

DAMP=value of \(\gamma\) for LWINDOW=DAMPED [0.0]

These can be applied to any estimation method other than METHOD=SUR to permit calculation of a consistent covariance matrix allowing for heteroscedasticity (with ROBUSTERRORS) or serial correlation within each individual (with LAGS).
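
A sketch of the two cases, reusing the series from the Examples section. The lag count and window are illustrative choices, not recommendations.

```rats
* Allow for heteroscedasticity only
preg(method=fixed,robusterrors) invest
# constant value cap
* Also allow for serial correlation within individuals out to two lags
preg(method=fixed,robusterrors,lags=2,lwindow=neweywest) invest
# constant value cap
```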

 

HAUSMAN/NOHAUSMAN

If you do METHOD=RANDOM, the HAUSMAN option requests that a Hausman test (for random vs fixed effects) be done. Computing this requires the fixed effects estimate. Because VCOMP=WK and VCOMP=SA each do a fixed effects regression anyway, PREGRESS will do the Hausman test automatically if you choose either of those. This option is necessary only if you want the Hausman test and you are using a different choice for the component variances.
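
A sketch of a case where the option is needed, reusing the series from the Examples section. VCOMP=WH is one of the choices other than WK and SA, so the test has to be requested explicitly.

```rats
* Request the Hausman test alongside Wallace-Hussain variance components
preg(method=random,vcomp=wh,hausman) invest
# constant value cap
display "Hausman statistic" %cdstat "significance level" %signif
```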

Variables Defined

Regression Variables

%VRANDOM

variance of \(\eta\), the random component (REAL)

%VINDIV

variance of \(\varepsilon\), the individual component (REAL)

%VTIME

variance of \(\lambda\), the time component (REAL)

%NGROUP

number of individuals or groups (INTEGER)

%SIGMA

covariance matrix for METHOD=SUR (SYMMETRIC)

%NFREE

number of free parameters. This includes any fixed effects coefficients that aren't reported directly and variance or covariance matrix parameters (INTEGER)

%CDSTAT

Hausman test statistic (if computed) (REAL)

%SIGNIF

Significance of Hausman test (if computed) (REAL)

%NDFTEST

Degrees of freedom of Hausman test (if computed) (REAL)
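
After an estimation, these variables can be examined with DISPLAY, as in this sketch (series names are from the Examples section; the label text is arbitrary):

```rats
preg(method=random,vcomp=sa) invest
# constant value cap
display "Number of individuals" %ngroup
display "Variance of eta" %vrandom "Variance of epsilon" %vindiv
display "Free parameters" %nfree
```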

Technical Information

The different choices for VCOMP use different quadratic forms in residuals to estimate the component variances. VCOMP=WK (Wansbeek-Kapteyn) estimates fixed effects and uses its residuals several ways. VCOMP=SA (Swamy-Arora) uses residuals from fixed effects and “between” estimators. VCOMP=WH (Wallace-Hussain) uses the OLS residuals several ways. See Baltagi (2008) or the Panel Data e-course for more detail.

 

To demonstrate how the CORRECTION option works, we’ll look at part of the calculation for VCOMP=WH. The OLS residuals will be 

 

\(\left( {{\bf{I}} - {\bf{X}}{{({\bf{X'X}})}^{ - 1}}{\bf{X'}}} \right){\bf{u}}\)

 

where \({\bf{u}}\) is the vector of true residuals. The expected value of the sum of squared residuals will be

 

(3) \(E{\bf{u'}}\left( {{\bf{I}} - {\bf{X}}{{({\bf{X'X}})}^{ - 1}}{\bf{X'}}} \right)\left( {{\bf{I}} - {\bf{X}}{{({\bf{X'X}})}^{ - 1}}{\bf{X'}}} \right){\bf{u}} = trace\left( {(E{\bf{uu'}})\left( {{\bf{I}} - {\bf{X}}{{({\bf{X'X}})}^{ - 1}}{\bf{X'}}} \right)} \right)\)

 

\(E{\bf{uu'}}\) will be a matrix with elements which are linear combinations of the component variances. CORRECTION=NONE ignores the \({\bf{X}}\) matrices (in effect, acting as if the OLS residuals are the true residuals) and uses as one of its conditions that the sum of squared OLS residuals is equal to \(trace(E{\bf{uu'}})\). CORRECTION=DEGREES uses a correction based on the number of regressors in \({\bf{X}}\) rather than their specific values. CORRECTION=FULL computes the exact correction using the structure of \(E{\bf{uu'}}\) and \({\bf{X}}\). (This can be rearranged using the properties of the trace to avoid multiplying the potentially very large matrices in (3)).
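
The rearrangement mentioned at the end of the paragraph can be sketched as follows. The shorthand \(\Omega\), \({\bf{M}}\), N (number of observations) and k (number of regressors) is introduced here; this shows the algebraic identity only and is not a statement of how PREGRESS organizes the calculation.

```latex
% Shorthand (assumed here): \Omega = E\,uu', \quad M = I - X(X'X)^{-1}X'
\mathrm{trace}(\Omega M)
  = \mathrm{trace}(\Omega) - \mathrm{trace}\left(\Omega X (X'X)^{-1} X'\right)
  = \mathrm{trace}(\Omega) - \mathrm{trace}\left((X'X)^{-1} X' \Omega X\right)

% Special case \Omega = \sigma^2 I: the correction depends only on k
\mathrm{trace}(\sigma^2 M) = \sigma^2 (N - k)
```

The second trace on the right involves only a k by k matrix, so the N by N products never have to be formed.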

Examples

preg(method=between) invest

# constant value cap

preg(method=fixed) invest

# constant value cap

preg(method=random,vcomp=wh) invest

# constant value cap

preg(method=random,vcomp=wk) invest

# constant value cap

preg(method=random,vcomp=sa) invest

# constant value cap

preg(method=random,vcomp=ml) invest

# constant value cap

 

These estimate the same equation, allowing for individual effects only (the default), using the between estimator, fixed effects, and random effects with four different variance components estimators. Note that we left the CONSTANT in the fixed effects equation; it will show a zero coefficient with a zero standard error.


 

preg vfrall

# beertax

preg(effects=both) vfrall

# beertax
 

This estimates an equation (by fixed effects, which is the default), first allowing for individual effects only, then allowing for both individual and time effects.

Notes

If you estimate using fixed effects, the reported degrees of freedom will be reduced by the number of implied “dummy” variables.

 

For a balanced sample, the coefficient estimates for both fixed and random effects will be identical to those you would get doing the equivalent regression “by hand,” using PANEL to transform the data. The covariance matrix will be slightly different with random effects because of a different estimate of the variance. In an unbalanced sample, the results will be the same for fixed effects, but only for EFFECTS=TIME or EFFECTS=INDIV. Random effects on an unbalanced data set should only be done using PREGRESS.

Sample Output

What's included in the output from PREGRESS depends upon the method used. Fixed effects is a least squares estimator, and so includes various \(R^2\) and other goodness of fit statistics which are appropriate for least squares. This is the output from METHOD=FIXED in the first example above. Note that because the CONSTANT is time-invariant, it is zeroed out by the fixed effects transformation, thus showing the zero coefficient and zero standard error of a redundant regressor. The degrees of freedom and the regression F both take into account the implied dummies.

 

 

Panel Regression - Estimation by Fixed Effects

Dependent Variable INVEST

Panel(20) of Annual Data From      1//1935:01 To     10//1954:01

Usable Observations                       200

Degrees of Freedom                        188

Centered R^2                        0.9440725

R-Bar^2                             0.9408002

Uncentered R^2                      0.9615675

Mean of Dependent Variable       145.95825000

Std Error of Dependent Variable  216.87529623

Standard Error of Estimate        52.76796595

Sum of Squared Residuals         523478.14739

Regression F(11,188)                 288.4996

Significance Level of F             0.0000000

Log Likelihood                     -1070.7810

 

    Variable                        Coeff      Std Error      T-Stat      Signif

************************************************************************************

1.  Constant                     0.0000000000 0.0000000000      0.00000  0.00000000

2.  VALUE                        0.1101238041 0.0118566942      9.28790  0.00000000

3.  CAP                          0.3100653413 0.0173545028     17.86656  0.00000000
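
The header statistics above can be checked by hand. The sketch below assumes 10 individuals, which is what the entry range 1// to 10// in the output indicates, each with 20 annual observations.

```latex
% Degrees of freedom: observations less implied dummies less slope coefficients
200 - 10 - 2 = 188

% Numerator degrees of freedom of the regression F: (10 - 1) dummies plus 2 slopes
(10 - 1) + 2 = 11

% Standard error of estimate from the sum of squared residuals
\sqrt{523478.14739 / 188} \approx 52.768

% R-Bar^2 from the centered R^2
1 - (1 - 0.9440725)\,\frac{199}{188} \approx 0.9408
```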


 

Output from METHOD=RANDOM includes different statistics in the header. Because it's not least squares, \(R^2\) and F aren't included. Instead, it includes the estimated standard deviations of the included components; here, because it's individual effects, these are for the random component (eta_it) and the individual component (labeled mu_i in the output). It also includes the Hausman test for random effects vs fixed. The degrees of freedom for this is the number of time-varying regressors. A significant Hausman test would cause one to doubt the assumption that the (random) individual effects are uncorrelated with the regressors.


 

Panel Regression - Estimation by Random Effects

Dependent Variable INVEST

Panel(20) of Annual Data From      1//1935:01 To     10//1954:01

Usable Observations                       200

Degrees of Freedom                        197

Mean of Dependent Variable       145.95825000

Std Error of Dependent Variable  216.87529623

Standard Error of Estimate        51.57987730

Sum of Squared Residuals         524115.29716

Log Likelihood                     -1095.2570

S.D. (eta_it)                         52.4926

S.D. (mu_i)                           80.2974

Hausman Test(2)                      2.337904

Significance Level                  0.3106924

 

    Variable                        Coeff      Std Error      T-Stat      Signif

************************************************************************************

1.  Constant                     -57.76720616  27.69741560     -2.08565  0.03701004

2.  VALUE                          0.10976265   0.01033843     10.61696  0.00000000

3.  CAP                            0.30794198   0.01707202     18.03782  0.00000000
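
For reference, the usual textbook form of the Hausman statistic is sketched below, along with a check of the significance level reported above. The subscripted symbols are introduced here, and the formula is the standard one rather than a confirmed description of the PREGRESS calculation; only the coefficients on the time-varying regressors (VALUE and CAP) enter.

```latex
% Standard form: FE and RE estimates and their covariance matrices
H = (\hat\beta_{FE} - \hat\beta_{RE})'
    \left( V_{FE} - V_{RE} \right)^{-1}
    (\hat\beta_{FE} - \hat\beta_{RE}) \sim \chi^2(2)

% For 2 degrees of freedom the chi-squared tail probability is exp(-H/2)
P\left(\chi^2(2) > 2.337904\right) = e^{-2.337904/2} \approx 0.3107
```

A significance level of about 0.31 does not reject the random effects assumption at conventional levels.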


 


Copyright © 2025 Thomas A. Doan