PREGRESS Instruction

PREGRESS(options) depvar start end resids

# list of explanatory variables in Regression Format

Estimates a panel data regression, incorporating either fixed or random effects.

Wizard

You can use the Statistics—Panel Data Regressions Wizard to do panel regressions.

Parameters

depvar	dependent variable
start, end	range to estimate, defaults to maximum range permitted by all variables involved in the regression, including instruments if required.
resids	residuals series

Description

This estimates \(\beta \) in the linear regression

(1) \({y_{it}} = {X_{it}}\beta + {u_{it}}\), where

(2) \({u_{it}} = {\varepsilon _i} + {\lambda _t} + {\eta _{it}}\), unless METHOD=SUR

\(\varepsilon\) is the individual effect, \(\lambda\) is the time effect and \(\eta\) the purely random effect. If you use the option EFFECTS=INDIV, or METHOD=FD, the decomposition only includes the \(\varepsilon\) and \(\eta\) components. With EFFECTS=TIME, it only includes \(\lambda\) and \(\eta\).

METHOD=POOLED just estimates (1) by least squares, with no panel effects. METHOD=BETWEEN estimates (1) by least squares on individual averages. If you use METHOD=FIXED, \(\varepsilon_i\) and \(\lambda_t\) are treated as constants and are “swept” out. With METHOD=RANDOM, they are treated as part of the error term and \(\beta \) is estimated by GLS. If METHOD=FD (first difference), the data are differenced to eliminate \(\varepsilon_i\). METHOD=SUR assumes that the \({u}\)’s are serially uncorrelated, but are correlated across i at a given t.
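As a sketch of what these methods do to equation (1) when only individual effects are present (EFFECTS=INDIV), the standard textbook transformations are shown below. Here \(\bar y_{i\cdot}\), \(\bar X_{i\cdot}\) and \(\bar \eta_{i\cdot}\) denote averages over time for individual i. This notation is introduced for illustration only, and PREGRESS may organize its internal computations differently.

```latex
\begin{aligned}
\text{FIXED:}\quad & y_{it} - \bar y_{i\cdot} = (X_{it} - \bar X_{i\cdot})\beta + (\eta_{it} - \bar\eta_{i\cdot}) \\
\text{BETWEEN:}\quad & \bar y_{i\cdot} = \bar X_{i\cdot}\beta + \varepsilon_i + \bar\eta_{i\cdot} \\
\text{FD:}\quad & y_{it} - y_{i,t-1} = (X_{it} - X_{i,t-1})\beta + (\eta_{it} - \eta_{i,t-1})
\end{aligned}
```

In the first and third lines \(\varepsilon_i\) drops out because it does not vary over t.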

For random effects estimation, you can input the variances of the components yourself using the VRANDOM, VINDIV and VTIME options, or you can allow PREGRESS to estimate them. There are many ways to estimate these variances consistently; most of the commonly used choices can be implemented through a combination of the VCOMP and CORRECTION options.
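For intuition about why the component variances matter, consider the textbook case of a balanced panel with T time periods and individual effects only. There, GLS is equivalent to least squares on partially demeaned data. The weight \(\theta\) and the time averages \(\bar y_{i\cdot}\), \(\bar X_{i\cdot}\) are notation introduced here for illustration; this is a sketch of the standard result, not a statement of how PREGRESS computes the estimator.

```latex
\begin{aligned}
\theta &= 1 - \sqrt{\frac{\sigma_\eta^2}{\sigma_\eta^2 + T\sigma_\varepsilon^2}} \\
y_{it} - \theta\,\bar y_{i\cdot} &= (X_{it} - \theta\,\bar X_{i\cdot})\beta + \text{error}
\end{aligned}
```

With \(\sigma_\varepsilon^2 = 0\), \(\theta = 0\) and this reduces to pooled least squares. As \(T\sigma_\varepsilon^2\) grows relative to \(\sigma_\eta^2\), \(\theta\) approaches 1 and the estimator approaches fixed effects.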

Options

Standard Regression Options

EFFECTS=[INDIVIDUAL]/TIME/BOTH

This indicates whether to allow for INDIVIDUAL effects, TIME effects or BOTH.

METHOD=[FIXEDEFFECTS]/RANDOMEFFECTS/FD/SUR/BETWEEN/POOLED

This chooses between fixed effects, random effects, first-difference, cross-section SUR, "between" estimators, and pooled panel regressions.

GROUP=SERIES or FRML with values defining individuals

This is an alternative to setting up the data as a panel: the SERIES or FRML identifies the individual to which each entry belongs. If you use GROUP, you can only do EFFECTS=INDIV.
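A minimal sketch of the GROUP option follows. It assumes the investment data from the Examples section have been read without a panel setup, and that a series named FIRMID (a hypothetical name, not defined anywhere in this document) holds an identifying value for each firm.

```rats
* FIRMID is a hypothetical series identifying the individual for each entry
preg(method=fixed,group=firmid) invest
# constant value cap
```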

VRANDOM=variance of the random component [estimated]

VINDIV=variance of the individual component [estimated]

VTIME=variance of the time component [estimated]

VCOMP=[WK]/SA/WH/ML/GREENE/WOOLDRIDGE

CORRECTION=[FULL]/DEGREES/NONE

You can input the component variances for random effects using VRANDOM, VINDIV and (if necessary) VTIME. If you don’t use these, PREGRESS will estimate them, using the algorithm selected by the VCOMP and CORRECTION options. With the exception of VCOMP=ML (maximum likelihood), each method solves a set of equations based on quadratic forms in residuals; the methods differ in which estimators supply the residuals, which quadratic forms are used, and how much detail goes into the multipliers in the equations. All of them give consistent estimators. VCOMP=WK (the default) is Wansbeek-Kapteyn, VCOMP=SA is Swamy-Arora, VCOMP=WH is Wallace-Hussain, and VCOMP=ML is maximum likelihood. VCOMP=GREENE and VCOMP=WOOLDRIDGE are (simpler) estimators proposed in the Greene (2012) and Wooldridge (2010) textbooks. The CORRECTION option controls the level of detail used in computing the coefficients in the quadratic forms.
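The two routes described above might look like the following, assuming the panel data set from the Examples section is already loaded. The numeric variances in the second instruction are illustrative placeholders only, not estimates taken from any data.

```rats
* Let PREGRESS estimate the components: Swamy-Arora with the
* degrees-of-freedom level of correction
preg(method=random,vcomp=sa,correction=degrees) invest
# constant value cap
* Supply the component variances directly (placeholder values)
preg(method=random,vrandom=2750.0,vindiv=6400.0) invest
# constant value cap
```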

INDIV=(output) series of individual effects [not used]

TIME=(output) series of time effects [not used]

With METHOD=FIXED or METHOD=RANDOM, these allow you to retrieve the coefficients on the individual or time components; whichever ones are estimated based upon your choice for the EFFECTS option. These are produced to match up with the entries on the original data, so, for instance, the output values for the individual effects will be repeated for each time period within each individual’s block of entries. If you want to compress out the duplicates, you can use the PANEL instruction with the COMPRESS option.
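As a sketch, the following saves the estimated individual effects from a fixed effects regression, assuming the panel data set from the Examples section is loaded. FIRMFX is a hypothetical series name chosen for illustration.

```rats
* FIRMFX (hypothetical name) receives the individual effects,
* repeated across the time periods within each individual
preg(method=fixed,indiv=firmfx) invest
# constant value cap
```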

INSTRUMENTS/[NOINSTRUMENTS]

Use the INSTRUMENTS option to do instrumental variables. You must set your instruments list first using the instruction INSTRUMENTS. This can be used with any of the METHOD options except SUR.
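A minimal sketch of the two-step setup follows, assuming the panel data set from the Examples section is loaded. The instrument list (which uses the first lag of VALUE) is chosen only to show the syntax and is not a recommendation for this model.

```rats
* Define the instrument list first, then request instrumental variables
instruments constant value{1} cap
preg(method=fixed,instruments) invest
# constant value cap
```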

ROBUSTERRORS/[NOROBUSTERRORS]

CLUSTER=SERIES with category values for clustered calculation

The combination of ROBUSTERRORS and CLUSTER allows the calculation of coefficient standard errors which are robust to arbitrary correlation within groups defined by the CLUSTER expression. This can be applied to any choice of METHOD except SUR. CLUSTER=%INDIV(T) (or LWINDOW=PANEL) would be used for standard errors clustered by individuals.
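For instance, standard errors clustered by individual could be requested as follows, assuming the panel data set from the Examples section is loaded:

```rats
* Cluster by individual using the %INDIV(T) function described above
preg(method=fixed,robusterrors,cluster=%indiv(t)) invest
# constant value cap
```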

ROBUSTERRORS/[NOROBUSTERRORS]

LAGS=correlated lags [0]

LWINDOW=NEWEYWEST/BARTLETT/DAMPED/PARZEN/QUADRATIC/[FLAT]/PANEL/WHITE

LWFORM=VECTOR with the window form [not used]

DAMP=value of \(\gamma\) for LWINDOW=DAMPED [0.0]

These can be applied to any estimation method other than METHOD=SUR to permit calculation of a consistent covariance matrix allowing for heteroscedasticity (with ROBUSTERRORS) or serial correlation within each individual (with LAGS).
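As a sketch, the following combines these options to allow for heteroscedasticity and serial correlation within each individual, assuming the panel data set from the Examples section is loaded. The choice of two lags and the Newey-West window is illustrative only.

```rats
* Robust covariance matrix with 2 correlated lags (illustrative choice)
preg(method=fixed,robusterrors,lags=2,lwindow=neweywest) invest
# constant value cap
```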

HAUSMAN/NOHAUSMAN

If you do METHOD=RANDOM, the HAUSMAN option requests that a Hausman test (for random vs fixed effects) be done. Computing this requires the fixed effects estimate. Because VCOMP=WK and VCOMP=SA each do a fixed effects regression anyway, PREGRESS will do the Hausman test automatically if you choose one of those. The HAUSMAN option is necessary only if you want the Hausman test and you are using a different choice for the component variances.
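For reference, the textbook form of the Hausman statistic is sketched below. The subscripts FE and RE (fixed and random effects) and the symbols \(\hat V\) for the estimated coefficient covariance matrices are notation introduced here; the comparison is over the coefficients on the time-varying regressors. This document does not spell out the exact computation PREGRESS uses, so treat this as the standard definition rather than a description of the implementation.

```latex
H = (\hat\beta_{FE} - \hat\beta_{RE})'
    \left[ \hat V_{FE} - \hat V_{RE} \right]^{-1}
    (\hat\beta_{FE} - \hat\beta_{RE})
```

Under the null hypothesis that the random effects are uncorrelated with the regressors, H is asymptotically chi-squared with degrees of freedom equal to the number of coefficients compared.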

Variables Defined

Regression Variables

%VRANDOM	variance of \(\eta\), the random component (REAL)
%VINDIV	variance of \(\varepsilon\), the individual component (REAL)
%VTIME	variance of \(\lambda\), the time component (REAL)
%NGROUP	number of individuals or groups (INTEGER)
%SIGMA	covariance matrix for METHOD=SUR (SYMMETRIC)
%NFREE	number of free parameters. This includes any fixed effects coefficients that aren't reported directly and variance or covariance matrix parameters (INTEGER)
%CDSTAT	Hausman test statistic (if computed) (REAL)
%SIGNIF	Significance of Hausman test (if computed) (REAL)
%NDFTEST	Degrees of freedom of Hausman test (if computed) (REAL)
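A short sketch of retrieving these variables after estimation follows, assuming the panel data set from the Examples section is loaded. HAUSMAN is specified explicitly because VCOMP=WH does not trigger the test on its own.

```rats
preg(method=random,vcomp=wh,hausman) invest
# constant value cap
* Show the estimated component variances and the Hausman test results
display "Var(eta)" %vrandom "Var(individual)" %vindiv
display "Hausman" %cdstat "Signif" %signif "D.F." %ndftest
```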

Technical Information

The different choices for VCOMP use different quadratic forms in residuals to estimate the component variances. VCOMP=WK (Wansbeek-Kapteyn) estimates fixed effects and uses its residuals in several ways. VCOMP=SA (Swamy-Arora) uses residuals from fixed effects and “between” estimators. VCOMP=WH (Wallace-Hussain) uses the OLS residuals in several ways. See Baltagi (2008) or the Panel Data e-course for more detail.

To demonstrate how the CORRECTION option works, we’ll look at part of the calculation for VCOMP=WH. The OLS residuals will be

\(\left( {{\bf{I}} - {\bf{X}}{{({\bf{X'X}})}^{ - 1}}{\bf{X'}}} \right){\bf{u}}\)

where \({\bf{u}}\) is the vector of true residuals. The expected value of the sum of squared residuals will be

(3) \(E\,{\bf{u'}}\left( {{\bf{I}} - {\bf{X}}{{({\bf{X'X}})}^{ - 1}}{\bf{X'}}} \right)\left( {{\bf{I}} - {\bf{X}}{{({\bf{X'X}})}^{ - 1}}{\bf{X'}}} \right){\bf{u}} = \mathrm{trace}\left[ {(E{\bf{uu'}})\left( {{\bf{I}} - {\bf{X}}{{({\bf{X'X}})}^{ - 1}}{\bf{X'}}} \right)} \right]\)

\(E{\bf{uu'}}\) will be a matrix with elements which are linear combinations of the component variances. CORRECTION=NONE ignores the \({\bf{X}}\) matrices (in effect, acting as if the OLS residuals are the true residuals) and uses as one of its conditions that the sum of squared OLS residuals is equal to \(\mathrm{trace}(E{\bf{uu'}})\). CORRECTION=DEGREES uses a correction based on the number of regressors in \({\bf{X}}\) rather than their specific values. CORRECTION=FULL computes the exact correction using the structure of \(E{\bf{uu'}}\) and \({\bf{X}}\). (This can be rearranged using the properties of the trace to avoid multiplying the potentially very large matrices in (3).)
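The rearrangement mentioned above can be sketched as follows, writing \(\Omega = E{\bf{uu'}}\), n for the number of observations and k for the number of regressors (notation introduced here for illustration). It relies only on the linearity and cyclic property of the trace and is not a description of PREGRESS's actual code.

```latex
\begin{aligned}
\mathrm{trace}\left[ \Omega \left( I - X(X'X)^{-1}X' \right) \right]
  &= \mathrm{trace}(\Omega) - \mathrm{trace}\left[ \Omega X (X'X)^{-1} X' \right] \\
  &= \mathrm{trace}(\Omega) - \mathrm{trace}\left[ (X'X)^{-1} X' \Omega X \right]
\end{aligned}
```

The matrix inside the final trace is only k by k, so the n by n projection matrix never has to be formed. In the special case \(\Omega = \sigma^2 I\), the whole expression equals \(\sigma^2 (n - k)\), the familiar degrees-of-freedom correction.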

Examples

preg(method=between) invest

# constant value cap

preg(method=fixed) invest

# constant value cap

preg(method=random,vcomp=wh) invest

# constant value cap

preg(method=random,vcomp=wk) invest

# constant value cap

preg(method=random,vcomp=sa) invest

# constant value cap

preg(method=random,vcomp=ml) invest

# constant value cap

This estimates an equation, allowing for individual effects only, using the between estimator, fixed and random effects, done with several variance components estimators. Note that we left the CONSTANT in the fixed effects equation. This will show a zero coefficient with zero standard error.

preg vfrall

# beertax

preg(effects=both) vfrall

# beertax

This estimates an equation (by fixed effects, which is the default), first allowing for individual effects only, then allowing for both individual and time effects.

Notes

If you estimate using fixed effects, the reported degrees of freedom will be reduced by the number of implied “dummy” variables.
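As a worked check against the fixed effects output shown under Sample Output, that regression has 200 usable observations, 2 slope coefficients (VALUE and CAP) and, reading the panel range as covering 10 individuals, 10 implied individual intercepts:

```latex
\begin{aligned}
\text{degrees of freedom} &= 200 - 2 - 10 = 188 \\
\text{numerator d.f. of the regression } F &= 2 + (10 - 1) = 11
\end{aligned}
```

These agree with the reported Degrees of Freedom 188 and Regression F(11,188).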

For a balanced sample, the coefficient estimates for both fixed and random effects will be identical to those you would get doing the equivalent regression “by hand,” using PANEL to transform the data. The covariance matrix will be slightly different with random effects because of a different estimate of the variance. In an unbalanced sample, the results will be the same for fixed effects, but only for EFFECTS=TIME or EFFECTS=INDIV. Random effects on an unbalanced data set should only be done using PREGRESS.

Sample Output

What's included in the output from PREGRESS depends upon the method used. Fixed effects is a least squares estimator, and so includes various \(R^2\) and other goodness of fit statistics which are appropriate for least squares. This is the output from METHOD=FIXED in the first example above. Note that because the CONSTANT is time-invariant, it is zeroed out by the fixed effects transformation, thus showing the zero coefficient and zero standard errors of a redundant regressor. The degrees of freedom and the regression F both take into account the implied dummies.

Panel Regression - Estimation by Fixed Effects

Dependent Variable INVEST

Panel(20) of Annual Data From 1//1935:01 To 10//1954:01

Usable Observations 200

Degrees of Freedom 188

Centered R^2 0.9440725

R-Bar^2 0.9408002

Uncentered R^2 0.9615675

Mean of Dependent Variable 145.95825000

Std Error of Dependent Variable 216.87529623

Standard Error of Estimate 52.76796595

Sum of Squared Residuals 523478.14739

Regression F(11,188) 288.4996

Significance Level of F 0.0000000

Log Likelihood -1070.7810

Variable Coeff Std Error T-Stat Signif

************************************************************************************

1. Constant 0.0000000000 0.0000000000 0.00000 0.00000000

2. VALUE 0.1101238041 0.0118566942 9.28790 0.00000000

3. CAP 0.3100653413 0.0173545028 17.86656 0.00000000

Output from METHOD=RANDOM includes different statistics in the header. Because it's not least squares, \(R^2\) and F aren't included. Instead, it includes the estimated standard deviations of the included components; here, because it's individual effects, it's for the random and individual component. It also includes the Hausman test for random effects vs fixed. The degrees of freedom for this is the number of time-varying regressors. A significant Hausman test would cause one to doubt the assumption that the (random) individual effects are uncorrelated with the regressors.

Panel Regression - Estimation by Random Effects

Dependent Variable INVEST

Panel(20) of Annual Data From 1//1935:01 To 10//1954:01

Usable Observations 200

Degrees of Freedom 197

Mean of Dependent Variable 145.95825000

Std Error of Dependent Variable 216.87529623

Standard Error of Estimate 51.57987730

Sum of Squared Residuals 524115.29716

Log Likelihood -1095.2570

S.D. (eta_it) 52.4926

S.D. (mu_i) 80.2974

Hausman Test(2) 2.337904

Significance Level 0.3106924

Variable Coeff Std Error T-Stat Signif

************************************************************************************

1. Constant -57.76720616 27.69741560 -2.08565 0.03701004

2. VALUE 0.10976265 0.01033843 10.61696 0.00000000

3. CAP 0.30794198 0.01707202 18.03782 0.00000000
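The reported significance level of the Hausman test can be checked by hand. For a chi-squared distribution with 2 degrees of freedom, the upper tail probability has the closed form \(e^{-x/2}\), so with the statistic from the output above:

```latex
P\left( \chi^2_2 > 2.337904 \right) = e^{-2.337904/2} = e^{-1.168952} \approx 0.3107
```

This matches the reported Significance Level of 0.3106924, so the test does not reject the random effects specification at conventional levels.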