PANEL Instruction
PANEL( options ) series start end newseries newstart
PANEL computes any of several transformations needed when working with panel or otherwise grouped data series. You can use it to implement panel-data regressions that can't be done with PREGRESS.
You can also use the SPREAD option to compute an individual-by-individual variance series, which can be used in the SPREAD option of LINREG for weighted least squares.
PANEL has an older syntax which is still supported; it uses extra parameters (rather than options) to describe the desired transformation. See "Older Syntax" below.
Parameters
series | source series
start, end | range to transform; by default, the range of series
newseries | result series; by default, series
newstart | start entry for newseries; by default, the same as the start of series
Options
ENTRY=Weight on value of series [0.0]
INDIV=Weight on the individual mean [0.0]
TIME=Weight on the time mean [0.0]
ICOUNT=Weight on the number of individuals [0.0]
TCOUNT=Weight on the number of time periods [0.0]
ISUM=Weight on the individual sum [0.0]
TSUM=Weight on the time sum [0.0]
These supply the weights on the various components. See the "Description" below for details.
EFFECTS=[INDIVIDUAL]/TIME/BOTH
This indicates whether to allow for INDIVIDUAL effects, TIME effects, or BOTH. This applies when you use the GLS and related options or the DUMMIES option.
GROUP=SERIES or FRML with values defining individuals
This is an alternative to a panel data setup for data. This defines the individuals. If you use GROUP, you can’t do any calculations which require identifying specific time periods (TIME, TCOUNT or TSUM components, EFFECTS=TIME or EFFECTS=BOTH).
VRANDOM=(input) variance of the random component
VINDIV=(input) variance of the individual component
VTIME=(input) variance of the time component
GLS=[STANDARD]/FORWARDS/BACKWARDS
If you’re doing a GLS transformation for random effects, using options like INDIV and TIME requires you to take the estimated component variances and convert them into the proper linear combination of entries and averages. And a single linear combination will only work if your data set is a balanced panel. As an alternative, you can input the component variances using the VRANDOM, VINDIV and (if necessary) VTIME options and let PANEL do the calculations.
There are several ways to "factor" the random effects covariance matrix to get this transformation. Which one is used is controlled by the GLS option. GLS=STANDARD does the standard symmetrical transformation using individual and time means; it can be used with any choice for EFFECTS. The other two choices for GLS can be used only with EFFECTS=INDIVIDUAL. GLS=FORWARDS does a transformation using only "forward" calculations of means, while GLS=BACKWARDS uses only "backward" means. The forward mean at t for individual i is the mean using only x(i,s) for s>=t; the backward mean uses only s<=t.
The choice for the GLS option won’t affect the results if you use the transformed data in a standard LINREG instruction. It will matter if you are using it as an input to (for instance) an instrumental variables estimator.
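For example, assuming SIGEPS and SIGIND hold previously estimated variances for the random and individual components (the series and variance names here are illustrative), the following transforms LNRXRATE for a random effects GLS regression:
panel(gls=standard,effects=indiv,vrandom=sigeps,vindiv=sigind) lnrxrate / glsxrate
The same PANEL options would be applied to each regressor before running LINREG on the transformed data.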
KR=across time periods covariance matrix of residuals [not used]
This is used to do the Keane-Runkle (1992) transformation, which is a transformation within each individual using a forwards factorization of a general covariance matrix across time periods. This requires a balanced panel data set.
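For example, if OMEGA is a \(T \times T\) covariance matrix of the residuals across time periods, estimated elsewhere (the matrix and series names here are illustrative), the following applies the Keane-Runkle transformation to the series Y:
panel(kr=omega) y / ykr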
SMPL=Standard SMPL option [unused]
COMPRESS/[NOCOMPRESS]
If the transformation uses only individual statistics, or only time statistics, you can use COMPRESS. It eliminates the repetitions, creating a series of length N for individual statistics or length T for time statistics (rather than length \(N \times T\)).
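For example (the output series name is illustrative),
panel(indiv=1.0,compress) lnrxrate / xratemean
creates XRATEMEAN as a series of length N with one entry per individual: the individual mean of LNRXRATE.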
ID=VECTOR of (sorted) individual or group values [not used]
IDENTRIES=VECTOR[VECTOR[INTEGER]] with the list of entries for each individual in ID [not used]
If you have to do any calculation over the entries covered by an individual that is more complicated than the means and variances done with the other options of PANEL, you will need a simple way to organize those calculations. If you have a balanced panel data set, that’s fairly easy. It’s harder if you have an unbalanced panel data set or general grouped data. For those situations, you can use the ID and IDENTRIES options. ID returns a VECTOR of values of the grouping variable (in sorted order), while IDENTRIES provides (for each value in ID) the corresponding set of entry numbers. These would be used something like the example below, which walks through the individuals (the I loop) and then the entries for that individual (the J loop, with IT being the entry number).
panel(group=p_cusip,id=vid,identries=identries)
do i=1,%size(vid)
   do j=1,%size(identries(i))
      compute it=identries(i)(j)
      ...
   end do j
end do i
DUMMIES=VECT[SERIES] of dummies [not used]
This generates a VECT[SERIES] of dummy variables for EFFECTS=INDIV or EFFECTS=TIME. (If you need both, do two separate PANEL instructions.)
SPREAD=(output) series of individual variances [unused]
With SPREAD, PANEL computes a “SPREAD” series by setting each entry equal to the sample variance of series for the entries of its cross-section. You should do this separately from other transformations. This SPREAD series is in a form directly usable in a SPREAD option for LINREG. Note that this computes a centered variance.
With \({y_{it}},i = 1, \ldots ,N;t = 1, \ldots ,T\) representing series, for entry it:
ENTRY | \({y_{it}}\)
INDIV | \({y_{i \bullet }}\), the mean of y for individual i, averaged across t
TIME | \({y_{ \bullet t}}\), the mean of y for time t, averaged across i
MEAN | the mean across all entries
ISUM | the sum across t of y for individual i
TSUM | the sum across i of y for time period t
ICOUNT | the number of valid observations (time periods) for the current individual; that is, for \({y_{it}}\), ICOUNT returns the count of valid observations across all t for individual i.
TCOUNT | the number of valid observations across individuals for the current time period; that is, for \({y_{it}}\), TCOUNT returns the count of valid observations across all i at time period t.
For example, you would use the following options to create the indicated series:
For \({y_{it}} - {y_{i \bullet }}\), use options ENTRY=1.0,INDIV=-1.0
For \({y_{it}} - \theta {y_{ \bullet t}}\), use options ENTRY=1.0,TIME=-THETA
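As a concrete instruction, the first of these (the "within" or fixed effects transformation) would be
panel(entry=1.0,indiv=-1.0) lnrxrate / wxrate
which sets each entry of WXRATE to the deviation of LNRXRATE from its individual mean. (WXRATE is an illustrative name.)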
Variables Defined
%NGROUP | number of individuals or groups used in the computation that actually contain data. Does not include individuals/groups where all time periods are empty (missing values). (INTEGER)
Missing Values
PANEL removes any missing values from any average it calculates.
Examples
panel(entry=1.0,time=-1.0,smpl=oecd) lnrxrate / cxrate
computes the deviations from time period means for LNRXRATE, using only the entries for OECD countries.
panel(effects=time,dummies=tdummies) constant
creates TDUMMIES as a set of time period dummies.
panel(group=townid,icount=1.0) %resids / bcount
dofor ss = mv crim zn indus chas nox rm age dis rad tax ptratio b lstat
panel(group=townid,indiv=1.0) ss / %s("p_"+%l(ss))
end dofor ss
This creates BCOUNT as a series of counts of (valid) data points for each individual, and P_MV to P_LSTAT as the individual averages of a group of data series. All these use the TOWNID series to identify the individuals.
linreg(robust) lgaspcar
# lincomep lrpmg constant idummies
*
panel(spread=countryvar) %resids
linreg(spread=countryvar) lgaspcar
# lincomep lrpmg constant idummies
This does a two-step feasible weighted least squares allowing for each individual to have a different variance.
panel(indiv=1.0) kids / kidsbar
panel(indiv=1.0) lhinc / lhincbar
*
* Chamberlain's pooled probit
*
ddv(dist=probit) lfp
# kids lhinc kidsbar lhincbar constant per1 per2 per3 per4 educ black age agesq
This uses PANEL to get the individual averages of the KIDS and LHINC series.
PANEL( options ) series start end newseries newstart weight pairs
Old Parameters for Transformation
weight pairs | These are an older, but still supported, way to enter the transformation. An example is
ENTRY 1.0 INDIV -1.0
A weight pair consists of a keyword (the same as the option names above) followed by a value. You can include as many pairs as you want on a single instruction.
Copyright © 2025 Thomas A. Doan