SWEEP Instruction

SWEEP( options ) start end

# list of target variables (in Regression Format)

# list of instrument variables (in Regression Format)

Performs a regression of a set of "targets" on a set of "instruments". Note that SWEEP produces no direct output. Instead, use the variables it defines.

Parameters

start, end

Estimation range. By default, this is the maximum range permitted by all variables involved in the regression, including instruments if required.

Options

SMPL=standard SMPL option [unused]

SPREAD=standard SPREAD option [unused]

WEIGHT=standard WEIGHT option [unused]

EQUATION=EQUATION to estimate [unused]

MODEL=MODEL to estimate [unused]

DEPVAR/[NODEPVAR]

Omit the "target" supplementary card if you use either the EQUATION or the MODEL option. Use DEPVAR if you want the dependent variables from the EQUATION or MODEL included as target variables.

INSTRUMENTS/[NOINSTRUMENTS]

Use INSTRUMENTS if you want to use the current list defined by an INSTRUMENTS instruction rather than listing them on a supplementary card. Omit the "instrument" supplementary card if you use this option.

COEFFS=(output) m x n matrix of coefficients

IBETA=(output) VECT[RECT] of coefficients by group

For m instruments and n targets, COEFFS saves the estimated coefficients into an m x n array. With k groups, IBETA creates a k-VECTOR of m x n arrays with the coefficients for each group in a separate element of IBETA.

SERIES=VECT[SERIES] for residuals or fitted values

PRJ=[RESIDUALS]/FITTED

For n targets, SERIES creates an n-element VECTOR of SERIES containing the series of residuals or fitted values (depending on the choice for PRJ) for each target.

CVOUT=Output $\Sigma$ matrix

CVOUT allows you to save the final estimate of the covariance matrix.

GROUP=series or FRML with values on which to group regressions

VARIANCES=[HOMOGENEOUS]/HETEROGENEOUS

AVERAGE=[COUNT]/SIMPLE/PRECISION

If you use GROUP, a separate regression will be run for each unique value in the series or formula. By default (i.e. if GROUP is not used), SWEEP does one regression across the entire sample. The VARIANCES option indicates whether the variances are HOMOGENEOUS (the same across groups) or HETEROGENEOUS (different for each group).

The AVERAGE option indicates how the coefficient vectors for the different groups are combined into %BETA. COUNT weights them by the size of the group, SIMPLE weights each group equally, and PRECISION weights them by the precision of the estimates.
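
The three AVERAGE choices can be illustrated with a small sketch. This is not SWEEP's internal calculation, which is not spelled out here; it is a Python illustration for a single scalar coefficient, assuming that "precision" means the reciprocal of the sampling variance of each group's estimate. The function name `combine` and the sample numbers are hypothetical.

```python
def combine(betas, counts, variances, average="COUNT"):
    """Combine one scalar coefficient estimated separately in each group.

    betas     : per-group coefficient estimates
    counts    : per-group numbers of observations
    variances : per-group sampling variances of the estimates
    """
    if average == "COUNT":
        weights = [float(n) for n in counts]      # weight by group size
    elif average == "SIMPLE":
        weights = [1.0] * len(betas)              # equal weight per group
    elif average == "PRECISION":
        # Assumption: precision is taken to be 1/variance.
        weights = [1.0 / v for v in variances]
    else:
        raise ValueError("average must be COUNT, SIMPLE or PRECISION")
    total = sum(weights)
    return sum(w * b for w, b in zip(weights, betas)) / total


# Hypothetical two-group example.
betas = [1.0, 3.0]
counts = [30, 10]
variances = [0.5, 0.25]

by_count = combine(betas, counts, variances, "COUNT")
by_simple = combine(betas, counts, variances, "SIMPLE")
by_precision = combine(betas, counts, variances, "PRECISION")
```

With a vector of coefficients, the precision-weighted version would use inverse covariance matrices in place of the scalar reciprocals; the scalar case is shown only to keep the weights easy to read.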

Variables Defined

%BETA	averaged vector of regression coefficients (VECTOR)
%XX	covariance matrix of averaged regression coefficients (SYMMETRIC)
%NOBS	number of observations (INTEGER)
%NREG	number of instruments (INTEGER)
%NVAR	number of target variables (INTEGER)
%NGROUP	number of groups (INTEGER)
%NREGSYSTEM	number of regressors in the full system (INTEGER)
%NFREE	number of free coefficients in the full system, including variances/covariance matrices (INTEGER)
%SIGMA	covariance matrix of residuals (SYMMETRIC)
%LOGL	log likelihood (REAL)

Examples

This uses SWEEP to do a variety of tests for a structural break at 1978:4 in a vector autoregression. The first SWEEP has VARIANCES=HETEROGENEOUS (abbreviated VAR=HETERO in the code), so it allows both the coefficients and the variances to change. The second allows neither (there is no GROUP option, so everything is fixed across the sample). The third allows the coefficients to change, but has a single covariance matrix.

%NFREE counts the total number of free parameters in the system, including both regression coefficients and covariance matrices. This model has 21 regressors (3 equations with 7 regressors each) and 6 free parameters in a covariance matrix (with 3 variables), so the fully breaking model has 2 x 21 + 2 x 6 = 54 free parameters, the fully fixed model has 21 + 6 = 27, and the model with breaking coefficients but a fixed covariance matrix has 2 x 21 + 6 = 48. You don't have to compute those yourself, since SWEEP defines %NFREE. The example saves the log likelihood and free parameter count after each SWEEP instruction and uses them to form the likelihood ratio tests.

* Lutkepohl, New Introduction, example from page 608.
* VAR; Test for structural break

open data e1.dat
calendar(q) 1960
data(format=prn,org=columns,skips=6) 1960:01 1982:04 invest income cons

set dinc = log(income/income{1})
set dcons = log(cons/cons{1})
set dinv = log(invest/invest{1})

* This uses SWEEP to do the various regressions with breaks.
* GROUP=(t<=1978:4) breaks the sample up by the two values taken by the
* dummy t<=1978:4. VAR=HETERO vs VAR=HOMO determines whether the
* covariance matrix also is different across groups or is the same.
* SWEEP defines as %NFREE the total number of regression coefficients
* plus the total number of free parameters in the covariance
* matrix(matrices).

* This is hypothesis H1 - coefficients and covariance matrix are freely
* estimated in each subsample.
sweep(group=t<=1978:4,var=hetero)
# dinc dcons dinv
# constant dinc{1 2} dcons{1 2} dinv{1 2}
compute loglh1=%logl,ncoh1=%nfree

* This is hypothesis H2 - everything is fixed across the sample.
sweep
# dinc dcons dinv
# constant dinc{1 2} dcons{1 2} dinv{1 2}
compute loglh2=%logl,ncoh2=%nfree

* This is hypothesis H3 - coefficients change, covariance matrix is fixed.
sweep(group=t<=1978:4,var=homo)
# dinc dcons dinv
# constant dinc{1 2} dcons{1 2} dinv{1 2}
compute loglh3=%logl,ncoh3=%nfree

cdf(title="LR Test for Full Break at 1978:4 vs Fully Fixed") chisqr 2.0*(loglh1-loglh2) ncoh1-ncoh2
cdf(title="LR Test for Full Break at 1978:4 vs Coefficients Only") chisqr 2.0*(loglh1-loglh3) ncoh1-ncoh3
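
The free-parameter counts for the three hypotheses in the structural break example, and the degrees of freedom passed to the two CDF instructions, can be checked by hand. The following Python sketch reproduces that arithmetic only; it does not call RATS, and the function name `free_parameters` is hypothetical.

```python
def free_parameters(n_targets, n_instruments, n_groups=1, heterogeneous=False):
    """Count free parameters as described for %NFREE in the example.

    Each group has n_targets * n_instruments regression coefficients.
    A covariance matrix of n_targets variables is symmetric, so it has
    n_targets * (n_targets + 1) / 2 free elements. With heterogeneous
    variances there is one covariance matrix per group, otherwise one.
    """
    regressors = n_targets * n_instruments
    cov_params = n_targets * (n_targets + 1) // 2
    n_cov = n_groups if heterogeneous else 1
    return n_groups * regressors + n_cov * cov_params


# 3 targets; instruments are a constant plus 2 lags of each of 3 variables.
n_targets = 3
n_instruments = 1 + 2 * 3

h1 = free_parameters(n_targets, n_instruments, n_groups=2, heterogeneous=True)
h2 = free_parameters(n_targets, n_instruments)
h3 = free_parameters(n_targets, n_instruments, n_groups=2)

print(h1, h2, h3)        # 54 27 48
print(h1 - h2, h1 - h3)  # degrees of freedom for the two LR tests: 27 6
```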

This does a panel causality test (for m causing y) allowing for heterogeneity in the coefficients and variances. The first SWEEP is the unrestricted model (including lags of m) and the second excludes the lags of m. This uses %NREGSYSTEM (which counts the regressors only) rather than %NFREE, since the covariance matrices are handled the same way in both regressions. However, there would be no harm in using %NFREE in both locations instead: the variance parameters cancel when the two counts are differenced, so the degrees of freedom are the same.

cal(panelobs=40)
open data panelmoney.xls
data(org=obs,format=xls) 1//1 19//40 realm realy

* Number of lags
compute p=3

set dy = realy-realy{1}
set dm = realm-realm{1}

* Joint test. This first creates a SMPL series which
* will exclude any data points where any of the
* current or lagged values are missing, so both the
* restricted and unrestricted regressions use the
* same entries.
inquire(valid=fullsmpl,reglist)
# dy{0 to p} dm{0 to p}

sweep(group=%indiv(t),smpl=fullsmpl,var=hetero)
# dy
# constant dy{1 to p} dm{1 to p}
compute loglunr=%logl,nregunr=%nregsystem

sweep(group=%indiv(t),smpl=fullsmpl,var=hetero)
# dy
# constant dy{1 to p}
compute loglres=%logl,nregres=%nregsystem

cdf(title="Heterogeneous Panel Causality Test") chisqr $
   2.0*(loglunr-loglres) nregunr-nregres
compute jointtest=%cdstat,jointsignif=%signif
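
As a rough check on the degrees of freedom in the panel causality test, the sketch below works through the regressor counts in Python. It assumes that %NREGSYSTEM equals groups x instruments x targets (by analogy with the counts in the structural break example) and that all 19 individuals in the data range contribute a group. Both are assumptions, and the function name `regressors_in_system` is hypothetical.

```python
def regressors_in_system(n_groups, n_instruments, n_targets=1):
    """Assumed regressor count: one coefficient per group, instrument and target."""
    return n_groups * n_instruments * n_targets


n_individuals = 19   # data range runs from 1//1 to 19//40
p = 3                # number of lags

# Unrestricted: constant, p lags of dy, p lags of dm.
nreg_unr = regressors_in_system(n_individuals, 1 + 2 * p)
# Restricted: constant and p lags of dy only.
nreg_res = regressors_in_system(n_individuals, 1 + p)

# Degrees of freedom: the p excluded lags of dm in each group.
df = nreg_unr - nreg_res
print(nreg_unr, nreg_res, df)   # 133 76 57
```

Under these assumptions the test has one restriction per excluded lag per individual, which is why differencing %NFREE instead would give the same value: the 19 variance parameters appear in both counts.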