SWEEP Instruction
SWEEP( options ) start end
# list of target variables (in Regression Format)
# list of instrument variables (in Regression Format)
Performs a regression of a set of "targets" on a set of "instruments". Note that SWEEP produces no direct output. Instead, use the variables it defines.
Parameters
start, end
Estimation range. By default, the maximum range permitted by all variables involved in the regression, including instruments if required.
Options
SMPL=standard SMPL option [unused]
SPREAD=standard SPREAD option [unused]
WEIGHT=standard WEIGHT option [unused]
EQUATION=EQUATION to estimate [unused]
MODEL=MODEL to estimate [unused]
DEPVAR/[NODEPVAR]
If you use either the EQUATION or the MODEL option, omit the "target" supplementary card. Use DEPVAR if you want the dependent variables from the EQUATION or MODEL included as target variables.
INSTRUMENTS/[NOINSTRUMENTS]
Use INSTRUMENTS if you want to use the current list defined by an INSTRUMENTS instruction rather than listing them on a supplementary card. Omit the "instrument" supplementary card if you use this option.
COEFFS=(output) m x n matrix of coefficients
IBETA=(output) VECT[RECT] of coefficients by group
For m instruments and n targets, COEFFS saves the estimated coefficients into an m x n array. With k groups, IBETA creates a k-VECTOR of m x n arrays with the coefficients for each group in a separate element of IBETA.
SERIES=VECT[SERIES] for residuals or fitted values
PRJ=[RESIDUALS]/FITTED
For n targets, SERIES creates an n-element VECTOR of SERIES containing the series of residuals or fitted values (depending on the choice for PRJ) for each target.
CVOUT=(output) Σ matrix
CVOUT allows you to save the final estimate of the covariance matrix.
GROUP=series or FRML with values on which to group regressions
VARIANCES=[HOMOGENEOUS]/HETEROGENEOUS
AVERAGE=[COUNT]/SIMPLE/PRECISION
If you use GROUP, a separate regression is run for each unique value of the series or formula. By default (that is, if GROUP is not used), SWEEP runs one regression across the entire sample. The VARIANCES option indicates whether the variances are HOMOGENEOUS (the same across groups) or HETEROGENEOUS (allowed to differ across groups).
The AVERAGE option indicates how the coefficient vectors for the different groups are combined. COUNT weights them by the size of the group, SIMPLE weights each group equally, PRECISION weights by the precision of the estimates.
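As a rough illustration of the three AVERAGE choices, the following Python sketch combines per-group estimates of a single coefficient. It is not RATS code and does not reproduce SWEEP's internals: the function name `average_coefficient`, the one-coefficient setting, the sample numbers, and the reading of PRECISION as inverse-variance weighting are all assumptions made for the example.

```python
def average_coefficient(betas, counts, variances, method="count"):
    """Combine per-group estimates of one coefficient.

    betas     -- coefficient estimate from each group
    counts    -- number of observations in each group
    variances -- sampling variance of each group's estimate
    method    -- "count", "simple" or "precision" (mirrors AVERAGE=...)
    """
    if method == "count":
        # Weight each group by its number of observations
        weights = [float(n) for n in counts]
    elif method == "simple":
        # Every group gets the same weight
        weights = [1.0] * len(betas)
    elif method == "precision":
        # Assumption: precision is the reciprocal of the sampling variance
        weights = [1.0 / v for v in variances]
    else:
        raise ValueError("unknown method: " + method)
    total = sum(weights)
    return sum(w * b for w, b in zip(weights, betas)) / total


# Hypothetical two-group input: a large, noisy group and a small, precise one
betas = [1.0, 3.0]
counts = [30, 10]
variances = [0.5, 0.25]

count_avg = average_coefficient(betas, counts, variances, "count")
simple_avg = average_coefficient(betas, counts, variances, "simple")
precision_avg = average_coefficient(betas, counts, variances, "precision")
```

With these made-up inputs, COUNT pulls the average toward the larger group, while PRECISION pulls it toward the group with the smaller variance.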
Variables Defined
%BETA
averaged vector of regression coefficients (VECTOR)
%XX
covariance matrix of averaged regression coefficients (SYMMETRIC)
%NOBS
number of observations (INTEGER)
%NREG
number of instruments (INTEGER)
%NVAR
number of target variables (INTEGER)
%NGROUP
number of groups (INTEGER)
%NREGSYSTEM
number of regressors in the full system (INTEGER)
%NFREE
number of free coefficients in the full system, including variances/covariance matrices (INTEGER)
%SIGMA
covariance matrix of residuals (SYMMETRIC)
%LOGL
log likelihood (REAL)
Examples
This uses SWEEP to do a variety of tests for a structural break at 1978:4 in a vector autoregression. The first SWEEP has VARIANCES=HETERO, so it allows both the coefficients and the variances to change. The second allows neither (no GROUP option, so everything is fixed across the sample), while the third allows the coefficients to change but has a single covariance matrix.
%NFREE counts the total number of free parameters in the system, including both regression coefficients and covariance matrices. This model has 21 regression coefficients (3 equations with 7 regressors each) and 6 free parameters in a covariance matrix (with 3 variables), so the fully breaking model has 2 x 21 + 2 x 6 = 54 free parameters, the fully fixed model has 21 + 6 = 27, and the model with breaking coefficients but a fixed covariance matrix has 2 x 21 + 6 = 48. You don't have to compute those yourself, since SWEEP defines %NFREE. The example saves the log likelihood and free parameter count after each SWEEP instruction and uses them to form the likelihood ratio tests.
*
* Lutkepohl, New Introduction, example from page 608.
* VAR; Test for structural break
*
open data e1.dat
calendar(q) 1960
data(format=prn,org=columns,skips=6) 1960:01 1982:04 invest income cons
*
set dinc = log(income/income{1})
set dcons = log(cons/cons{1})
set dinv = log(invest/invest{1})
*
* This uses SWEEP to do the various regressions with breaks.
* GROUP=(t<=1978:4) breaks the sample up by the two values taken by the
* dummy t<=1978:4. VAR=HETERO vs VAR=HOMO determines whether the
* covariance matrix also is different across groups or is the same.
*
* SWEEP defines as %NFREE the total number of regression coefficients
* plus the total number of free parameters in the covariance
* matrix(matrices).
*
* This is hypothesis H1 - coefficients and covariance matrix are freely
* estimated in each subsample.
*
sweep(group=t<=1978:4,var=hetero)
# dinc dcons dinv
# constant dinc{1 2} dcons{1 2} dinv{1 2}
compute loglh1=%logl,ncoh1=%nfree
*
* This is hypothesis H2 - everything is fixed across the sample.
*
sweep
# dinc dcons dinv
# constant dinc{1 2} dcons{1 2} dinv{1 2}
compute loglh2=%logl,ncoh2=%nfree
*
* This is hypothesis H3 - coefficients change, covariance matrix is fixed.
*
sweep(group=t<=1978:4,var=homo)
# dinc dcons dinv
# constant dinc{1 2} dcons{1 2} dinv{1 2}
compute loglh3=%logl,ncoh3=%nfree
*
cdf(title="LR Test for Full Break at 1978:4 vs Fully Fixed") chisqr 2.0*(loglh1-loglh2) ncoh1-ncoh2
cdf(title="LR Test for Full Break at 1978:4 vs Coefficients Only") chisqr 2.0*(loglh1-loglh3) ncoh1-ncoh3
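The parameter counts quoted for this example can be checked by hand. The Python sketch below simply redoes that arithmetic; it is not RATS code, and the helper name `sweep_free_parameters` and its arguments are hypothetical, introduced only to mirror the three hypotheses.

```python
def sweep_free_parameters(n_targets, n_instruments, n_groups=1,
                          coeffs_break=False, cov_breaks=False):
    """Return (coefficient count, total free parameter count).

    n_targets     -- number of target (dependent) variables
    n_instruments -- number of regressors per equation
    n_groups      -- number of subsamples defined by the grouping
    coeffs_break  -- True if each group has its own coefficients
    cov_breaks    -- True if each group has its own covariance matrix
    """
    coeff_sets = n_groups if coeffs_break else 1
    cov_sets = n_groups if cov_breaks else 1
    n_coeffs = coeff_sets * n_targets * n_instruments
    # A symmetric n x n covariance matrix has n(n+1)/2 free elements
    n_cov = n_targets * (n_targets + 1) // 2
    return n_coeffs, n_coeffs + cov_sets * n_cov


# 3 targets; constant plus 2 lags of each of 3 variables = 7 regressors
h1 = sweep_free_parameters(3, 7, n_groups=2, coeffs_break=True, cov_breaks=True)
h2 = sweep_free_parameters(3, 7)
h3 = sweep_free_parameters(3, 7, n_groups=2, coeffs_break=True)

# Degrees of freedom of the two likelihood ratio tests
df_h1_vs_h2 = h1[1] - h2[1]
df_h1_vs_h3 = h1[1] - h3[1]
```

Under these counts the two tests have 27 and 6 degrees of freedom respectively, which is what the differences `ncoh1-ncoh2` and `ncoh1-ncoh3` should deliver if %NFREE follows the description given above.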
This does a panel causality test (for m causing y) allowing heterogeneity in the coefficients and variances. The first SWEEP is the unrestricted model (including lags of m) and the second excludes the lags of m. This uses %NREGSYSTEM (which counts the regressors only) rather than %NFREE, since the covariances are handled the same way in both regressions. However, there would be no harm in using %NFREE in both locations instead: the difference between the two values, which gives the degrees of freedom, will be the same.
cal(panelobs=40)
open data panelmoney.xls
data(org=obs,format=xls) 1//1 19//40 realm realy
*
* Number of lags
*
compute p=3
*
set dy = realy-realy{1}
set dm = realm-realm{1}
*
* Joint test. This first creates a SMPL series which
* will exclude any data points where any of the
* current or lagged values are missing, so both the
* restricted and unrestricted regressions use the
* same entries.
*
inquire(valid=fullsmpl,reglist)
# dy{0 to p} dm{0 to p}
*
sweep(group=%indiv(t),smpl=fullsmpl,var=hetero)
# dy
# constant dy{1 to p} dm{1 to p}
compute loglunr=%logl,nregunr=%nregsystem
sweep(group=%indiv(t),smpl=fullsmpl,var=hetero)
# dy
# constant dy{1 to p}
compute loglres=%logl,nregres=%nregsystem
cdf(title="Heterogeneous Panel Causality Test") chisqr $
2.0*(loglunr-loglres) nregunr-nregres
compute jointtest=%cdstat,jointsignif=%signif
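For a sense of the degrees of freedom this test should produce, the Python sketch below counts regressors in the two panel regressions. It is an illustration, not RATS code: the helper name `causality_test_df` is hypothetical, and it assumes that all 19 individuals in the data set form their own group and that the regressor count totals across groups.

```python
def causality_test_df(n_individuals, p):
    """Count regressors in the unrestricted and restricted regressions.

    n_individuals -- number of groups (one regression per individual)
    p             -- number of lags of each variable
    """
    # Unrestricted: constant + p lags of dy + p lags of dm, per individual
    unrestricted = n_individuals * (1 + 2 * p)
    # Restricted: constant + p lags of dy, per individual
    restricted = n_individuals * (1 + p)
    return unrestricted, restricted, unrestricted - restricted


# 19 individuals (from the DATA range 1//1 to 19//40) and p=3 lags
n_unr, n_res, df = causality_test_df(19, 3)
```

Under those assumptions the restriction removes 3 coefficients per individual, so the test would have 19 x 3 = 57 degrees of freedom.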
Copyright © 2025 Thomas A. Doan