Bootstrapping using the SHUFFLE option

by **TomDoan** » Tue May 10, 2016 1:08 pm

This example computes a feasible GLS (FGLS) estimator to correct for heteroscedasticity. The calculation has three steps: least squares, then an auxiliary regression of the log squared residuals on log income (allowing the variance to be proportional to an unknown power of income), then weighted least squares using the exponentiated fitted values from the auxiliary regression as the spread series. Because the spread series is treated as if known in the weighted least squares regression, the standard errors of the coefficients will almost certainly be underestimated. To get a better estimate of the precision of the full three-step process, this uses the SHUFFLE option on the LINREG instructions for each of the three steps. Note that this requires only a little more than putting a loop around the original calculation. (This also produces a bootstrapped estimate of the coefficient vector itself.)
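
For reference, the three steps can be written out as below. This is only a sketch of the model the code appears to implement; the symbols (y for food, x for income, h for the fitted variance, and the Greek coefficient names) are notation chosen here for illustration and do not come from the program itself.

```latex
% Assumed notation: y_i = food, x_i = income, i = 1,...,40
\begin{align*}
\text{Model:}\quad
  & y_i = \beta_1 + \beta_2 x_i + e_i, \qquad
    \operatorname{Var}(e_i) = \sigma^2 x_i^{\gamma} \\
\text{Step 1 (OLS):}\quad
  & \hat e_i = y_i - \hat\beta_1^{\,OLS} - \hat\beta_2^{\,OLS} x_i \\
\text{Step 2 (auxiliary regression):}\quad
  & \log \hat e_i^{\,2} = \alpha_1 + \alpha_2 \log x_i + v_i, \qquad
    \hat h_i = \exp\!\left(\hat\alpha_1 + \hat\alpha_2 \log x_i\right) \\
\text{Step 3 (weighted least squares):}\quad
  & \hat\beta^{\,FGLS} = \arg\min_{\beta}
    \sum_{i} \frac{\left(y_i - \beta_1 - \beta_2 x_i\right)^2}{\hat h_i}
\end{align*}
```

Here the slope in step 2 plays the role of the unknown power of income, and the fitted variances from step 2 are what the code passes to the SPREAD option.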

Code:

open data food.dat
data(format=free,org=columns) 1 40 food income
*
linreg food
# constant income
*
* This sequence does feasible GLS. This first runs a regression of
* log(e^2) on the log of income.
*
set esq = log(%resids^2)
set z   = log(income)
*
linreg esq
# constant z
*
* PRJ then computes the fitted values from the above regression, which
* are then "exp"ed to give the estimated variances. That constructed
* series is fed into LINREG with SPREAD to correct for
* heteroscedasticity.
*
prj vhat
linreg(spread=exp(vhat)) food
# constant income
*
* This does bootstrapping over the previous calculation
*
compute nboot=1000
*
compute [vect] bboot=%zeros(%nreg,1)
compute [symm] xxboot=%zeros(%nreg,%nreg)
*
do boot=1,nboot
   *
   * Draw the bootstrap entries
   *
   boot shuffle
   *
   * Run the original regression using the bootstrapped sample
   *
   linreg(shuffle=shuffle,noprint) food
   # constant income
   *
   * Generate ESQ over the original sample entries (so it lines up
   * entry by entry with the original data series)
   *
   set esq = log(%resids^2)
   set z   = log(income)
   *
   linreg(shuffle=shuffle,noprint) esq
   # constant z
   *
   * Get the fitted values (again, this uses the original sample) and
   * run the FGLS estimates.
   *
   prj vhat
   linreg(spread=exp(vhat),shuffle=shuffle,noprint) food
   # constant income
   compute bboot=bboot+%beta,xxboot=xxboot+%outerxx(%beta)
end do boot
*
compute bboot=bboot/nboot
compute xxboot=xxboot/nboot-%outerxx(bboot)
linreg(create,title="FGLS bootstrapping",$
   lastreg,coeffs=bboot,covmat=xxboot,form=chisqr)
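
The accumulation in the loop and the two COMPUTE instructions after it amount to the following bootstrap moments. The notation is again assumed for illustration: B stands for nboot (1000 here) and b with superscript (r) for the %BETA vector from the FGLS regression in draw r.

```latex
% Assumed notation: B = nboot, b^{(r)} = FGLS coefficient vector from draw r
\begin{align*}
\bar b &= \frac{1}{B} \sum_{r=1}^{B} b^{(r)} \\
\hat V &= \frac{1}{B} \sum_{r=1}^{B} b^{(r)} b^{(r)\prime} - \bar b\, \bar b^{\prime}
\end{align*}
```

Note that the divisor is B rather than B-1, matching the code; with 1000 draws the difference is negligible. The final LINREG(CREATE) reports the mean vector as the coefficients and the second-moment matrix as their covariance matrix.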