BOOTSIMPLE.RPF

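BOOTSIMPLE.RPF is a simple example of bootstrapping. It uses the bootstrap to approximate the sampling distribution of the sample mean of a series, then uses that distribution to test a hypothesis about the mean and to construct a confidence interval.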

Full Program

all 32
open data rental.wks
data(format=wks,org=obs) / rent no rm sex dist
*
* Dataset is rental data from Pindyck and Rubinfeld, Econometric Models
* and Economic Forecasts, 4th edition, p. 54. We're asked to test the
* hypothesis that rent per person (rent/no) has a mean of 135 against
* the two-sided alternative that it doesn't.
*
set rpp = rent/no
*
stats rpp
*
* For comparison, we'll do a standard t-test
*
cdf ttest sqrt(%nobs)*(%mean-135)/sqrt(%variance) %nobs-1
*
compute testmean = 135.0
compute sampmean = %mean
*
compute ndraws = 1000
set means 1 ndraws = 0.0
*
* To compute significance levels for a two-tailed test, you need to
* decide where the cutoff will be on the other side of the hypothesized
* value. Here, upperlim and lowerlim are symmetrically placed around
* testmean. sigcount will count the number of times the resampled values
* fall outside these bounds.
*
* A one-tailed test is simpler, since you just have to count the number
* of times the resampled statistic is more extreme than the observed one.
*
compute sigcount = 0.0
compute upperlim = %if(sampmean>testmean,sampmean,testmean*2-sampmean)
compute lowerlim = %if(sampmean>testmean,testmean*2-sampmean,sampmean)
*
* For each draw, resample the data set using boot, compute the mean of
* the drawn sample and adjust it to give us the sampling distribution
* around the hypothesized mean.
*
do draw=1,ndraws
   boot entries 1 32
   sstats(mean) 1 32 rpp(entries(t))>>%mean
   compute means(draw)=%mean-sampmean+testmean
   compute sigcount=sigcount+(means(draw)<lowerlim.or.means(draw)>upperlim)
end do draws
*
* This shows the estimated significance level of the test along with the
* 90% confidence band, computed by translating the sampling distribution
* to zero (by subtracting testmean), then flipping to deal with possible
* asymmetries.
*
stats(fractiles) means
display "**** Test of Mean=" testmean " ****"
display "Sample mean" sampmean
display "Significance level" #.#### (sigcount+1)/(ndraws+1)
display "90% confidence interval" (sampmean+(testmean-%fract95)) (sampmean+(testmean-%fract05))
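
The resampling logic in the program above can also be sketched outside RATS. The Python version below is a minimal illustration under stated assumptions, not the RATS implementation: the names bootstrap_mean_test, percentile and DEMO_VALUES are hypothetical, the linear-interpolation percentile rule may not match the one STATISTICS(FRACTILES) uses, and the demo values are made up rather than taken from rental.wks.

```python
import random


def percentile(sorted_vals, p):
    """Percentile by linear interpolation (an assumed rule for this sketch)."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def bootstrap_mean_test(data, test_mean, ndraws=1000, seed=None):
    """Two-tailed bootstrap test of the mean plus a flipped 90% interval."""
    rng = random.Random(seed)
    n = len(data)
    samp_mean = sum(data) / n

    # Cutoffs placed symmetrically around the hypothesized mean, at the
    # distance of the observed sample mean (same effect as the %if logic).
    dist = abs(samp_mean - test_mean)
    lowerlim, upperlim = test_mean - dist, test_mean + dist

    means = []
    sigcount = 0
    for _ in range(ndraws):
        # Resample n entries with replacement and take their mean.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        # Recentre so the distribution sits around the hypothesized mean.
        m = sum(resample) / n - samp_mean + test_mean
        means.append(m)
        if m < lowerlim or m > upperlim:
            sigcount += 1

    # The +1 terms count the observed sample as one of the draws.
    signif = (sigcount + 1) / (ndraws + 1)

    means.sort()
    fract05 = percentile(means, 0.05)
    fract95 = percentile(means, 0.95)
    # Flip the recentred distribution around the sample mean.
    interval = (samp_mean + (test_mean - fract95),
                samp_mean + (test_mean - fract05))
    return samp_mean, signif, interval


# Made-up values for illustration only (not the rental data).
DEMO_VALUES = [98.0, 110.5, 121.0, 127.5, 133.0,
               140.0, 146.5, 155.0, 170.0, 240.0]

result = bootstrap_mean_test(DEMO_VALUES, 135.0, ndraws=1000, seed=12345)
print("Sample mean", result[0])
print("Significance level", result[1])
print("90% confidence interval", result[2])
```

Passing a fixed seed makes a run repeatable, which is convenient when comparing variations of the procedure; with seed=None each run gives slightly different results, as in the RATS output.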

Output

Everything after the output of the first STATISTICS instruction and the t-test depends on random draws, so those results will not be exactly reproducible.

Statistics on Series RPP
Observations                    32
Sample Mean             138.169271      Variance               2219.798604
Standard Error           47.114739      SE of Sample Mean         8.328788
t-Statistic (Mean=0)     16.589361      Signif Level (Mean=0)     0.000000
Skewness                  1.067821      Signif Level (Sk=0)       0.018768
Kurtosis (excess)         2.315341      Signif Level (Ku=0)       0.016919
Jarque-Bera              13.229029      Signif Level (JB=0)       0.001341

t(31)= 0.380520 with Significance Level 0.70615446

Statistics on Series MEANS
Observations                  1000
Sample Mean             135.114667      Variance                 66.588051
Standard Error            8.160150      SE of Sample Mean         0.258047
t-Statistic (Mean=0)    523.605672      Signif Level (Mean=0)     0.000000
Skewness                  0.171562      Signif Level (Sk=0)       0.026999
Kurtosis (excess)         0.132936      Signif Level (Ku=0)       0.392498
Jarque-Bera               5.641930      Signif Level (JB=0)       0.059548

Minimum                 108.096354      Maximum                 168.260417
01-%ile                 117.968750      99-%ile                 154.121094
05-%ile                 122.070573      95-%ile                 148.597526
10-%ile                 124.882292      90-%ile                 145.431771
25-%ile                 129.293620      75-%ile                 140.711589
Median                  134.763021

**** Test of Mean= 135.00000 ****
Sample mean 138.16927
Significance level 0.6973
90% confidence interval 124.57174 151.09870
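
As a check, the t-statistic, the significance level and the confidence interval can be reproduced from the numbers printed above. The Python snippet below is a sketch of that arithmetic only. The value sigcount = 697 is not printed by the program; it is inferred here from the reported significance level and is specific to the run shown.

```python
import math

# Values copied from the first STATISTICS output.
nobs = 32
mean = 138.169271
variance = 2219.798604
test_mean = 135.0

# Standard t-test: sqrt(nobs)*(mean-135)/sqrt(variance)
se_mean = math.sqrt(variance / nobs)
t_stat = (mean - test_mean) / se_mean

# Fractiles copied from the STATISTICS(FRACTILES) output for MEANS.
fract05 = 122.070573
fract95 = 148.597526

# Flipped 90% interval: sampmean+(testmean-%fract95), sampmean+(testmean-%fract05)
ci_lower = mean + (test_mean - fract95)
ci_upper = mean + (test_mean - fract05)

# Assumption: 697 exceedances, inferred from the reported 0.6973.
ndraws = 1000
sigcount = 697
signif = (sigcount + 1) / (ndraws + 1)

print("t statistic", round(t_stat, 6))
print("interval", round(ci_lower, 5), round(ci_upper, 5))
print("significance level", round(signif, 4))
```

The bootstrap significance level (0.6973) is close to the one from the t-test (0.7062), and neither test comes close to rejecting a mean of 135.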