Examples / BOOTSIMPLE.RPF
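BOOTSIMPLE.RPF is a simple example of bootstrapping. It resamples a data series with replacement to approximate the sampling distribution of the sample mean, then uses that distribution to compute a significance level for a two-tailed test of the mean and a 90% confidence interval.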
Full Program
all 32
open data rental.wks
data(format=wks,org=obs) / rent no rm sex dist
*
* Dataset is rental data from Pindyck and Rubinfeld, Econometric Models
* and Econometric Forecasts, 4th edition, p 54. We're asked to test the
* hypothesis that rent per person (rent/no) has a mean of 135 against
* the alternative that it isn't.
*
set rpp = rent/no
*
stats rpp
*
* For comparison, we'll do a standard t-test
*
cdf ttest sqrt(%nobs)*(%mean-135)/sqrt(%variance) %nobs-1
*
compute testmean = 135.0
compute sampmean = %mean
*
compute ndraws = 1000
set means 1 ndraws = 0.0
*
* To compute significance levels for a two-tailed test, you need to
* decide where the cutoff will be on the other side of the hypothesized
* value. Here, upperlim and lowerlim are symmetrically placed around
* testmean. sigcount will count the number of times the resampled values
* fall outside these bounds.
*
* A one-tailed test is simpler, since you just have to count the number
* of times the resampled statistic is more extreme than the observed one.
*
compute sigcount = 0.0
compute upperlim = %if(sampmean>testmean,sampmean,testmean*2-sampmean)
compute lowerlim = %if(sampmean>testmean,testmean*2-sampmean,sampmean)
*
* For each draw, resample the data set using boot, compute the mean of
* the drawn sample and adjust it to give us the sampling distribution
* around the hypothesized mean.
*
do draw=1,ndraws
boot entries 1 32
sstats(mean) 1 32 rpp(entries(t))>>%mean
compute means(draw)=%mean-sampmean+testmean
compute sigcount=sigcount+(means(draw)<lowerlim.or.means(draw)>upperlim)
end do draw
*
* This shows the estimated significance level of the test along with the
* 90% confidence band, computed by translating the sampling distribution
* to zero (by subtracting testmean), then flipping to deal with possible
* asymmetries.
*
stats(fractiles) means
display "**** Test of Mean=" testmean " ****"
display "Sample mean" sampmean
display "Significance level" #.#### (sigcount+1)/(ndraws+1)
display "90% confidence interval" (sampmean+(testmean-%fract95)) (sampmean+(testmean-%fract05))
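For readers who want to cross-check the logic outside RATS, the steps in the program above can be sketched in Python. This is only an illustration under stated assumptions, not part of the RATS example: the function name `bootstrap_mean_test` and its arguments are hypothetical, the data in the usage lines is made up (it is not the rental data), and `statistics.quantiles` may compute percentiles differently from the RATS fractiles, so the numbers would not match the RATS output exactly.

```python
import random
import statistics


def bootstrap_mean_test(data, test_mean, ndraws=1000, seed=None):
    """Two-tailed bootstrap test of the mean, mirroring the program above.

    Hypothetical helper: returns (significance level, 90% interval, means).
    """
    rng = random.Random(seed)
    n = len(data)
    samp_mean = statistics.fmean(data)

    # Cutoffs placed symmetrically around the hypothesized mean
    # (the roles of lowerlim and upperlim in the RATS code).
    dist = abs(samp_mean - test_mean)
    lowerlim = test_mean - dist
    upperlim = test_mean + dist

    means = []
    sigcount = 0
    for _ in range(ndraws):
        # Resample n observations with replacement.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        # Shift the resampled mean so the distribution is centered
        # on the hypothesized mean rather than the sample mean.
        m = statistics.fmean(resample) - samp_mean + test_mean
        means.append(m)
        if m < lowerlim or m > upperlim:
            sigcount += 1

    # Adding 1 to numerator and denominator counts the observed sample.
    signif = (sigcount + 1) / (ndraws + 1)

    # 5th and 95th percentiles of the shifted means (19 cut points).
    cuts = statistics.quantiles(means, n=20, method="inclusive")
    fract05, fract95 = cuts[0], cuts[-1]

    # Flip the translated distribution around the sample mean.
    interval = (samp_mean + (test_mean - fract95),
                samp_mean + (test_mean - fract05))
    return signif, interval, means


if __name__ == "__main__":
    # Made-up data, used only to show the call.
    sample = [120.0, 95.5, 210.0, 140.0, 133.3, 180.0, 101.0, 125.0]
    signif, interval, _ = bootstrap_mean_test(sample, 135.0, seed=42)
    print("Significance level", round(signif, 4))
    print("90% confidence interval", interval)
```

Using `abs(samp_mean - test_mean)` gives the same pair of cutoffs as the two `%if` expressions, whichever side of the hypothesized value the sample mean falls on.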
Output
The output of the first STATISTICS instruction and of the t-test (the CDF instruction) is deterministic. Everything after that depends upon random numbers and so will not be exactly reproducible.
Statistics on Series RPP
Observations                      32
Sample Mean               138.169271      Variance                 2219.798604
Standard Error             47.114739      SE of Sample Mean           8.328788
t-Statistic (Mean=0)       16.589361      Signif Level (Mean=0)       0.000000
Skewness                    1.067821      Signif Level (Sk=0)         0.018768
Kurtosis (excess)           2.315341      Signif Level (Ku=0)         0.016919
Jarque-Bera                13.229029      Signif Level (JB=0)         0.001341
t(31)= 0.380520 with Significance Level 0.70615446
Statistics on Series MEANS
Observations                    1000
Sample Mean               135.114667      Variance                   66.588051
Standard Error              8.160150      SE of Sample Mean           0.258047
t-Statistic (Mean=0)      523.605672      Signif Level (Mean=0)       0.000000
Skewness                    0.171562      Signif Level (Sk=0)         0.026999
Kurtosis (excess)           0.132936      Signif Level (Ku=0)         0.392498
Jarque-Bera                 5.641930      Signif Level (JB=0)         0.059548
Minimum                   108.096354      Maximum                   168.260417
01-%ile                   117.968750      99-%ile                   154.121094
05-%ile                   122.070573      95-%ile                   148.597526
10-%ile                   124.882292      90-%ile                   145.431771
25-%ile                   129.293620      75-%ile                   140.711589
Median                    134.763021
**** Test of Mean= 135.00000 ****
Sample mean 138.16927
Significance level 0.6973
90% confidence interval 124.57174 151.09870
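As a check on the arithmetic, the reported figures can be reproduced by hand from the output above. The symbols are notation introduced here, not RATS names: $N$ is the number of observations, $\bar{x}$ the sample mean, $s$ the standard error of the series, $\mu_0$ the hypothesized mean of 135, and $q_{0.05}$, $q_{0.95}$ the 5th and 95th percentiles of MEANS. The count of 697 is inferred from the displayed value 0.6973, since sigcount itself is not printed.

```latex
\begin{align*}
t &= \frac{\sqrt{N}\,(\bar{x}-\mu_0)}{s}
   = \frac{\sqrt{32}\,(138.169271-135)}{\sqrt{2219.798604}} \approx 0.3805,\\
\hat{p} &= \frac{\text{sigcount}+1}{\text{ndraws}+1}
   = \frac{697+1}{1000+1} \approx 0.6973,\\
\text{lower} &= \bar{x} + (\mu_0 - q_{0.95})
   = 138.169271 + (135 - 148.597526) \approx 124.5717,\\
\text{upper} &= \bar{x} + (\mu_0 - q_{0.05})
   = 138.169271 + (135 - 122.070573) \approx 151.0987.
\end{align*}
```

The bootstrap significance level (about 0.697) is close to the 0.706 from the standard t-test, so in this run both approaches lead to the same conclusion: the data give no grounds to reject a mean of 135.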
Copyright © 2025 Thomas A. Doan