BOOT Instruction

BOOT( options ) BOOTseries start end lower upper

BOOT creates a SERIES[INTEGER] and fills all or part of it with random integers. The most common use for this instruction is drawing observations at random from a series or set of series as part of a bootstrapping or randomization operation.

Parameters

BOOTseries	SERIES of INTEGERS created by BOOT
start, end	range of entries in BOOTseries to be set. If omitted, this defaults to the standard workspace.
lower, upper	lower and upper bounds (inclusive on both ends) on the value range of random integers. These default to start and end.

Options

[REPLACE]/NOREPLACE

Determines whether sampling is done with or without replacement. With NOREPLACE, once a value is drawn, it won't be drawn again for a different element of BOOTseries. With REPLACE, values may be drawn more than once. Drawing with replacement is the normal procedure in bootstrapping operations, as the sample is treated as if it were the population from which the data are drawn. Drawing without replacement is typically part of approximate randomization analyses, and is usually done to shuffle the entire entry range.
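The difference between the two settings can be sketched in Python (this is not RATS code, and it is not a description of how BOOT is implemented internally). The function name draw_entries and its arguments are hypothetical, chosen only for illustration.

```python
import random

def draw_entries(n_draws, lower, upper, replace=True, rng=None):
    """Return n_draws integers from lower..upper (both ends inclusive)."""
    rng = rng if rng is not None else random.Random()
    if replace:
        # With replacement: every draw is independent, so repeats can occur.
        return [rng.randint(lower, upper) for _ in range(n_draws)]
    # Without replacement: each value can appear at most once. Drawing as
    # many values as the range holds gives a random permutation (a shuffle).
    return rng.sample(range(lower, upper + 1), n_draws)
```

Calling draw_entries(50, 1, 50, replace=False) returns each of the values 1 to 50 exactly once in random order, which is the "shuffle the entire entry range" case.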

BLOCK=block size [not used]

METHOD=[OVERLAP]/NOOVERLAP/STATIONARY/CIRCULAR

Used for block bootstrapping.

METHOD=OVERLAP allows overlapping blocks, so every full block within [lower,upper] can be selected.

METHOD=NOOVERLAP partitions [lower,upper] into separate blocks of length block size and randomizes among them.

For METHOD=STATIONARY, block size can be real-valued. When randomizing an entry, a new start point within [lower,upper] is chosen with probability 1/block size; otherwise, the previous value is incremented by one. A new start point is also chosen if incrementing would take the value above upper. Apart from that truncation effect at upper, block size is therefore the expected length of a block.

METHOD=CIRCULAR is similar to METHOD=OVERLAP, except that any point in [lower,upper] can be selected as the start of a block: if the block reaches upper, it wraps around and continues from lower.
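The four block schemes can be sketched in Python as follows. This follows the verbal description above and is not RATS code. The function name block_bootstrap is hypothetical, and the edge-case handling is an assumption: the final block is truncated when the fill range is not a multiple of the block length, and any entries left over after partitioning for NOOVERLAP are ignored.

```python
import random

def block_bootstrap(n, lower, upper, block, method="overlap", rng=None):
    """Return n entry numbers in [lower, upper], drawn in blocks."""
    rng = rng if rng is not None else random.Random()
    out = []
    if method == "stationary":
        # block may be real-valued here: it is the expected block length.
        current = None
        while len(out) < n:
            restart = (current is None            # very first entry
                       or current + 1 > upper     # would run past upper
                       or rng.random() < 1.0 / block)
            current = rng.randint(lower, upper) if restart else current + 1
            out.append(current)
        return out
    span = upper - lower + 1
    while len(out) < n:
        if method == "overlap":
            # Any start point that leaves room for a full block.
            start = rng.randint(lower, upper - block + 1)
        elif method == "nooverlap":
            # Start points restricted to lower, lower+block, lower+2*block, ...
            start = lower + block * rng.randrange(span // block)
        elif method == "circular":
            # Any start point; the block may run past upper.
            start = rng.randint(lower, upper)
        else:
            raise ValueError("unknown method: " + method)
        for k in range(block):
            # Only circular blocks can pass upper; those wrap back to lower.
            out.append(lower + (start - lower + k) % span)
    return out[:n]  # assumption: truncate the last block to fit the range
```

For example, block_bootstrap(96, 1, 96, 8, "nooverlap") returns twelve runs of eight consecutive entry numbers, each run starting at 1, 9, 17, and so on.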

PANEL/[NOPANEL]

Use PANEL if you want to randomize entire individuals in a (balanced) panel data set. Within each individual, the original time order is maintained. Use the PANEL option on any subsequent SET instructions you use to get the shuffled data.
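The idea of resampling whole individuals while keeping their time order can be sketched in Python (not RATS code). The sketch assumes a balanced panel stacked by individual, so that individual i at period t sits at entry (i-1)*T + t, and assumes individuals are drawn with replacement. The name panel_entries is hypothetical, and the way BOOT itself handles the ranges with PANEL is not shown here.

```python
import random

def panel_entries(n_individuals, n_periods, rng=None):
    """Entry numbers for a bootstrap sample of whole individuals."""
    rng = rng if rng is not None else random.Random()
    entries = []
    for _ in range(n_individuals):
        # Pick one individual at random (with replacement) ...
        who = rng.randint(1, n_individuals)
        base = (who - 1) * n_periods
        # ... and copy that individual's periods in their original order.
        entries.extend(base + t for t in range(1, n_periods + 1))
    return entries
```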

Notes

BOOT draws with replacement by scaling and translating uniform random numbers. A similar calculation can be done with FIX(%UNIFORM(lower,upper+1)). The +1 is needed because the upper bound is included as a possible value by BOOT.
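The scale-and-translate arithmetic can be written out as below. This is a Python sketch of the calculation only, not RATS code: uniform_to_entry is a hypothetical name, u stands for a uniform draw on [0,1), and int() plays the role of FIX, which is equivalent as long as lower is not negative.

```python
import random

def uniform_to_entry(u, lower, upper):
    """Map a uniform draw u in [0, 1) to an integer in lower..upper."""
    # The width is upper - lower + 1 because upper itself must be reachable.
    # Without the +1, truncation would never produce upper.
    width = upper - lower + 1
    return lower + int(u * width)

rng = random.Random(42)
draws = [uniform_to_entry(rng.random(), 1, 96) for _ in range(10)]
```

With lower=1 and upper=96, a draw of u=0.0 maps to 1 and any u of at least 95/96 maps to 96, so both bounds are possible values.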

Examples

calendar(q) 1980:1

allocate 2017:4

boot entries / 1980:1 2003:4

The / takes the default range for start and end. The dates given for lower and upper, 1980:1 and 2003:4, correspond to entry numbers 1 and 96, so BOOT fills the ENTRIES series with random integers ranging from 1 to 96.
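The date-to-entry arithmetic behind those numbers can be checked with a short Python sketch (not RATS code). The helper quarterly_entry is hypothetical; it assumes quarterly data with entry 1 at the CALENDAR start date of 1980:1.

```python
def quarterly_entry(year, quarter, start_year=1980, start_quarter=1):
    """Entry number of year:quarter when entry 1 is start_year:start_quarter."""
    return (year - start_year) * 4 + (quarter - start_quarter) + 1

first = quarterly_entry(1980, 1)   # 1
last = quarterly_entry(2003, 4)    # 23*4 + 3 + 1 = 96
```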

@hurst(header="R/S Analysis of Equally-Weighted Returns") ew

boot(block=40,method=nooverlap) shuffle

set reshuffle = ew(shuffle(t))

@hurst(header="R/S Analysis of Block Shuffled Returns") reshuffle

runs the @HURST procedure on the series EW, then does a non-overlapping block reshuffling of its data and re-executes the procedure.

boot(noreplace) entry 1 50

set shuffle 1 50 = ressqr(entry(t))

creates SHUFFLE as a random reordering of the fifty elements of the series RESSQR.

BOOT is typically used inside a loop around the resampling operation. The two program fragments below demonstrate this.

This is part of the GARCHBOOT.RPF example, which does an out-of-sample simulation of a GARCH model. The BOOT instruction draws entry numbers from the sample range (gstart to gend) for the forecast range (gend+1 to gend+span). The entry numbers generated (into the SERIES[INTEGER] named ENTRIES) are then used to pull out standardized sample residuals for building the GARCH process during the forecast period.

set ustandard gstart gend = u/sqrt(h)

* span is the number of periods over which returns are to be computed.

* ndraws is the number of bootstrapping draws

compute span=10

compute ndraws=10000

* Extend out the h series (values aren't important--this is just to get

* the extra space).

set h gend+1 gend+span = h(gend)

dec vect returns(ndraws)

do draw=1,ndraws

* This draws standardized u's from the ustandard series

boot entries gend+1 gend+span gstart gend

* Simulate the GARCH model out of sample, scaling up the standardized

* residuals by the square root of the current h.

set udraw gend+1 gend+span = ustandard(entries)

set u gend+1 gend+span = (h(t)=hf(t)),udraw(t)*sqrt(h(t))

* Figure out the cumulative return over the span. As written, this

* allows for the continuation of the sample mean return. If you want

* to look at zero mean returns, take the b0 out.

sstats gend+1 gend+span b0+u>>returns(draw)

end do draw
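The logic of the loop above can be sketched in Python (not RATS code). The fragment does not show how the variance formula hf is defined, so the sketch assumes a GARCH(1,1) recursion h = c + a*u(t-1)^2 + b*h(t-1). The parameter names c, a and b, and the function name simulate_returns, are hypothetical.

```python
import math
import random

def simulate_returns(ustandard, h_last, u_last, span, ndraws,
                     c, a, b, b0, rng=None):
    """Bootstrap the cumulative return over span periods, ndraws times."""
    rng = rng if rng is not None else random.Random()
    returns = []
    for _ in range(ndraws):
        h_prev, u_prev, total = h_last, u_last, 0.0
        for _ in range(span):
            # Draw one standardized residual at random, with replacement.
            z = ustandard[rng.randrange(len(ustandard))]
            # Assumed GARCH(1,1) variance update (stands in for hf).
            h = c + a * u_prev ** 2 + b * h_prev
            # Scale the standardized draw up by the current std. deviation.
            u = z * math.sqrt(h)
            total += b0 + u   # drop b0 here for zero-mean returns
            h_prev, u_prev = h, u
        returns.append(total)
    return returns
```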

This is from the example GRANGERBOOTSTRAP.RPF, which does a parametric bootstrap for a two-variable VAR. The BOOT instruction randomly draws entry numbers from the estimation sample (%REGSTART() to %REGEND()); the program then builds a bootstrapped set of data using FORECAST with the PATHS option to input the resampled residuals.

set msample = fm1

set gsample = gdph

linreg(define=eqm) msample / rm

# constant msample{1 to 8}

linreg(define=eqg) gsample / rg

# constant gsample{1 to 8} msample{1 to 8}

group noncausal eqm eqg

compute ndraws=10000

set stats 1 ndraws = 0.0

do draw=1,ndraws

* Resample the residuals over the regression range

boot entries %regstart() %regend()

set rmsample = rm(entries(t))

set rgsample = rg(entries(t))

forecast(paths,model=noncausal,from=%regstart(),to=%regend(),results=results)

# rmsample rgsample

set msample = results(1)

set gsample = results(2)

* Do the causality test on the resampled data. Except for minor scale

* factors, this will give identical results to a full system LR test

* (under the assumption of homoscedasticity).

linreg(noprint) msample

# constant msample{1 to 8} gsample{1 to 8}

exclude(noprint)

# gsample{1 to 8}

compute stats(draw)=%cdstat

end do draw
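One detail of the loop above is that both residual series are indexed by the same ENTRIES draw, so each resampled period keeps the pair of residuals that occurred together in the original sample. A minimal Python sketch of that resampling step only (not RATS code, and not the VAR rebuild done by FORECAST); resample_pairs is a hypothetical name.

```python
import random

def resample_pairs(rm, rg, rng=None):
    """Resample two residual series using one shared set of entry draws."""
    rng = rng if rng is not None else random.Random()
    # One draw per period, reused for both series.
    entries = [rng.randrange(len(rm)) for _ in rm]
    return [rm[e] for e in entries], [rg[e] for e in entries]
```

Drawing a separate set of entries for each series would instead break the contemporaneous link between the two residuals.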