Rolling Sample Estimation

Running a sequence of estimations over a moving window of data is very easy to do in RATS using a DO loop. The BASICFORECAST.RPF example demonstrates several ways to do forecasting with rolling regressions. Here, we’ll look in more detail at the tools available and the issues that can arise.

Loops and the @ROLLREG Procedure

Rolling regressions typically take one of three forms:

1. estimations done using a moving (fixed-size) window of data

2. estimations where one or more data points are added at the end of the sample

3. estimations where the starting period is incremented but the ending period remains fixed (so that the sample size decreases)

Recursive least squares is an example of (2). The underlying assumption is that the model is fixed, and the exercise simulates how the estimates change as data are added. If the changes are large enough, they may lead us to question the assumption of a fixed model.

The most commonly used form, however, is the moving fixed window (1). This isn’t so much a test for a structural break as an admission that no single specification is likely to be valid across the full data set. There’s no model of structural change under which this technique is exact, but it can be approximately justified by assuming that the model changes slowly. Note that while moving fixed-width analysis is increasingly common, it is often poorly motivated, particularly when applied to rolling-window hypothesis tests. ROLLINGCAUSALITY.RPF shows how use of a rolling window can easily produce incorrect inferences.

Any of these can be done using simple loops. For example, the first loop below estimates over moving windows of width 120, with samples ending in 1999:12 through 2017:12. The second fixes the start of the sample at 1990:1, then moves the end period from 1999:12 to 2017:12. The first regression is the same in both cases (1990:1 through 1999:12); the difference is that the moving window shifts the start point of the range along with the end point.

Moving Window

compute width=120
do end=1999:12,2017:12
   linreg(noprint) depvar end-width+1 end
   # regressors
end do

Moving end period

compute start=1990:1
do end=1999:12,2017:12
   linreg(noprint) depvar start end
   # regressors
end do

The Help>Pseudo-Code Generator>Rolling Analysis wizard can help set up the overall structure of the first two types of loops. (The third type is rarely used.)
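For completeness, the third form (moving start period) can be written the same way. This is only a sketch: the fixed end period, the range of start dates and the variable name FINISH are illustrative choices rather than anything taken from a RATS example file, and DEPVAR and the regressor list are placeholders as in the loops above.

```rats
* Sketch: the end period stays fixed while the start period moves forward.
* The dates and the name FINISH are illustrative assumptions.
compute finish=2017:12
do start=1990:1,2007:12
   linreg(noprint) depvar start finish
   # regressors
end do
```

Each pass drops one observation from the beginning of the sample, so the last regression here uses only 2007:12 through 2017:12.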

In addition to the loop and the estimation instructions, you would normally need to add some bookkeeping instructions to collect the desired data. For least squares rolling regressions, we recommend using the @ROLLREG procedure, originally written by Simon van Norden and others at the Bank of Canada. It can do all three types of rolling regression, and has options for saving the coefficients, coefficient standard errors, and regression standard errors from each estimation, as well as options for graphing the coefficients over time, estimating with robust standard errors, and more. For example:

@rollreg(graph,move=32) y
# constant x2 x3

does a rolling regression with a moving window of 32 observations, and produces three graphs showing the evolution of each of the three coefficients with standard error bands. To collect the three graphs into a single-page SPGRAPH, you could do:

spgraph(vfields=3,header="Moving Window Estimates (width=32)")
@rollreg(graph,move=32) y
# constant x2 x3
spgraph(done)
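If you would rather do the bookkeeping by hand, a moving-window loop can store the results itself. The following is a minimal sketch and is not taken from any RATS example file. It assumes a single regressor X alongside CONSTANT (so the slope is the second coefficient), and the series names SLOPE and SLOPE_SE are hypothetical. It relies on LINREG setting the %BETA and %STDERRS vectors after each estimation.

```rats
* Sketch: DEPVAR and X are placeholder series names,
* SLOPE and SLOPE_SE are hypothetical names for the saved results.
compute width=120
* Create the output series, initialized to missing values
clear slope slope_se
do end=1999:12,2017:12
   linreg(noprint) depvar end-width+1 end
   # constant x
   * Store the coefficient on X and its standard error
   * at the entry where this window ends
   compute slope(end)=%beta(2)
   compute slope_se(end)=%stderrs(2)
end do
```

Each saved series has values only at the entries where a window ends (1999:12 through 2017:12 here), so that is the range to use when graphing or printing them.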

Rolling Nonlinear Estimations

For simple nonlinear estimations that converge easily, such as those performed using NLLS and BOXJENK, the process for doing a rolling regression is essentially the same as for linear regressions: enclose the nonlinear estimation inside a loop that controls the starting and/or ending period of the estimation, and add whatever bookkeeping instructions are necessary to collect the desired information. For instance, ARIMA.RPF has the following, which re-estimates two models with BOXJENK as observations are added to the end of the sample, and computes the one-step forecast from each.

do time=1995:4,2008:1
   boxjenk(noprint,constant,define=ar7eq,ar=7) spread * time-1
   boxjenk(noprint,constant,define=ar2ma17eq,ar=2,ma=||1,7||) $
      spread * time-1
   uforecast(equation=ar7eq,static) forecast_ar7 time time
   uforecast(equation=ar2ma17eq,static) forecast_ar2ma17 time time
end do

More care is required where the estimation may be sensitive to initial parameter values, or where the validity of the standard errors depends on how the nonlinear estimation process converges (as with the BFGS algorithm). If only the point estimates are important, you don’t have to worry about problems with standard errors, but feeding in already converged estimates will still speed up the calculations. If you’re using an instruction which uses PARMSETs, that happens automatically as long as you don’t repeat your guess value calculations inside the loop. With an instruction like GARCH, you can use the INITIAL option. With either type of instruction, you can use the HESSIAN option to feed in the inverse Hessian to initialize BFGS.
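As a rough illustration of feeding converged values forward, the sketch below carries the coefficients and covariance matrix from one window into the next through the INITIAL and HESSIAN options. It is not taken from GARCHBACKTEST.RPF. The series Y, the GARCH(1,1) specification, the window width, the last entry (2000) and the names BINIT and HINIT are all hypothetical, and using %XX from an estimation without robust errors as the inverse Hessian is an assumption of this sketch.

```rats
* Sketch: Y, the window width and the entry range are placeholders.
declare vector binit
declare symmetric hinit
compute width=1000
* Estimate once on the first window to get converged starting values
garch(p=1,q=1,noprint) 1 width y
compute binit=%beta,hinit=%xx
do end=width+1,2000
   garch(p=1,q=1,noprint,initial=binit,hessian=hinit) end-width+1 end y
   * Carry the estimates forward only if this window converged
   if %converged==1
      compute binit=%beta,hinit=%xx
end do
```

Updating the saved values only after a converged estimation keeps one failed window from supplying poor starting values to the windows that follow it.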

Rolling GARCH Estimates: GARCHBACKTEST.RPF example

GARCHBACKTEST.RPF is an example of rolling GARCH estimates. It uses a rolling set of estimates to produce a one-step out-of-sample estimate of the Value-at-Risk (VaR), which is then used to backtest the validity of this particular VaR calculation. GARCH models can be particularly hard to handle with rolling estimation windows because they are, by their nature, designed for data sets with "outliers" (unexpectedly large residuals), and model how those tend to cluster together. If your sample moves from a period of relatively quiet data to one with much higher variability, the model estimates will have difficulty with the transition. The only data points which provide much information about the "GARCH" variance parameters are the ones after the large residuals, so you will in effect be estimating those parameters with just a few data points.