Ensemble and Combination Models' Methodologies

ac_1
Posts: 468
Joined: Thu Apr 15, 2010 6:30 am

Hi Tom,

Suppose there are multiple models, e.g. two models A and B, each generating forecasts with prediction intervals (PIs). There are (probably at least) a couple of methods to 'join or merge or put together' the forecasts, i.e. Ensemble or Combination.

For both of these, the aggregated point forecasts could be the means of the point forecasts, as a simple average, e.g. (0.5*forecastA + 0.5*forecastB). But how does one aggregate/mix the forecast distributions, and hence generate the aggregated/mixed PIs, in RATS?

Enders (2014), AETS 4th Edn, p109-112, discusses combining forecasts, but not with regard to PIs.

Amarjit

TomDoan
Posts: 7779
Joined: Wed Nov 01, 2006 4:36 pm

To get PI's, you would need a great deal more information, which often isn't available. You would not only need the variances of each set of forecasts, but also the covariances among them. And if you had that level of detail, then you wouldn't be taking simple averages, but would be taking weighted averages.
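To illustrate why the covariances matter, here is a minimal sketch in Python (not RATS) for the two-model case. The function names and all numbers are made up for illustration. It assumes the weights sum to one, so the combined error is w*errA + (1-w)*errB.

```python
def combo_variance(w, var_a, var_b, cov_ab):
    """Variance of the combined error w*errA + (1 - w)*errB."""
    return w**2 * var_a + (1 - w)**2 * var_b + 2 * w * (1 - w) * cov_ab


def min_variance_weight(var_a, var_b, cov_ab):
    """Weight on model A that minimizes combo_variance (weights sum to 1)."""
    return (var_b - cov_ab) / (var_a + var_b - 2 * cov_ab)


# Hypothetical error variances and covariance, not taken from the thread
var_a, var_b, cov_ab = 0.18, 0.16, 0.15

w_star = min_variance_weight(var_a, var_b, cov_ab)
var_opt = combo_variance(w_star, var_a, var_b, cov_ab)
var_equal = combo_variance(0.5, var_a, var_b, cov_ab)

print("weight on A:", w_star)
print("variance with that weight:", var_opt)
print("variance with equal weights:", var_equal)
```

Once the covariance is known, the variance-minimizing weight generally differs from 0.5, which is the point about weighted rather than simple averages.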

ac_1

Thanks.

I can follow Enders (2014) AETS 4thEdn p109-112, and the RATS code https://estima.com/textbooks/enders_4/enders4p111.rpf

I think this is referred to as the Combination Method: taking a weighted average of the forecasts.

One would then account for the covariances between the forecast error distributions of the n individual models to calculate the Combination PIs, i.e. the combined quantile forecasts.

TomDoan wrote: "You would not only need the variances of each set of forecasts, but also the covariances among them."

But which variances are these? Are they the squares of STDERRS from the FORECAST instruction?

How are the covariances calculated? Is this using VCV or CMOM on the list of forecast error (fn(t) - actual(t)) series?

TomDoan

No. If you have the information, you would compute the complete covariance matrix with VCV applied to the forecast errors. You can't compute the variances one way and the covariances another, as you could end up with a matrix that is not positive definite.
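A small sketch in Python (not RATS) of the point about mixing sources, using invented error series. The convention used here (average cross products, no mean subtraction, divisor T) is an assumption chosen for the sketch, not a statement about the defaults of VCV.

```python
def cross_moment_cov(errors):
    """errors: list of equal-length error series, one per model.
    Returns the matrix of average cross products (assumed convention:
    no mean subtraction, divisor T)."""
    n = len(errors)
    t = len(errors[0])
    return [[sum(errors[i][k] * errors[j][k] for k in range(t)) / t
             for j in range(n)] for i in range(n)]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Hypothetical forecast errors for two models
err_a = [0.5, -0.3, 0.2, -0.4]
err_b = [0.6, -0.2, 0.1, -0.5]

v = cross_moment_cov([err_a, err_b])
print("determinant, all entries from the same errors:", det2(v))

# Variances taken from somewhere else (hypothetical value 0.10 for both),
# covariance still taken from the error series
mixed = [[0.10, v[0][1]], [v[0][1], 0.10]]
print("determinant, mixed sources:", det2(mixed))
```

A negative determinant means the mixed matrix is not a valid covariance matrix, so x'Vx could come out negative for some weights.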

ac_1

From the example above, the covariance matrix of the forecast errors is computed with

Code:

* covariance/correlation matrix
vcv(matrix=v)
# herrors

I can also include

Code:

stderrs=stderrs(s)

in UFORECAST if needed.

As a check, the 'usual way' without the hash and %keys:

Code:

do time=2000:3,2012:4
   boxjenk(noprint,constant,define=ar7,ar=7)                spread * time-1
   boxjenk(noprint,constant,define=ar6,ar=6)                spread * time-1
   boxjenk(noprint,constant,define=ar2,ar=2)                spread * time-1
   boxjenk(noprint,constant,define=ar127,ar=||1,2,7||)      spread * time-1
   boxjenk(noprint,constant,define=ar1ma1,ar=1,ma=1)        spread * time-1
   boxjenk(noprint,constant,define=ar2ma1,ar=2,ma=1)        spread * time-1
   boxjenk(noprint,constant,define=ar2ma17,ar=2,ma=||1,7||) spread * time-1
   *
   uforecast(equation=ar7,errors=errors_ar7,stderrs=stderrs_ar7,static) fore_ar7 time time
   uforecast(equation=ar6,errors=errors_ar6,stderrs=stderrs_ar6,static) fore_ar6 time time
   uforecast(equation=ar2,errors=errors_ar2,stderrs=stderrs_ar2,static) fore_ar2 time time
   uforecast(equation=ar127,errors=errors_ar127,stderrs=stderrs_ar127,static) fore_ar127 time time
   uforecast(equation=ar1ma1,errors=errors_ar1ma1,stderrs=stderrs_ar1ma1,static) fore_ar1ma1 time time
   uforecast(equation=ar2ma1,errors=errors_ar2ma1,stderrs=stderrs_ar2ma1,static) fore_ar2ma1 time time
   uforecast(equation=ar2ma17,errors=errors_ar2ma17,stderrs=stderrs_ar2ma17,static) fore_ar2ma17 time time
end do

* covariance/correlation matrix
vcv(matrix=v)
# errors_ar7 errors_ar6 errors_ar2 errors_ar127 errors_ar1ma1 errors_ar2ma1 errors_ar2ma17
compute qbar=%cvtocorr(%sigma)

From the covariance/correlation matrix I can extract the variances on the diagonal

Code:

comp d=%xdiag(v)

but how do I extract the covariances below the diagonal and the correlations above the diagonal?

Thus, how do I proceed to calculate the Combination PIs?

TomDoan

I'm not sure what the correlations have to do with anything, but you have an estimate of the covariance matrix of the forecast errors. The variance of a linear combination of those is x'Vx where V is the covariance matrix.
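As a sketch of the quadratic form x'Vx in Python (not RATS): x is a vector with one weight per model, V is the covariance matrix of the forecast errors, and the result is a single number. The matrix and weights below are hypothetical.

```python
import math


def quad_form(x, v):
    """Return x'Vx for a weight vector x and a square matrix v."""
    n = len(x)
    return sum(x[i] * v[i][j] * x[j] for i in range(n) for j in range(n))


# Hypothetical 3-model covariance matrix of forecast errors
v = [[0.18, 0.15, 0.15],
     [0.15, 0.16, 0.14],
     [0.15, 0.14, 0.17]]

# Hypothetical combination weights, one per model, summing to 1
x = [0.2, 0.5, 0.3]

var_combo = quad_form(x, v)
se_combo = math.sqrt(var_combo)
print("variance of the combination:", var_combo)
print("standard error of the combination:", se_combo)
```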

ac_1

TomDoan wrote: "I'm not sure what the correlations have to do with anything"

Yes, agreed. %SIGMA and the MATRIX option of the VCV instruction both give a variance/covariance matrix.

TomDoan wrote: "The variance of a linear combination of those is x'Vx where V is the covariance matrix."

I think this is the correct way round, as per https://estima.com/tour/rats_basics__functions.shtml, resulting in a 50*50 non-symmetric matrix

Code:

* create an array from the entries of data series
make xerrors 2000:03 2012:04
# herrors
disp xerrors

* display variance/covariance matrix
disp v

* linear combination x'vx
comp lc_xerrors = xerrors * v * tr(xerrors)
disp lc_xerrors

How to proceed...?

TomDoan

No. That's not at all correct. The code you were working with was creating an optimal (in-sample) linear combination of the forecasts to create a single forecast, which will have a scalar variance.

ac_1

v is a 7*7 variance/covariance matrix, i.e. there are 7 models, each with 50 one-step-ahead OOS forecasts.

For a scalar variance, x should be a 7*1 vector, so x must be the optimal weights. Correct?

TomDoan

Right. That's computing a single linear combination of the forecasts that applies at each time period. (It computes it by minimizing the sum of squared errors across the training sample.)
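A sketch in Python (not RATS) of one way to obtain such weights. When the weights sum to one, the mean squared error of the combined forecast equals x'Vx, with V the matrix of average cross products of the individual errors. Minimizing it subject only to that constraint gives weights proportional to the solution of V z = 1. This is an assumption-laden illustration: the matrix is hypothetical, and LQPROG may impose additional restrictions (such as sign constraints) that this closed form does not.

```python
def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - s) / m[i][i]
    return z


def min_variance_weights(v):
    """Weights minimizing x'Vx subject to sum(x) = 1 (no sign constraints)."""
    z = solve(v, [1.0] * len(v))
    total = sum(z)
    return [zi / total for zi in z]


# Hypothetical 2-model covariance matrix of forecast errors
v = [[0.18, 0.15],
     [0.15, 0.16]]

x = min_variance_weights(v)
var_x = sum(x[i] * v[i][j] * x[j] for i in range(2) for j in range(2))
print("weights:", x)
print("variance of the combination:", var_x)
```

The same weight vector is applied at every time period, which is why the resulting variance is a single scalar.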

ac_1

Code:

* fweights
disp fweights

* weights via LQPROG
disp x

* combined forecasts using fweights
set fc_fweights 2000:03 2012:04 = (fweights(1)*fore(eqnkeys(1))) + $
                                  (fweights(2)*fore(eqnkeys(2))) + $
                                  (fweights(3)*fore(eqnkeys(3))) + $
                                  (fweights(4)*fore(eqnkeys(4))) + $
                                  (fweights(5)*fore(eqnkeys(5))) + $
                                  (fweights(6)*fore(eqnkeys(6))) + $
                                  (fweights(7)*fore(eqnkeys(7)))
prin 2000:03 2012:04 fc_fweights

* combined forecasts using LQPROG
set fc_LQPROG 2000:03 2012:04 = (x(1)*fore(eqnkeys(1))) + $
                                (x(2)*fore(eqnkeys(2))) + $
                                (x(3)*fore(eqnkeys(3))) + $
                                (x(4)*fore(eqnkeys(4))) + $
                                (x(5)*fore(eqnkeys(5))) + $
                                (x(6)*fore(eqnkeys(6))) + $
                                (x(7)*fore(eqnkeys(7)))
prin 2000:03 2012:04 fc_LQPROG


* display variance/covariance matrix
disp v

* scalar variance fweights linear combination fweights'vfweights
comp var_fweights = tr(fweights) * v * fweights
disp var_fweights
comp sigma_fweights = %sqrt(var_fweights)
disp sigma_fweights

* scalar variance LQPROG linear combination x'vx
comp var_LQPROG = tr(x) * v * x
disp var_LQPROG
comp sigma_LQPROG = %sqrt(var_LQPROG)
disp sigma_LQPROG

Using fweights the variance is 0.16694, and using the weights via LQPROG the variance is 0.16281.

The combined forecasts vary across the 50 time steps, but the square root of the combined forecast variance should be changing as well, as with STDERRS from UFORECAST.

How can the combined forecast SEs be calculated for each time period?

TomDoan

First of all, those strike me as being trivially different: none of these are so accurate that a 1% difference in standard errors should be much to bother about. But the other point is that you are trying to combine calculations which are done with very different attitudes towards model stability. Your forecasts and forecast errors are being generated using rolling regressions, but the "optimal" combinations of forecasts are being computed using a single optimization across all those forecast errors, which were generated assuming a single model isn't adequate. The problem is that there is simply not enough information to do separate optimizations at each data point: you have, after all, only one actual observation at each entry, and this is generating seven separate forecasts for it. You would need to make heroic assumptions to generate a series of separate full-rank covariance matrices at each data point, and the results probably wouldn't be much different in the end.
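With a single pooled covariance matrix, the combined standard error is one number, so any interval built from it has the same width at every period. A minimal sketch in Python (not RATS), assuming normally distributed, unbiased combined forecast errors. The variance 0.16281 is the LQPROG figure reported earlier in the thread; the point forecasts and the function name are hypothetical.

```python
import math
from statistics import NormalDist


def constant_width_intervals(point_forecasts, variance, coverage=0.95):
    """Symmetric normal intervals fc_t +/- z*sigma, same sigma for every t."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)
    half_width = z * math.sqrt(variance)
    return [(fc - half_width, fc + half_width) for fc in point_forecasts]


var_lqprog = 0.16281            # combined variance reported in the thread
combined = [1.20, 0.95, 1.10]   # hypothetical combined point forecasts

intervals = constant_width_intervals(combined, var_lqprog)
for fc, (lo, hi) in zip(combined, intervals):
    print(f"{fc:.2f}  [{lo:.3f}, {hi:.3f}]")
```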

ac_1

Plotting the 2 combination forecasts against the 7 models' forecasts shows smoother forecasts, as one would expect.

The OOS RMSEs are:
ar7: 0.42758414
ar6: 0.42056211
ar2: 0.40841082
ar127: 0.40844838
ar1ma1: 0.40697306
ar2ma1: 0.41691831
ar2ma17: 0.40659699

fc_fweights: 0.40857999
fc_LQPROG: 0.40350209

The aim is to have combination models applicable to:
- one-step-ahead forecasts
- multi-step-ahead forecasts
giving mean forecasts and PIs:
- not only for OOS points 1 to 50 (as in this example)
- but also for TOOS (true out-of-sample), i.e. from the 51st point onwards.

To simplify the task, assume just 2 models A and B, "equally" weighted, OOS and TOOS.
Assume normality, so the mean forecast is (0.5*forecastA + 0.5*forecastB). Also assume independence of the forecast distributions of A and B, i.e. covariance=0 at each forecast point. Is the standard error of the combined forecast then not just the square root of the sum of the squared STDERRS of models A and B, each multiplied by its squared weight: sqrt(0.5^2*STDERRSA^2 + 0.5^2*STDERRSB^2)? If so, it is easy to generate the PIs. Or is that unrealistic?

TomDoan

Didn't you just compute a covariance matrix of the forecast errors? Aren't they quite far from being "independent"? (I would think they might be >.90 correlated, aren't they?) So any calculation as if they are independent would be highly misleading. Note also that the weights for short-term forecasts are likely to be different from the weights for long-term forecasts, probably by quite a bit. (Simpler models often dominate in accuracy for longer range forecasts).
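To see how misleading the independence assumption can be, here is a minimal sketch in Python (not RATS) comparing the standard error of an equally weighted two-model combination at correlation 0 and at correlation 0.95. The standard errors of 0.41 are hypothetical, and the function name is made up.

```python
import math


def combo_se(w_a, se_a, se_b, rho):
    """Standard error of w_a*errA + (1 - w_a)*errB with correlation rho."""
    w_b = 1.0 - w_a
    var = (w_a * se_a) ** 2 + (w_b * se_b) ** 2 \
        + 2.0 * w_a * w_b * rho * se_a * se_b
    return math.sqrt(var)


se_a = se_b = 0.41   # hypothetical individual forecast standard errors

se_indep = combo_se(0.5, se_a, se_b, 0.0)    # independence assumed
se_corr = combo_se(0.5, se_a, se_b, 0.95)    # highly correlated errors

print("SE assuming independence:", se_indep)
print("SE with correlation 0.95:", se_corr)
```

With equal standard errors, independence shrinks the combined standard error by a factor of sqrt(0.5), while at correlation 0.95 it is almost unchanged, so intervals built under independence would be far too narrow.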

ac_1

Yes, the forecast errors are highly correlated: 0.94798 to 0.99933 (and obviously 1). But I do not want to 'fit' to the OOS forecasts, and then there are the TOOS forecasts, where forecast errors cannot be calculated. That is why I'd like to use the forecast STDERRS of the n component models.

As you say, the problem is that there is simply not enough information, as I would need to calculate covariance matrices at each OOS data point.

Back to the question: how can Combination forecast PIs (equally weighted or optimized) be calculated, OOS and TOOS, taking into account the forecast errors, without 'fitting' to the OOS forecasts?