Out-of-sample forecasts
Hi Tom,
I am having trouble using SMPL with FORECAST. I am trying to do "pretend" out-of-sample forecasts within the sample I used for ESTIMATE, in order to back-test the model. However, the forecast results seem too good to be true, so I suspect that post-sample values, which should not be used, are in fact being used to compute the forecast. Here is a simplified example of my programme:
1) I first do OPEN DATA, ALLOCATE and ESTIMATE over the full sample size (1959:1 to 2009:6)
2) SMPL 1959:1 2007:12
3) FORECAST(from=2008:1, to=2008:12)
I was hoping this would give me the forecasts that my model would have made back in December 2007. However, as I said, the results seem to incorporate post-2007 data. Any thoughts on this?
Thanks a lot,
Raphael
Re: Out-of-sample forecasts
I'm not sure what you expected this code to do, but it is definitely not going to give you forecasts based only on data through 2007.
You probably wanted to put the SMPL instruction before the estimation step, rather than after.
Note that this:
"1) I first do OPEN DATA, ALLOCATE and ESTIMATE over the full sample size (1959:1 to 2009:6)"
indicates that you estimated the model using data through 2009:6, so that's what your forecasts are going to be based on. That's definitely not going to give you the same result as if you had estimated the model using only data through December of 2007.
If you do this after the estimation:
2) SMPL 1959:1 2007:12
3) FORECAST(from=2008:1, to=2008:12)
the SMPL isn't really having any effect. You can use SMPL to control the forecast range, but here you are using the FROM and TO options to specify the range. That's going to override any SMPL setting.
Sounds like you want to limit your estimation period to end in 2007:12. You could do that using the START and END parameters on your estimation instruction, or by doing:
SMPL 1959:1 2007:12
or
SMPL * 2007:12
before the estimation instruction.
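To illustrate the point outside RATS, here is a minimal Python sketch (not RATS code). The simulated AR(1) series, the helper name `ols_ar1`, and the true coefficient 0.8 are all assumptions made up for the illustration. It estimates the same one-lag model once on the full sample and once on a sample that stops at the cutoff, which is the analogue of restricting the estimation range to end in 2007:12.

```python
import numpy as np

def ols_ar1(y):
    """OLS estimate of b in y(t) = b*y(t-1) + u(t), no constant (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    lagged, current = y[:-1], y[1:]
    return float(lagged @ current / (lagged @ lagged))

rng = np.random.default_rng(0)

# 1959:1-2009:6 is 606 monthly observations; 1959:1-2007:12 is the first 588.
n, cutoff = 606, 588

# Simulated data standing in for the real series (assumption: AR(1) with b = 0.8).
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.8 * y[t - 1] + rng.normal()

b_full = ols_ar1(y)            # uses every observation, including post-cutoff data
b_trunc = ols_ar1(y[:cutoff])  # uses only observations up to the cutoff

print(f"b (full sample):      {b_full:.4f}")
print(f"b (truncated sample): {b_trunc:.4f}")
```

The two estimates generally differ, so forecasts built from the full-sample coefficient are not the ones the model would have produced with data ending at the cutoff.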
Re: Out-of-sample forecasts
Sorry, I did not manage to explain what I meant to ask the first time. I'll try again...
Let's say my VAR is y(t) = b.y(t-1) + u(t). I do actually want to estimate the model parameters (the betas of the equations) over the full sample.
However, when computing the forecast for 2008, i.e. ^y(2008), I want to make sure that no y(t>2007:12) are used. Only y(t<2008) or ^y(t>2007) (forecasts of y) should be used in computing ^y. Can SMPL be used for that purpose?
Hope I managed to be clearer this time...
Thanks for your help.
Re: Out-of-sample forecasts
As I noted above, you can use an SMPL instruction to control the forecast range, but it's usually not the best option, since it will also change the default range used by any other subsequent instruction.
If you simply do this:
FORECAST(from=2008:1, to=2008:12)
without an SMPL instruction, you will get dynamic forecasts starting in 2008. Assuming your model is a pure VAR with no exogenous variables (other than a constant), no actual values later than 2007:12 will be used in computing the forecasts.
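The recursion behind a dynamic forecast can be sketched in a few lines of Python (again, not RATS code). The helper `dynamic_forecast`, the toy series, and the coefficient 0.9 are hypothetical and chosen only for illustration. For the one-lag model y(t) = b.y(t-1) + u(t), each forecast step is built from the previous forecast, so the only actual value that enters is the one at the forecast origin.

```python
import numpy as np

def dynamic_forecast(y, b, origin, steps):
    """Dynamic forecasts for y(t) = b*y(t-1) + u(t) (hypothetical helper).

    `origin` is the index of the last actual value the forecast may use.
    """
    prev = float(y[origin])
    out = []
    for _ in range(steps):
        prev = b * prev  # each step feeds on the previous forecast, not on an actual value
        out.append(prev)
    return np.array(out)

y = np.array([1.0, 0.5, 2.0, 1.5, 3.0, 2.5, 4.0])  # toy data
b = 0.9                                            # coefficient taken as given
origin = 3                                         # last observation allowed

f_original = dynamic_forecast(y, b, origin, steps=3)

# Overwrite every post-origin actual value; the forecasts should not move.
y_changed = y.copy()
y_changed[origin + 1:] = 999.0
f_changed = dynamic_forecast(y_changed, b, origin, steps=3)

print(f_original)
print(np.array_equal(f_original, f_changed))
```

Note that `b` is simply taken as given here. If it was estimated over the full sample, it still reflects the later data, even though no later values of y enter the recursion.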