DMARIANO - Diebold-Mariano test
@DMARIANO is a procedure for computing Diebold-Mariano forecast comparison tests. A companion procedure for the Granger-Newbold forecast comparison test is available as the @GNEWBOLD procedure.
Detailed Description
Note: The Diebold-Mariano test should not be applied to situations where the competing models are nested (examples of nested models: AR(1) vs AR(2), no change vs any ARIMA(p,1,q)). An alternative testing procedure for those situations is provided in clarkforetest.src, which implements the Clark-McCracken test from Clark, Todd E., and Michael W. McCracken, 2001, "Tests of Equal Forecast Accuracy and Encompassing for Nested Models," Journal of Econometrics 105 (Nov.), pp. 85-110.
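For orientation, here is a minimal sketch of the calling convention, based on the calls that appear later in this thread (actual, f1, and f2 are placeholder names for the realized series and the two competing forecast series):
* one-step-ahead forecasts
@dmariano actual f1 f2
* 12-step-ahead forecasts: set LAGS to the horizon minus one; the truncated lag
* window is needed if you want the Harvey et al. small-sample adjustment discussed below
@dmariano(lags=11,lwindow=truncated) actual f1 f2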
Re: DMARIANO - revision of Diebold-Mariano procedure
What are the reasons for this? "Note: The Diebold-Mariano test should not be applied to situations where the competing models are nested (examples of nested models: AR(1) vs AR(2), no change vs any ARIMA(p,1,q))."
Re: DMARIANO - revision of Diebold-Mariano procedure
The only interesting test in that case is the adequacy of the more restricted model. Under the null, the forecasts of the two models are asymptotically perfectly correlated, which causes the asymptotics of the test to collapse.
Re: DMARIANO - revision of Diebold-Mariano procedure
As Tom kindly and succinctly indicates, testing equal accuracy of forecasts from nested models involves some complexities relative to tests applied to forecasts from non-nested models or other sources. As Tom notes, the root of the problem is that, at the population level, if the null hypothesis is that the smaller model is the true one, the forecast errors from the competing models are exactly the same and perfectly correlated, which means that the numerator and denominator of a Diebold-Mariano test each converge to zero as the estimation sample and prediction sample grow.
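For reference, a standard statement of the statistic (not part of the original post): with squared-error loss, let $d_t = e_{1,t}^2 - e_{2,t}^2$ be the loss differential over the $n$ forecast observations. The Diebold-Mariano statistic is
\[
DM = \frac{\bar{d}}{\sqrt{\widehat{\mathrm{Var}}(\bar{d})}}, \qquad \bar{d} = \frac{1}{n}\sum_{t=1}^{n} d_t,
\]
where the variance in the denominator is a long-run (autocorrelation-consistent) estimate. The nested-model degeneracy described above is that, under the null, both $\bar{d}$ and this variance estimate tend to zero.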
That said, what has become clearer with recent research is that the choice of test statistic and source of critical values depends on what one wants to know. Are you interested in testing equal forecast accuracy at the population level, which is in turn a test of whether the small model is the true one? Or are you instead interested in testing equal accuracy in a finite sample? The former hypothesis is the world of the Clark-McCracken work Tom mentioned, under which the procedure he mentioned can be used to generate tests and critical values. The latter hypothesis is treated in more recent work by Clark and McCracken and by Giacomini and White (Econometrica, 2006). The latter form of hypothesis can be tested with bootstrap methods developed in the more recent work by C-M. Alternatively, as long as the forecasts are generated under the so-called rolling scheme, the latter hypothesis can be tested (with asymptotic justification) with a Diebold-Mariano statistic compared against standard normal critical values, as shown by Giacomini and White. If the forecasts are generated under a recursive scheme, the D-M test cannot be justified under the Giacomini-White asymptotics, but in simulated data, the D-M test performs even a bit better than it does under the rolling scheme.
With the intention of being helpful (as opposed to self-serving), I have attached a recent survey Mike McCracken and I wrote that describes this more recent work (and provides the detailed references) and offers some Monte Carlo evidence on the alternative inference approaches. Our conclusion is that, for testing equal accuracy in a finite sample, the proposed bootstrap is most accurate. Of course, it requires the coding of a bootstrap. A conventional Diebold-Mariano test has the advantage that it is simpler to obtain critical values, but the conventional test seems to be modestly less reliable. But at least the conventional D-M test is conservative when applied to short-horizon forecasts. If you conduct a DM test and find it rejects the small model in favor of the large, it is a good sign that the large model is truly more accurate in the finite sample.
Attachment: 14 Clark and McCracken Chapter.pdf
Todd Clark
Economic Research Dept.
Federal Reserve Bank of Cleveland
Re: DMARIANO - revision of Diebold-Mariano procedure
The adjustment suggested by Harvey et al. is a small-sample adjustment to the autocorrelation-consistent estimate of the variance entering the DM test. In the notation to which you referred, n refers to the number of forecast observations, and h refers to the forecast horizon (1 for 1-step ahead, 2 for 2-step ahead, etc.).
Note that the Harvey et al. adjustment is only appropriate if the variance is estimated with the so-called "truncated" estimator of the variance (in the procedure Tom Doan put together, you would use the option lwindow=truncated).
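For reference (not part of the original reply), the Harvey, Leybourne and Newbold (1997) adjustment rescales the DM statistic and compares the result with a Student-t rather than a standard normal distribution:
\[
DM^{*} = \sqrt{\frac{n + 1 - 2h + h(h-1)/n}{n}}\; DM, \qquad DM^{*} \text{ compared against } t_{n-1},
\]
with $n$ the number of forecast observations and $h$ the forecast horizon, as defined above.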
Todd Clark
Economic Research Dept.
Federal Reserve Bank of Cleveland
Re: DMARIANO - revision of Diebold-Mariano procedure
You shouldn't have to look up critical values---the whole point of using the %ttest function is to get the significance level so you don't have to check tables. However, that looks like a typo, since the other numbers are +/-10.xxxx, not +/-1.xxxx.
test stat    P(DM>X)
  -1.0220    0.00004
   1.0220    0.99996
Here I accept H0 because 1.0220 is smaller than the critical value of 1.645 at the 10% significance level.
If you're asking which you want, it's the first. Positive numbers cast doubt on a set of forecasts. The first of these will clearly indicate that the first set of forecasts is better than the second. The null in the first line is that f1=f2 vs f2 better than f1; and the null in the second is that f1=f2 vs f1 better than f2.
test stat    P(DM>X)
-10.0220 0.99996
10.0220 0.00004
or
test stat P(DM>X)
-10.0220 0.00004
10.0220 0.99996
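As a worked reading of the first pair above: the second line, test statistic 10.0220 with P(DM>x)=0.00004, rejects the null of equal accuracy in favor of the first set of forecasts being better than the second; the first line, -10.0220 with p-value 0.99996, is the same comparison from the other direction and is in the wrong tail to reject.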
Re: DMARIANO - revision of Diebold-Mariano procedure
Thanks, Tom
As you said ‘If you're asking which you want, it's the first. Positive numbers cast doubt on a set of forecasts. The first of these will clearly indicate that the first set of forecasts is better than the second. ’
Does that mean we only look at the forecast that produces a negative test statistic? Given a negative test statistic, do we accept H1 when the p-value is greater than 0.95 or less than 0.05, and fail to reject H0 when it falls between those values?
Re: DMARIANO - revision of Diebold-Mariano procedure
What you have are two separate one-tailed tests, not one two-tailed test. The only case where you might reject one forecasting procedure in favor of the other is where the difference in the test statistics is positive; where it's negative, you're in the wrong tail to reject.
Re: DMARIANO - revision of Diebold-Mariano procedure
Tom, I have a question about how to interpret the results of the Diebold and Mariano test.
Diebold-Mariano Forecast Comparison Test
Forecasts of X1 over 2006:01 to 2010:12
Forecast MSE Test Stat P(DM>x)
NAIVE1 5.66636695 1.0928 0.13724
M1P1 3.94101250 -1.0928 0.86276
Here in the first line H0) NAIVE1 = M1P1 and H1) M1P1 Better than NAIVE1.
Therefore accept the hypothesis that M1P1 is better than NAIVE1.
But the second line of the test is:
H0) NAIVE1 = M1P1 and H1) NAIVE1 Better Than M1P1.
Therefore accept the hypothesis that NAIVE1 is better than M1P1.
Given this, how can I interpret which result to use to say that one forecast is better than the other?
Regards
W
Re: DMARIANO - revision of Diebold-Mariano procedure
Not at conventional significance levels, but there's certainly some evidence of it.
TWG wrote: Tom, I have a question about how to interpret the results of the Diebold and Mariano test.
Diebold-Mariano Forecast Comparison Test
Forecasts of X1 over 2006:01 to 2010:12
Forecast MSE Test Stat P(DM>x)
NAIVE1 5.66636695 1.0928 0.13724
M1P1 3.94101250 -1.0928 0.86276
Here in the first line H0) NAIVE1 = M1P1 and H1) M1P1 Better than NAIVE1.
Therefore accept the hypothesis that M1P1 is better than NAIVE1.
TWG wrote: But the second line of the test is:
H0) NAIVE1 = M1P1 and H1) NAIVE1 Better Than M1P1.
Therefore accept the hypothesis that NAIVE1 is better than M1P1.
Why would you reject H0? The p-value is huge.
TWG wrote: Given this, how can I interpret which result to use to say that one forecast is better than the other?
Regards
W
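In numbers (a restatement of the output above, not part of the original reply): the NAIVE1 line tests H0) NAIVE1 = M1P1 against H1) M1P1 better than NAIVE1, and its one-sided p-value is 0.13724, which is above the usual 5% and 10% cutoffs, so M1P1's lower MSE is suggestive but not significant at conventional levels.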
Re: DMARIANO - revision of Diebold-Mariano procedure
Tom, thanks very much for the quick reply. Another, more basic question: when I compare forecasts 12 steps ahead, is the correct approach to include lags, like this:
@dmariano(lags=11) GDP naive12 M1P12
Or is this not necessary?
Re: DMARIANO - revision of Diebold-Mariano procedure
The proper use of this is with LAGS=11 for 12-step-ahead forecasts.
TWG wrote: Tom, thanks very much for the quick reply. Another, more basic question: when I compare forecasts 12 steps ahead, is the correct approach to include lags, like this:
@dmariano(lags=11) GDP naive12 M1P12
Or is this not necessary?
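Background, not stated in the original reply: errors from h-step-ahead forecasts are typically serially correlated up to order h-1 (roughly an MA(h-1) process), which is why LAGS is set to the horizon minus one. As an illustration with placeholder series names:
* 4-step-ahead forecasts, so allow h-1 = 3 lags of autocorrelation in the loss differential
@dmariano(lags=3) actual f1 f2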
Re: DMARIANO - revision of Diebold-Mariano procedure
Tom, for the Granger-Newbold forecast comparison test, to compare forecasts 12 steps ahead, is the syntax the same?
@GNewbold(lags=11) GDP naive12 M1P12
Regards
W
Re: DMARIANO - revision of Diebold-Mariano procedure
No. The Granger-Newbold test doesn't allow for serial correlation in the forecast errors.
TWG wrote: Tom, for the Granger-Newbold forecast comparison test, to compare forecasts 12 steps ahead, is the syntax the same?
@GNewbold(lags=11) GDP naive12 M1P12
Regards
W
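A usage note inferred from the reply above rather than stated in it: because the Granger-Newbold test assumes serially uncorrelated forecast errors, it is effectively a test for one-step-ahead forecasts and would be called without a LAGS option, e.g.
@gnewbold GDP naive1 M1P1
For multi-step horizons, the Diebold-Mariano test with LAGS=h-1 is the one that allows for serial correlation in the forecast errors.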
Re: DMARIANO - Diebold-Mariano test (revised)
Are there any examples of using the clarkforetest.src procedure? I'm trying to use it, so far unsuccessfully. I want to compare two nested models. The first equation is a random walk, while the second is a VECM with a cointegration vector of [1,-1].
When I do:
@forecastproc(scheme=2) 2002:01 libor1mus
#libor1mus{1}
#constant libor1mus-emratentl dlibor1mus{1 to 3} demratentl{1 to 3}
I get the error message "## SR3. Tried to Use Series Number 142, Only 124 Are Available".
In fact, looking at the procedure, I think it does the forecasting itself, while I already have the forecasts I'm interested in comparing. Is there a simpler way to obtain the Clark and McCracken (2001) test statistic, along the lines of what is done in the DMARIANO.SRC procedure, by just inputting the actual series and the two forecast series?