Model of the US Economy: CointegratedVARModelHandbook
Hi Tom,
I'd like to build a model of the US economy using the methodology implemented in the Cointegrated VAR Model handbook (https://estima.com/catsinfo.shtml). Unfortunately I do not have CATS to test for cointegration; can I use @JOHMLE (https://estima.com/webhelp/topics/johmleprocedure.html) instead?
The aim is quarterly forecasts (updated monthly), with fan charts and bootstrapping, for k variables representative of the US economy, such as those used by the MPC and/or chosen from https://tradingeconomics.com/united-states/indicators
(i) The choice of variables k is all-important (there's no theory) - any advice on those e.g. how many and which ones?
(ii) Largest lag length can be calculated from @VARLagSelect.
Using @BJTRANS and various tests for (non-)stationarity (@DFUNIT and @KPSS), I may end up with a system in which the variables have different transformations (none/log/sqrt), and in which some variables are I(1) and others I(0).
(a) Does it make sense to transform e.g. FEDFUNDS using sqrt?
(b) Can I use I(1) and I(0) variables and model as a cointegrated VAR i.e. VECM? How would I(0) variables be handled in the modelling procedure e.g. do I include the I(0) with the I(1) in @JOHMLE, what about in SYSTEM?
(c) Can I calculate the inverse roots of the characteristic equations for a VAR and VECM, as in univariate ARIMA modelling?
(d) What do the eigenvalues represent in a cointegrated VAR? Do they represent the equivalent of (inverse) roots as in the univariate case?
(e) In RATS to compute fitted values I can SET fitted(k) = series(k) - resids(k) for each of the k variables, but why is PRJ not applicable to SYSTEM?
thanks,
Amarjit
Re: Model of the US Economy: CointegratedVARModelHandbook
Regarding the roots, you already asked about that earlier this year: https://www.estima.com/forum/viewtopic. ... 143#p19143. It didn't make sense then, and still doesn't.
Obviously, the first thing you have to do is determine which variables are of particular interest. You are probably going to find a broader set of alternative models in the VAR/Bayesian VAR literature than in the cointegration literature. If you are not doing a Bayesian model, probably six to eight will be the practical limit on the number.
You can mix I(0) and I(1) in a model. An I(0) variable is "cointegrated with itself" so it increases the cointegrating rank by 1.
Preliminary transformations should be made based upon how the series fits into the model as a whole. (That's discussed on the first full page of the VAR chapter in the User's Guide).
PRJ is an explicitly univariate instruction which does quite a few explicitly univariate calculations. For a system, you can do FORECAST(STATIC) if you want the in-sample fitted values.
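The "cointegrated with itself" remark can be illustrated with a minimal sketch in Python (not RATS, and not from the thread; the matrices and names below are made-up assumptions). For a VAR(1) in levels, X_t = A X_{t-1} + e_t, the cointegrating rank is the rank of Pi = A - I. Adding an unrelated stationary AR(1) series to a pair of cointegrated I(1) series raises that rank from 1 to 2, because the unit vector picking out the stationary series is itself a "cointegrating" vector.

```python
def matrix_rank(m, tol=1e-10):
    """Rank of a small dense matrix by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    rows, cols = len(a), len(a[0])
    rank = 0
    for c in range(cols):
        if rank == rows:
            break
        pivot = max(range(rank, rows), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < tol:
            continue                      # no usable pivot in this column
        a[rank], a[pivot] = a[pivot], a[rank]
        for r in range(rank + 1, rows):
            f = a[r][c] / a[rank][c]
            for k in range(c, cols):
                a[r][k] -= f * a[rank][k]
        rank += 1
    return rank


def pi_matrix(A):
    """Pi = A - I for a VAR(1) in levels, X_t = A X_{t-1} + e_t."""
    n = len(A)
    return [[A[i][j] - (1.0 if i == j else 0.0) for j in range(n)]
            for i in range(n)]


# Two I(1) series sharing one cointegrating relation (x1 - x2 is stationary).
A_two_i1 = [[0.8, 0.2],
            [0.1, 0.9]]

# The same pair plus an unrelated stationary AR(1) series (coefficient 0.5).
A_with_i0 = [[0.8, 0.2, 0.0],
             [0.1, 0.9, 0.0],
             [0.0, 0.0, 0.5]]

rank_two_i1 = matrix_rank(pi_matrix(A_two_i1))
rank_with_i0 = matrix_rank(pi_matrix(A_with_i0))
print(rank_two_i1, rank_with_i0)   # prints: 1 2
```

In estimated systems the rank comes from a test rather than from a known Pi, but the accounting is the same: the stationary series shows up as one extra unit of rank.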
Re: Model of the US Economy: CointegratedVARModelHandbook
TomDoan wrote: ↑Tue Sep 17, 2024 1:09 pm
> Regarding the roots, you already asked about that earlier this year: https://www.estima.com/forum/viewtopic. ... 143#p19143. It didn't make sense then, and still doesn't.
The confusion is between the e-values for the inverse roots, and those in the Johansen Procedure (EigVal).
TomDoan wrote: ↑Tue Sep 17, 2024 1:09 pm
> Obviously, the first thing you have to do is determine which variables are of particular interest. You are probably going to find a broader set of alternative models in the VAR/Bayesian VAR literature than in the cointegration literature. If you are not doing a Bayesian model, probably six to eight will be the practical limit on the number.
Until now I have chosen five variables. I'd like to try Bayesian estimation as-well - appreciate any recommendations for literature where computations can efficiently be handled in RATS.
Does that mean in:
@JohMLE I use only the I(1) variables,
and then in
SYSTEM()
VARIABLES include the I(0) variables, along with the I(1)?
The series chosen thus far are (by eye) non-trending, even though the (non-)stationarity tests @DFUNIT and @KPSS say I have a mixture of I(1) and I(0) series. Also, following UG-208 rather than @BJTRANS, I have used no transformations.
Re: Model of the US Economy: CointegratedVARModelHandbook
ac_1 wrote: ↑Thu Sep 19, 2024 6:19 am
> The confusion is between the e-values for the inverse roots, and those in the Johansen Procedure (EigVal).
They are completely different calculations. Again, you seemed to have learned something very wrong about roots.
ac_1 wrote: ↑Thu Sep 19, 2024 6:19 am
> Until now I have chosen five variables. I'd like to try Bayesian estimation as-well - appreciate any recommendations for literature where computations can efficiently be handled in RATS.
See the GIBBSVARBUILD.RPF example.
No. It means what I said it means. If you include I(0) variables in a cointegration test, each one increases the cointegrating rank by 1.
You're doing a model of the US Economy without any trending variables or variable that require log transformations? What are you using?
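To see why the two sets of "eigenvalues" are different calculations, here is a minimal Python sketch (not RATS; the coefficient matrix, sample size and seed are illustrative assumptions). It uses the simplest case, a bivariate VAR(1) with no deterministic terms. The dynamic roots are the eigenvalues of the coefficient (companion) matrix. The Johansen eigenvalues are squared canonical correlations between dX_t and X_{t-1}, so they lie between 0 and 1 and feed the trace test; they are not roots of the characteristic equation.

```python
import math
import random


def eig2(m):
    """Eigenvalues of a 2x2 matrix whose eigenvalues are real, largest first."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return [(tr + root) / 2.0, (tr - root) / 2.0]


def inv2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def mul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def moment(u, v):
    """Sample moment matrix (1/T) * sum_t u_t v_t' for lists of 2-vectors."""
    T = len(u)
    return [[sum(x[i] * y[j] for x, y in zip(u, v)) / T for j in range(2)]
            for i in range(2)]


# Assumed VAR(1): A = I + alpha*beta', alpha = (-0.2, 0.1)', beta = (1, -1)'.
A = [[0.8, 0.2],
     [0.1, 0.9]]

# (1) Dynamic roots: eigenvalues of the coefficient (companion) matrix.
companion_roots = eig2(A)   # one unit root and one stable root

# (2) Johansen eigenvalues from simulated data.
rng = random.Random(0)
x = [[0.0, 0.0]]
for _ in range(500):
    prev = x[-1]
    x.append([A[i][0] * prev[0] + A[i][1] * prev[1] + rng.gauss(0.0, 1.0)
              for i in range(2)])
dx = [[x[t][i] - x[t - 1][i] for i in range(2)] for t in range(1, len(x))]
lag = x[:-1]
S00, S11 = moment(dx, dx), moment(lag, lag)
S01, S10 = moment(dx, lag), moment(lag, dx)
johansen_eigs = eig2(mul2(mul2(inv2(S11), S10), mul2(inv2(S00), S01)))

print("companion roots:", companion_roots)
print("Johansen eigenvalues:", johansen_eigs)
```

The companion roots describe the dynamics (here one unit root and one root of 0.7), while the Johansen eigenvalues measure how strongly linear combinations of the levels predict the changes.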
Re: Model of the US Economy: CointegratedVARModelHandbook
Sorry, I don't understand. Which variables, I(1) and/or I(0), do I use in @JohMLE? And how do I specify all the variables in SYSTEM(); VARIABLES?
Variables thus far, non-trending (by-eye): A measure of gdp growth, a couple of y/y inflation measures, an interest rate, ... , any other suggestions?
Re: Model of the US Economy: CointegratedVARModelHandbook
Read the 4th paragraph ("Note that...") on https://estima.com/webhelp/topics/testing-cointegration.html.
I'm not sure why you would be looking at the Juselius book for what you are trying to do. Her example is an examination of money demand: real money, real GDP, inflation and two interest rates. It's not designed to predict anything but especially not GDP.
Re: Model of the US Economy: CointegratedVARModelHandbook
I'm inclined to select variables that are relevant to the markets e.g. one is interested in GDP Growth Rate QoQ (with fans and bootstrapped),
rather than GDP Price Index QoQ. On plotting the Danish data in level in CointegratedVARModelHandbook the obvious question is:
do multivariate models perform better with a mixture of trending and non-trending series, or only trending, or just non-trending?
Also, a multivariate model may not be able to compete with a univariate model w.r.t. IID residuals - and on "exploratory analysis"
with series I cannot flatten all the ACF's. In dieb3p265.rpf the Starts and Completions look very similar to each other, and both the
residual correlograms are satisfactory. My aim leads to a more complex problem. But multivariate models are useful for analyzing the
interrelationships b/w series: thus far IRF's look very good, like these https://estima.com/webhelp/topics/impulsesrpf.html.
The point I am making is the choice of series could be "a fit", and therefore poor TOOS forecasts.
(i) With I(0) variables it is not spurious to calculate correlations b/w I(0) vs. I(0), but as I have mostly I(1) variables, is it wrong to
calculate correlations b/w I(1) vs. I(1), and I(1) vs. I(0)? The reason being I have read: opt for only variables that are correlated (or interrelated)
as they can provide mutual predictive value.
(ii) A VECM(11) with all EC terms is equivalent to a VAR(12) in levels. Hence, a VECM with zero lagged differences and all EC terms is equivalent to a VAR(1) in levels, what do I list in LAGS in SYSTEM?
(iii) I'd also like to implement regularization techniques w.r.t VARS: some zero coeffs, reducing the magnitude/size of the coeffs; and Bayesian gibbsvarbuild.rpf.
Re: Model of the US Economy: CointegratedVARModelHandbook
ac_1 wrote: ↑Thu Sep 26, 2024 2:37 pm
> (i) With I(0) variables it is not spurious to calculate correlations b/w I(0) vs. I(0), but as I have mostly I(1) variables, is it wrong to calculate correlations b/w I(1) vs. I(1), and I(1) vs. I(0)? The reason being I have read: opt for only variables that are correlated (or interrelated) as they can provide mutual predictive value.
Correlations? I have no idea what you mean. Static correlations make no sense at all and are not useful in what you are doing. That's the spurious regressions result---the statistical correlation between even uncorrelated I(1) variables can be almost anything.
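The spurious-correlation point can be checked with a small simulation. This is a minimal Python sketch (not RATS; sample size, replication count and seed are arbitrary assumptions). It generates pairs of series that are independent by construction and compares the average absolute sample correlation for random walks (I(1)) with that for white noise (I(0)).

```python
import math
import random


def correlation(u, v):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def random_walk(rng, n):
    """Cumulated Gaussian white noise, i.e. a driftless I(1) series."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def white_noise(rng, n):
    """Gaussian white noise, an I(0) series."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def mean_abs_corr(make_series, reps=200, n=200, seed=1):
    """Average |correlation| between pairs of independently generated series."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        total += abs(correlation(make_series(rng, n), make_series(rng, n)))
    return total / reps


mean_abs_corr_i1 = mean_abs_corr(random_walk)
mean_abs_corr_i0 = mean_abs_corr(white_noise)
print(round(mean_abs_corr_i1, 2), round(mean_abs_corr_i0, 2))
```

For the white-noise pairs the correlations cluster near zero; for the random-walk pairs they are spread widely even though the series are unrelated, which is why a static correlation between I(1) series says little about predictive value.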
For a VECM, the LAGS shows the lags in the overall VAR, so LAGS 1.
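The lag bookkeeping behind "LAGS 1" can be sketched in Python (not RATS; the function names and numbers are illustrative assumptions). A VAR(p) in levels, X_t = A_1 X_{t-1} + ... + A_p X_{t-p}, rewrites as dX_t = Pi X_{t-1} + Gamma_1 dX_{t-1} + ... + Gamma_{p-1} dX_{t-p+1}, with Pi = A_1 + ... + A_p - I and Gamma_i = -(A_{i+1} + ... + A_p). So p lags in levels leave p-1 lagged differences, and p = 1 leaves none.

```python
def var_to_vecm(A):
    """Map level-VAR matrices [A_1..A_p] to (Pi, [Gamma_1..Gamma_{p-1}])."""
    p, n = len(A), len(A[0])
    Pi = [[sum(A[k][i][j] for k in range(p)) - (1.0 if i == j else 0.0)
           for j in range(n)] for i in range(n)]
    Gammas = [[[-sum(A[k][i][j] for k in range(m + 1, p)) for j in range(n)]
               for i in range(n)] for m in range(p - 1)]
    return Pi, Gammas


def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def var_predict(A, hist):
    """One-step prediction from the levels VAR; hist[-1] is X_{t-1}."""
    out = [0.0] * len(A[0])
    for k, Ak in enumerate(A):
        out = [o + t for o, t in zip(out, matvec(Ak, hist[-1 - k]))]
    return out


def vecm_predict(Pi, Gammas, hist):
    """Same prediction from the error-correction form."""
    last = hist[-1]
    out = [x + c for x, c in zip(last, matvec(Pi, last))]
    for m, G in enumerate(Gammas):
        d = [a - b for a, b in zip(hist[-1 - m], hist[-2 - m])]   # dX_{t-m-1}
        out = [o + t for o, t in zip(out, matvec(G, d))]
    return out


# Made-up VAR(3) coefficients and a three-period history.
A3 = [[[0.5, 0.1], [0.0, 0.6]],
      [[0.2, 0.0], [0.1, 0.1]],
      [[0.1, 0.05], [0.0, 0.2]]]
history = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.1]]

Pi3, Gammas3 = var_to_vecm(A3)
pred_var = var_predict(A3, history)
pred_vecm = vecm_predict(Pi3, Gammas3, history)

# A VAR(1) in levels leaves no lagged differences in the VECM.
Pi1, Gammas1 = var_to_vecm([A3[0]])
print(len(Gammas3), len(Gammas1))   # prints: 2 0
```

The two prediction functions agree up to rounding, which is the sense in which the error-correction form and the levels VAR are the same model written two ways.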
Re: Model of the US Economy: CointegratedVARModelHandbook
Have you looked at the section on VAR's in the User's Guide (or help---basically the same thing now)? Textbook examinations of VAR's are generally fairly limited as they typically are just making known that VAR's exist rather than actually doing anything significant. The VAR's that are used in the manual are intended as serious examples and are either taken from or based upon serious examples.
Re: Model of the US Economy: CointegratedVARModelHandbook
TomDoan wrote: ↑Fri Sep 27, 2024 1:04 pm
> Have you looked at the section on VAR's in the User's Guide (or help---basically the same thing now)? Textbook examinations of VAR's are generally fairly limited as they typically are just making known that VAR's exist rather than actually doing anything significant. The VAR's that are used in the manual are intended as serious examples and are either taken from or based upon serious examples.
varlag.rpf, the residual ACF's using @REGCORRS are very good: flat, except for logi (5th lag), logm2 (4th lag).
I'll look at the others.
In a quarterly VAR with macro variables, is there a potential for 'look-ahead bias' if I were to update the model at a monthly frequency, i.e. as soon as data are released? For example, GDP is released quarterly at the end of the month (the growth rate is then calculated from the index), with monthly revisions, and inflation (calculated from the index) is typically released in the middle of every month. I am thinking it would be safer to update only at the end of every quarter. Is that reasonable?
Re: Model of the US Economy: CointegratedVARModelHandbook
I'm not sure I understand the concern about "bias". You fit the model based upon historical information using complete quarters. If you want to forecast Q4 before you have all the Q3 data, you're just doing a form of conditional forecasting; conditional on the observed values of the data which are available.
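The idea of forecasting conditional on whatever has already been observed can be shown in its simplest form. This is a minimal Python sketch under stated assumptions (two series, one step ahead, jointly normal forecast errors); it is not the RATS conditional forecasting procedure, and all numbers are made up. Once series 1 is observed, the forecast of series 2 moves by the forecast error in series 1 times the regression coefficient implied by the forecast-error covariance matrix.

```python
def conditional_forecast(f1, f2, s11, s12, s22, y1_observed):
    """Update the forecast of series 2 after series 1 has been observed.

    f1, f2        : unconditional one-step forecasts of the two series
    s11, s12, s22 : elements of the one-step forecast-error covariance matrix
    Returns the conditional mean and conditional variance of series 2.
    """
    gain = s12 / s11
    mean = f2 + gain * (y1_observed - f1)
    variance = s22 - s12 * s12 / s11
    return mean, variance


# Hypothetical example: inflation (series 1) is already published for the
# quarter and came in above its forecast; GDP growth (series 2) is not yet out.
cond_mean, cond_var = conditional_forecast(
    f1=2.0, f2=0.5, s11=0.25, s12=0.10, s22=0.16, y1_observed=2.5)
print(cond_mean, cond_var)
```

If the observed value equals its forecast, the conditional forecast collapses to the unconditional one, and the conditional variance is never larger than the unconditional variance.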
Re: Model of the US Economy: CointegratedVARModelHandbook
In a static one-step ahead do loop, using the Trace tests @johMLE suggests a cointegrating rank equalling 2, but as I progress through the quarters that increases to 3. Also, I have to include 1 extra ect for the I(0) variable - right?
In SYSTEM how do I dynamically change the ect variables selected for the ectmodel within the loop?
And which 3 (or 4) ect's do I choose from the 5 ect's?
Re: Model of the US Economy: CointegratedVARModelHandbook
You're basing a forecasting model on what you gleaned from a 400+ page book which mentions "forecast" only a few times, mainly to indicate how certain models would create *bad* forecasts. What you are proposing to do seems like a really bad idea. If you are going to do that (strongly advise against), use the rank from the full data set, where you would have the most power.
The +1 due to a stationary variable will be reflected in the rank that you get from the test procedure.
If you decide the cointegrating rank is 3, you take the eigenvectors from the largest 3 eigenvalues. If it's 4, you take the eigenvectors from the largest 4 eigenvalues.
You can re-do the SYSTEM definition inside the loop.
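The step from eigenvalues to a rank can be sketched in Python (not RATS; the eigenvalues are made up and the critical values are placeholders, not tabulated values). The trace statistic for H0: rank <= r is -T times the sum of log(1 - lambda_i) over the eigenvalues beyond the r largest. Testing r = 0, 1, 2, ... in sequence, the chosen rank is the first r that is not rejected, and the cointegrating vectors are then the eigenvectors belonging to the r largest eigenvalues.

```python
import math


def trace_statistics(eigenvalues, nobs):
    """Trace statistic for H0: rank <= r, for r = 0..n-1.

    trace(r) = -T * sum_{i > r} log(1 - lambda_i), eigenvalues largest first.
    """
    lams = sorted(eigenvalues, reverse=True)
    return [-nobs * sum(math.log(1.0 - lam) for lam in lams[r:])
            for r in range(len(lams))]


def select_rank(trace_stats, critical_values):
    """Sequential testing: return the first r whose H0 (rank <= r) is not rejected."""
    for r, (stat, crit) in enumerate(zip(trace_stats, critical_values)):
        if stat < crit:
            return r
    return len(trace_stats)   # every H0 rejected: full rank


# Made-up eigenvalues and PLACEHOLDER critical values, for illustration only.
eigs = [0.30, 0.15, 0.02]
crit = [30.0, 15.0, 4.0]      # not real tabulated critical values
stats = trace_statistics(eigs, nobs=100)
rank = select_rank(stats, crit)
print([round(s, 2) for s in stats], rank)
```

With a stationary variable in the system, its contribution appears as one of the large eigenvalues, so no separate "+1" adjustment is applied after the test.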
Re: Model of the US Economy: CointegratedVARModelHandbook
TomDoan wrote: ↑Thu Oct 10, 2024 12:47 pm
> You're basing a forecasting model on what you gleaned from a 400+ page book which mentions "forecast" only a few times, mainly to indicate how certain models would create *bad* forecasts. What you are proposing to do seems like a really bad idea. If you are going to do that (strongly advise against), use the rank from the full data set, where you would have the most power.
Isn't that looking forward? - and in backtesting one should never look forward. Appreciated I am backtesting with data I have already seen, but I shouldn't know the rank from the full data set as I traverse through the loop, except for the last one.
Thanks.
And the eigenvalues would correspond to the ect*'s in their order e.g. from ECT.RPF
@johmle(lags=6,det=rc,vectors=cvectors)
# ftbs3 ftb12 fcm7
equation(coeffs=%xcol(cvectors,1)) ect1 *
# ftbs3 ftb12 fcm7 constant
equation(coeffs=%xcol(cvectors,2)) ect2 *
# ftbs3 ftb12 fcm7 constant
equation(coeffs=%xcol(cvectors,3)) ect3 *
# ftbs3 ftb12 fcm7 constant
ect1: 1st largest eigenvalue
ect2: 2nd largest eigenvalue
ect3: 3rd largest eigenvalue
(a) That's tricky for me, as I would have to change JohMLE.src (the implementation is complicated) to test explicitly where Trace > Trace-95% and pick that Rank as shown in the output column, plus 1. IMHO the 'significant' Rank should be reported as a line in the output and be available under Variables Defined, e.g. as %%COINTRANK. Please can you include this?
(b) In the main program, I could either have 6 IF statements (one per Rank: 5 variables + 1 for no cointegration) with 6 SYSTEMs, each with a different ect line, or find some way to paste just the ect line into one SYSTEM from a choice of 6 alternative ect lines, e.g. if Rank=4
ect ect1 ect2 ect3 ect4; * include the error correction terms
How would I do this in RATS?
Re: Model of the US Economy: CointegratedVARModelHandbook
@JOHMLE has an ECT option which defines a VECT[EQUATION] which can be used on the ECT instruction when you have (potentially) more than one cointegration vector. See the SHORTANDLONGVECM.RPF example.
Cointegration is a long-run property of the data. It is used to try to determine if there are long-run "stable" relationships among a set of series. In the case of the book you cite, looking at whether there is an identifiable "money demand" relationship. (Note that the deviation from such a relationship isn't zero; it's a stationary process, which can have considerable variance and considerable serial correlation which is why inference is difficult). Cointegration analysis is not designed to answer the question: what is the best guess of what GDP growth will be over the next year? And that is why the book doesn't mention forecasting in a meaningful way.
Needless to say, rolling sample estimation of a model which is not designed to forecast in the first place is a bad idea. And besides that, pre-test model selection (depending upon test result, choose between models A and B) is generally a bad idea as it makes the forecasts a highly discontinuous function of the data (small change to the data can flip the test result) which increases the chances of serious forecast errors. That's not as big an issue with a univariate model since e.g. different ARMA models can produce almost identical forecasts. With a multivariate model, that's a bigger deal; rank 2 vs rank 3 may have very significant differences in the relationship among the series.