MSSYSREGRESSION Procedures
@MSSysRegression sets up a Markov switching systems regression, with multiple dependent variables and identical explanatory variables in every equation. Use @MSRegression for univariate switching regressions. @MSSysRegression can also be used for vector autoregressions if the process mean doesn't switch; if the process mean does switch, you need to use @MSVARSetup instead. (In order for a process to have a fixed mean, it has to be self-contained, so a mean-switching model can't be handled in a general systems regression framework.)
You can choose among having coefficients switch (with fixed covariance matrix), covariance matrices switching (with fixed coefficients) and both switching. If the coefficients are switching, you can control which are allowed to switch and which are fixed using the NFIX option. All equations are given the same treatment. The symmetrical handling of the equations isn't really required for maximum likelihood estimation, but it makes estimation using the (much) more efficient EM quite a bit simpler and also greatly simplifies Gibbs sampling.
@MSSysRegression( options )
# list of dependent variables (if MODEL option isn't used)
# list of regressors (if MODEL and EQUATION options aren't used)
Parameters
depvar
dependent variable
Options
REGIMES=number of regimes[2]
Sets the number of regimes. (The older STATES option can be used instead, though REGIMES is preferred).
SWITCH=[C]/CH/H
This determines what switches among regimes. With SWITCH=C, the coefficients switch, but the covariance matrix is the same in all regimes. With SWITCH=CH, both the coefficients and the covariance matrix switch (together). With SWITCH=H, all the coefficients are fixed and only the covariance matrix switches.
NFIX=number of fixed coefficients [0]
If the coefficients switch among regimes, you can use NFIX to allow a certain number of them to be fixed instead. The regressor list needs to be arranged to have the fixed coefficients first.
MODEL=MODEL describing the system [not used]
This allows you to set up the model using (for instance) the SYSTEM instructions. Note that if you use the NFIX option, the variables in each equation in the MODEL have to have the fixed coefficients listed first, which may not be how the SYSTEM instructions order them. If that's the case, you may find it easier to use the EQUATION option for the regressors and use the first supplementary card for the list of dependent variables. (Note that, because the equations have to have an identical form, only the form of the first equation in the MODEL gets used.) Note also that you cannot use a MODEL with an error correction term (ECT).
EQUATION=EQUATION describing the regressors [not used]
This allows you to set up the model using (for instance) a list of dependent variables and a previously defined EQUATION. The EQUATION only provides the regressor list (thus replacing the second supplementary card). If you use the NFIX option, you need to make sure that the fixed coefficients are listed first.
Variables Defined
Everything defined by @MSSetup is also defined by @MSSysRegression, which includes it. These are the variables defined specifically by @MSSysRegression, for use in parameter sets for estimation. Not all of them will be in active use in a particular model. In particular, only one of SIGMA and SIGMAV will be used in a model: the first for a fixed covariance matrix, the second for regime-switching covariance matrices.
BETASYS
VECT[RECT] of the coefficients which switch among regimes. BETASYS(s) is the RECT of (switching) coefficients in regime s. Column j of BETASYS(s) gives the coefficients for equation j in regime s, that is, BETASYS(s)(i,j) is the coefficient on the ith switching variable in the jth equation in regime s.
GAMMASYS
RECT of fixed coefficients (will be zero dimension if there are none). GAMMASYS(i,j) is the ith fixed coefficient in the jth equation.
MSSYSREGEQN
EQUATION with the form of a typical equation in the model.
MSSYSREGMODEL
MODEL describing the equations and covariance matrix in a regime.
SIGMA
SYMMETRIC covariance matrix of residuals if the variance isn't switching.
SIGMAV
VECTOR[SYMMETRIC] of regime-specific covariance matrices if the variance is switching.
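The indexing conventions for BETASYS and GAMMASYS can be spelled out with a small sketch. It is written in Python rather than RATS, purely to make the subscripts concrete; the function name, the nested-list layout and the numbers are assumptions made for the illustration, and the indices are 0-based where RATS is 1-based.

```python
def fitted_values(x_fixed, x_switch, gammasys, betasys, regime):
    """Fitted value of each equation at one time period (illustrative only).

    x_fixed  : values of the fixed-coefficient regressors (listed first)
    x_switch : values of the switching regressors
    gammasys : gammasys[i][j] = i-th fixed coefficient in equation j
    betasys  : betasys[s][i][j] = coefficient on the i-th switching
               regressor in equation j in regime s
    """
    beta = betasys[regime]
    n_eq = len(beta[0])
    fitted = []
    for j in range(n_eq):
        value = sum(gammasys[i][j] * x_fixed[i] for i in range(len(x_fixed)))
        value += sum(beta[i][j] * x_switch[i] for i in range(len(x_switch)))
        fitted.append(value)
    return fitted


# Hypothetical numbers: one fixed regressor, one switching regressor,
# two equations, two regimes.
gammasys = [[0.5, -0.5]]
betasys = [[[1.0, 2.0]], [[3.0, 4.0]]]
print(fitted_values([2.0], [1.0], gammasys, betasys, regime=1))
```

Each column of the coefficient arrays belongs to one equation, which is why every equation must share the same regressor list.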
Examples (of Procedure Itself)
This sets up a Markov Switching VECM with four variables. ECT1 is a series formed from a linear combination of the levels of the four series (LOGM1, LOGY, RD and RB), so each equation regresses a difference on lagged differences plus the lagged error correction term. There are two regimes; the coefficients switch, the covariance matrix doesn't.
set dm1 = logm1-logm1{1}
set dy = logy-logy{1}
set drd = rd-rd{1}
set drb = rb-rb{1}
@MSSysRegression(regimes=2,switch=c)
# dm1 dy drd drb
# ect1{1} dm1{1} dy{1} drd{1} drb{1}
This is a VAR with three variables, three lags and CONSTANT, two regimes, with all coefficients and the covariance matrix switching.
system(model=varmodel)
variables logcutil logcpi logpoil
lags 1 to 3
det constant
end(system)
*
@mssysregression(model=varmodel,regimes=2,switch=ch)
Procedures and Functions
@MSSysRegInitial(guessregimes=SERIES[INTEGER]) start end
by default, computes a "standard" set of initial guess values for the parameters. Since it's not clear what you expect as the differences among regimes, everything is copied out from a common multivariate regression. You'll either have to adjust some parameters, or you can use the GUESSREGIMES option to input a SERIES[INTEGER] (with values 1, 2, ..., number of regimes) with "guesses" for which entries are in which regimes, and the guess values will be generated based upon systems regressions over the subsamples.
If you leave off the start and/or end parameters, the maximum available range given the series and lags involved will be used.
@MSSysRegParmset(parmset=PARMSET to define)
defines a PARMSET with the free parameters for the switching regression model (all except the transition probability parameters).
%MSSysRegFVec(time)
returns the vector of likelihoods (not in log form) for the (expanded) states at time.
%MSSysRegProb(time)
returns the likelihood (not logged) of the model at time for the current set of parameters. As a side effect, it computes pt_t1(time) and pt_t(time).
%MSSysRegInit()
does the calculations needed at the start of each function evaluation
%MSSysRegInitTransition()
for a model with fixed transitions, expands the transition matrix if required and scans for negative probabilities. Returns 1 if the probabilities are all non-negative, and 0 if not.
@MSSysRegStdResids start end resids
computes a VECT[SERIES] of one-step standardized residuals for diagnostic purposes.
@MSSysRegResids(regimes=SERIES[INT] of regime,specific=single regime) start end
computes a VECT[SERIES] of regime-specific residuals for computational purposes. (Note, these have no direct diagnostic value—use @MSSysRegStdResids for that). You can either get the residuals for a set of time-varying regime values, using the REGIMES option, which takes values 1, 2, ... number of regimes, or can get all residuals for one specific setting for the regime across the entire sample by using the SPECIFIC option (which again takes a single value from 1, 2, ... number of regimes).
@MSSysRegSetModel(regime=specific regime)
fills the MSSYSREGMODEL variable (a MODEL) with the coefficients and covariance matrix for the chosen regime (which takes values 1, 2, ..., number of regimes).
For EM algorithm
@MSSysRegEMGeneralSetup
needs to be called before using EM to set up the work arrays
@MSSysRegEMStep gstart gend
@MSSysRegEStep( options ) gstart gend
@MSSysRegMStep( options ) gstart gend
@MSSysRegEMStep does the combined E and M steps, @MSSysRegEStep does only the E, and @MSSysRegMStep does only the M. @MSSysRegEMStep and @MSSysRegMStep both have options to restrict the set of parameters that are updated, so some can be fixed.
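For orientation, the M step for the coefficients and covariance matrices can be written as a probability-weighted regression. The formulas below are the textbook form for the case where everything switches (SWITCH=CH), not a transcription of the procedure's source. The notation is introduced here for illustration: $\hat p_{t,s}$ is the smoothed probability of regime $s$ at time $t$ from the E step, $x_t$ is the vector of regressors, $y_t$ is the vector of dependent variables, and $B_s$ is arranged like BETASYS(s), with one column per equation.

```latex
\hat B_s = \Bigl(\sum_t \hat p_{t,s}\, x_t x_t'\Bigr)^{-1}
           \Bigl(\sum_t \hat p_{t,s}\, x_t y_t'\Bigr),
\qquad
\hat\Sigma_s = \frac{\sum_t \hat p_{t,s}\, \hat u_{t,s}\hat u_{t,s}'}
                    {\sum_t \hat p_{t,s}},
\qquad
\hat u_{t,s} = y_t - \hat B_s' x_t .
```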
For Gibbs sampling
@MSSysRegResids( options ) vresids gstart gend
returns a VECT[SERIES] of residuals for a specific regime. These are not useful for diagnostic purposes—use @MSSysRegStdResids for that.
@MSSysRegRelabel swaps
reorders all switching components based upon the index array swaps.
Example
This is from Ehrmann, Ellison and Valla (2003), which does a 3-variable VAR aimed at studying changes to oil price variances. @VARLagSelect with the Hannan-Quinn criterion selects 3 lags, so the @MSSysRegression is set up with two regimes, three lags of each variable plus a CONSTANT, with all coefficients plus covariances switching:
@varlagselect(Lags=6,crit=hq)
# logcutil logcpi logpoil
*
system(model=varmodel)
variables logcutil logcpi logpoil
lags 1 to 3
det constant
end(system)
@mssysregression(model=varmodel,regimes=2,switch=ch)
Note that while the intention is to get regimes to reflect differences in the variance of oil price shocks, there is nothing in the model to force that to be the case. There are 30 regression coefficients and 6 free parameters in the covariance matrix in each regime and changes to any combination of them could end up producing a likelihood-maximizer in a Markov Switching model. Fortunately, in practice, you're more likely to get changes in variance to be the global mode than coefficient changes, so starting out with the goal of finding variance switches is more promising than starting out hoping for interpretable coefficient switches.
This estimates the fixed VAR and sets the estimation range based upon it:
estimate(resids=varres)
*
compute gstart=%regstart(),gend=%regend()
In total, this model has 74 free parameters (30 regression coefficients and 6 covariance matrix parameters in each regime, plus 2 governing the transition probabilities). That's quite a few parameters to estimate using "variational" methods (such as BFGS) which basically treat the function like a "black box". A Markov Switching system regression in fact has quite a bit of structure, and the most efficient way to estimate it is to use EM (Expectation-Maximization). The implementation of EM for this type of model allows the vast majority of the parameters to be estimated using a probability-weighted linear systems regression. While this has to be redone with each iteration, this is substantially more efficient than almost blindly trying to play with the individual parameters.
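The parameter count can be checked with a few lines of arithmetic. The sketch below is in Python rather than RATS and is only a bookkeeping aid; the function name is hypothetical, and it covers only the case where all coefficients and the covariance matrices switch (SWITCH=CH with no NFIX).

```python
def free_parameters_ch(n_equations, n_regressors, n_regimes):
    """Free parameters when all coefficients and covariances switch."""
    # One coefficient per regressor, per equation, per regime.
    coefficients = n_regimes * n_equations * n_regressors
    # A symmetric n x n covariance matrix has n(n+1)/2 free elements.
    covariances = n_regimes * n_equations * (n_equations + 1) // 2
    # Each column of the transition matrix sums to one, so one row is implied.
    transitions = n_regimes * (n_regimes - 1)
    return coefficients + covariances + transitions


# 3 equations, 3 lags of 3 variables plus CONSTANT = 10 regressors, 2 regimes.
print(free_parameters_ch(n_equations=3, n_regressors=10, n_regimes=2))  # 74
```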
This sets up the parameter set using the variables constructed by @MSSysRegression. BETASYS and SIGMAV are specific to the systems regression (BETASYS for the regression coefficients and SIGMAV for the covariance matrices), while P is standard to any of the Markov Switching procedures.
nonlin(parmset=regparms) betasys sigmav
nonlin(parmset=msparms) p
The next step "guesses" the regimes to be 1 and 2 based upon values of standardized residuals for oil (3rd variable) in the VAR and uses that to get guess values for the parameters. The entries with the smaller residuals are put into guess regime number 1, and the larger ones into guess regime number 2. This is only used for producing the initial values of the regression coefficients and (more important) covariance matrices. There's no guarantee that the optimum will have regimes that break based upon the oil price shock variances, but this gives you a good chance of finding the optimum if that does turn out to be the case. Note: MSRegime is a SERIES[INTEGER] created by all of the MS procedures, which is why a GSET is used rather than SET. The regimes need to be numbered starting at 1.
gset MSRegime = %if(abs(varres(3))/sqrt(%sigma(3,3))<1.0,1,2)
@MSSysRegInitial(guess=MSRegime) gstart gend
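The classification rule in the GSET is simple enough to restate outside RATS. The Python sketch below mirrors it with made-up residuals; the function name, the threshold argument and the data are assumptions for the illustration only.

```python
import math


def guess_regimes(residuals, variance, threshold=1.0):
    """Regime 1 if |residual| is under `threshold` standard deviations, else 2."""
    scale = math.sqrt(variance)
    return [1 if abs(e) / scale < threshold else 2 for e in residuals]


oil_residuals = [0.5, -3.0, 1.9, -0.2, 4.1]  # made-up values
print(guess_regimes(oil_residuals, variance=4.0))  # [1, 2, 1, 1, 2]
```

Entries within one standard deviation go to guess regime 1 (low variance) and the rest to guess regime 2 (high variance), which is the split that @MSSysRegInitial then uses for its subsample regressions.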
The estimation with EM is done fairly easily with:
@MSSysRegEMGeneralSetup
do emits=1,50
@MSSysRegEMStep gstart gend
disp "Iteration" emits "Log likelihood" %logl
end do emits
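The loop above runs a fixed 50 iterations. To see how quickly EM settles, the Python sketch below scans the first eleven log likelihoods from the trace in the Output section for the first iteration at which the change drops below a tolerance. The helper name and the tolerance of 1e-4 are arbitrary choices made for the illustration.

```python
# First eleven log likelihoods from the EM trace in the Output section.
trace = [-1260.92853, -1198.13645, -1169.06832, -1159.42407, -1154.24922,
         -1153.38713, -1153.31442, -1153.31141, -1153.31141, -1153.31147,
         -1153.31149]


def first_converged(logls, tol=1e-4):
    """1-based iteration at which |change in log likelihood| first falls below tol."""
    for k in range(1, len(logls)):
        if abs(logls[k] - logls[k - 1]) < tol:
            return k + 1
    return None


print(first_converged(trace))  # 9
```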
Unfortunately, while this rather quickly gets to an optimum (and an optimum with a large difference in the oil price shock variance between the two regimes), EM, by its nature, doesn't give standard error estimates for the parameters. The recommended procedure is to then "polish" the estimates with 10 iterations of BHHH. You don't want to use BFGS here because the parameters are already converged, so BFGS won't be able to get a good estimate of the curvature (to get standard errors).
The first step truncates the "P" matrix to remove the bottom row since that's the form used for ML estimation. (EM uses the whole matrix, since it just computes that in one part of a "maximization" step). At each time period, %MSSysRegFVec computes the VECTOR of (non-logged) likelihoods, %MSProb does all the Bayes' formula updates given that and returns the (again, non-logged) likelihood. The LOGL FRML then returns the log of that value as the function value for the entry.
The MAXIMIZE uses a START option to let %MSSysRegInit() compute the ergodic (stationary) probabilities and puts those into PSTAR, which is used by the MS procedures to hold the current "filtered" probabilities of the regimes.
compute p=%xsubmat(p,1,nregimes-1,1,nregimes)
frml logl = f=%MSSysRegFVec(t),fpt=%MSProb(t,f),log(fpt)
maximize(start=(pstar=%MSSysRegInit()),parmset=regparms+msparms,$
method=bhhh,iters=10) logl gstart gend
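The calculations behind the LOGL formula can be sketched in a few lines. The Python code below is a generic version of the standard filter recursion, not the source of %MSProb or %MSSysRegInit. The function names and numbers are made up, and it assumes that P[i][j] is the probability of moving to regime i from regime j, so each column sums to one and the bottom row can be rebuilt from the others.

```python
import math


def restore_bottom_row(p_truncated):
    """Rebuild the full transition matrix from the form without its bottom row."""
    n_cols = len(p_truncated[0])
    bottom = [1.0 - sum(row[j] for row in p_truncated) for j in range(n_cols)]
    return [list(row) for row in p_truncated] + [bottom]


def ergodic_probabilities(P, iterations=1000):
    """Stationary regime probabilities, found by repeated multiplication."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(P[i][j] * pi[j] for j in range(n)) for i in range(n)]
    return pi


def filter_step(P, prior, f):
    """One Bayes update: returns (likelihood, filtered probabilities)."""
    n = len(P)
    predicted = [sum(P[i][j] * prior[j] for j in range(n)) for i in range(n)]
    joint = [f[i] * predicted[i] for i in range(n)]
    likelihood = sum(joint)
    return likelihood, [x / likelihood for x in joint]


P = restore_bottom_row([[0.97, 0.02]])       # made-up values, not the estimates
start = ergodic_probabilities(P)             # role played by PSTAR at the start
likelihood, filtered = filter_step(P, start, f=[0.5, 0.1])
log_contribution = math.log(likelihood)      # what the LOGL formula returns at t
```

The vector f plays the role of the (non-logged) regime likelihoods at one time period, and the filtered probabilities become the prior for the next period.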
Output
This shows the output from the EM step (which just traces the log likelihoods) and from the BHHH final estimates. Note that the BHHH intentionally doesn't converge—there's no real advantage to that since (unlike BFGS) the covariance matrix won't really change much from iteration to iteration. Also note that the BHHH log likelihood is slightly better (in this case) than the EM log likelihood. EM actually computes the log likelihood slightly differently (because of how the initial values for the regime probabilities are done) so there will always be a very slight difference, which can go in either direction.
Iteration 1 Log likelihood -1260.92853
Iteration 2 Log likelihood -1198.13645
Iteration 3 Log likelihood -1169.06832
Iteration 4 Log likelihood -1159.42407
Iteration 5 Log likelihood -1154.24922
Iteration 6 Log likelihood -1153.38713
Iteration 7 Log likelihood -1153.31442
Iteration 8 Log likelihood -1153.31141
Iteration 9 Log likelihood -1153.31141
Iteration 10 Log likelihood -1153.31147
Iteration 11 Log likelihood -1153.31149
...
Iteration 49 Log likelihood -1153.31151
Iteration 50 Log likelihood -1153.31151
MAXIMIZE - Estimation by BHHH
NO CONVERGENCE IN 10 ITERATIONS
LAST CRITERION WAS 0.0000000
Monthly Data From 1973:04 To 2000:12
Usable Observations 333
Function Value -1153.2721
Variable Coeff Std Error T-Stat Signif
************************************************************************************
1. BETASYS(1)(1,1) 1.3025018 0.0966859 13.47147 0.00000000
2. BETASYS(1)(2,1) -0.1184801 0.1551684 -0.76356 0.44513065
3. BETASYS(1)(3,1) -0.2301416 0.0971064 -2.36999 0.01778834
4. BETASYS(1)(4,1) -0.0712861 0.4861219 -0.14664 0.88341426
5. BETASYS(1)(5,1) -0.3441576 0.7052440 -0.48800 0.62555132
6. BETASYS(1)(6,1) 0.4326661 0.4004606 1.08042 0.27995472
7. BETASYS(1)(7,1) 0.0322534 0.0712970 0.45238 0.65099485
8. BETASYS(1)(8,1) -0.0128712 0.1151107 -0.11182 0.91096926
9. BETASYS(1)(9,1) -0.0303329 0.0685360 -0.44258 0.65806736
10. BETASYS(1)(10,1) 16.1171711 12.1435052 1.32723 0.18443406
11. BETASYS(1)(1,2) -0.0042298 0.0381574 -0.11085 0.91173330
12. BETASYS(1)(2,2) 0.0567823 0.0561936 1.01048 0.31226685
13. BETASYS(1)(3,2) -0.0299722 0.0281244 -1.06570 0.28655914
14. BETASYS(1)(4,2) 1.1110915 0.1226603 9.05828 0.00000000
15. BETASYS(1)(5,2) 0.0076857 0.1463613 0.05251 0.95812071
16. BETASYS(1)(6,2) -0.1331041 0.1033310 -1.28813 0.19769981
17. BETASYS(1)(7,2) 0.0203414 0.0172547 1.17889 0.23844077
18. BETASYS(1)(8,2) -0.0069076 0.0273464 -0.25260 0.80058112
19. BETASYS(1)(9,2) -0.0056940 0.0152100 -0.37436 0.70813872
20. BETASYS(1)(10,2) -5.5079240 3.8261806 -1.43954 0.14999877
21. BETASYS(1)(1,3) -0.1359305 0.2128054 -0.63876 0.52298229
22. BETASYS(1)(2,3) 0.1406804 0.3341148 0.42105 0.67371568
23. BETASYS(1)(3,3) -0.0018366 0.2046396 -0.00897 0.99283914
24. BETASYS(1)(4,3) 1.9101624 0.9360367 2.04069 0.04128147
25. BETASYS(1)(5,3) -1.4219866 1.0299190 -1.38068 0.16737795
26. BETASYS(1)(6,3) -0.4268509 0.8357095 -0.51076 0.60951587
27. BETASYS(1)(7,3) 1.1247632 0.1221871 9.20525 0.00000000
28. BETASYS(1)(8,3) -0.0311582 0.1915569 -0.16266 0.87078804
29. BETASYS(1)(9,3) -0.1288451 0.1205220 -1.06906 0.28504318
30. BETASYS(1)(10,3) -18.4432940 29.8566941 -0.61773 0.53675513
31. BETASYS(2)(1,1) 1.0099719 0.0786500 12.84135 0.00000000
32. BETASYS(2)(2,1) 0.0757769 0.1184549 0.63971 0.52236023
33. BETASYS(2)(3,1) -0.1289917 0.0815790 -1.58119 0.11383525
34. BETASYS(2)(4,1) 0.4351118 0.4084667 1.06523 0.28677098
35. BETASYS(2)(5,1) -1.6477606 0.6487385 -2.53995 0.01108697
36. BETASYS(2)(6,1) 1.2104123 0.5067981 2.38835 0.01692412
37. BETASYS(2)(7,1) 0.0047449 0.0053218 0.89160 0.37260985
38. BETASYS(2)(8,1) -0.0151109 0.0080892 -1.86803 0.06175851
39. BETASYS(2)(9,1) 0.0078826 0.0049171 1.60308 0.10891595
40. BETASYS(2)(10,1) 21.0452145 9.6384022 2.18348 0.02900082
41. BETASYS(2)(1,2) -0.0058633 0.0153751 -0.38135 0.70294499
42. BETASYS(2)(2,2) -0.0109034 0.0199546 -0.54641 0.58478179
43. BETASYS(2)(3,2) 0.0260421 0.0131540 1.97979 0.04772704
44. BETASYS(2)(4,2) 1.0624433 0.0709570 14.97306 0.00000000
45. BETASYS(2)(5,2) -0.1426209 0.0974345 -1.46376 0.14325911
46. BETASYS(2)(6,2) 0.0752892 0.0548892 1.37166 0.17017027
47. BETASYS(2)(7,2) 0.0017384 0.0008791 1.97738 0.04799906
48. BETASYS(2)(8,2) -0.0018626 0.0013142 -1.41722 0.15641910
49. BETASYS(2)(9,2) 0.0008719 0.0007719 1.12957 0.25865920
50. BETASYS(2)(10,2) -1.5828601 1.1938910 -1.32580 0.18490611
51. BETASYS(2)(1,3) 1.5432515 1.5792307 0.97722 0.32846162
52. BETASYS(2)(2,3) -0.8768512 2.0595459 -0.42575 0.67029017
53. BETASYS(2)(3,3) -0.8624865 1.2961518 -0.66542 0.50578135
54. BETASYS(2)(4,3) -0.1045067 7.4597585 -0.01401 0.98882248
55. BETASYS(2)(5,3) 19.3910528 11.4624206 1.69171 0.09070193
56. BETASYS(2)(6,3) -19.1875197 6.9773537 -2.74997 0.00596006
57. BETASYS(2)(7,3) 1.0983810 0.0702989 15.62444 0.00000000
58. BETASYS(2)(8,3) -0.1761646 0.0917278 -1.92051 0.05479306
59. BETASYS(2)(9,3) -0.0439301 0.0641593 -0.68470 0.49353099
60. BETASYS(2)(10,3) 66.8447170 179.1300609 0.37316 0.70902711
61. SIGMAV(1)(1,1) 0.7747708 0.0905635 8.55500 0.00000000
62. SIGMAV(1)(2,1) 0.0335175 0.0315907 1.06099 0.28869338
63. SIGMAV(1)(2,2) 0.0539895 0.0070497 7.65846 0.00000000
64. SIGMAV(1)(3,1) 0.2189912 0.1601335 1.36755 0.17145185
65. SIGMAV(1)(3,2) 0.0585223 0.0498984 1.17283 0.24086406
66. SIGMAV(1)(3,3) 3.2731541 0.3523026 9.29075 0.00000000
67. SIGMAV(2)(1,1) 0.2857325 0.0351317 8.13317 0.00000000
68. SIGMAV(2)(2,1) -0.0011356 0.0041482 -0.27374 0.78428095
69. SIGMAV(2)(2,2) 0.0079367 0.0009470 8.38100 0.00000000
70. SIGMAV(2)(3,1) -0.3336294 0.4603887 -0.72467 0.46865512
71. SIGMAV(2)(3,2) 0.1333017 0.0694924 1.91822 0.05508309
72. SIGMAV(2)(3,3) 93.4450870 9.5924869 9.74149 0.00000000
73. P(1,1) 0.9732715 0.0163812 59.41402 0.00000000
74. P(1,2) 0.0176582 0.0190685 0.92604 0.35442354
Copyright © 2025 Thomas A. Doan