MSSYSREGRESSION Procedures

@MSSysRegression sets up a Markov switching systems regression, with multiple dependent variables and identical explanatory variables in each equation. Use @MSRegression instead for univariate switching regressions. @MSSysRegression can also be used for vector autoregressions if the process mean doesn't switch; if the process mean does switch, you need to use @MSVARSetup instead. (In order for a process to have a fixed mean, it has to be self-contained, so a mean-switching model can't be handled in a general systems regression framework.)

You can choose among having coefficients switch (with fixed covariance matrix), covariance matrices switching (with fixed coefficients) and both switching. If the coefficients are switching, you can control which are allowed to switch and which are fixed using the NFIX option. All equations are given the same treatment. The symmetrical handling of the equations isn't really required for maximum likelihood estimation, but it makes estimation using the (much) more efficient EM quite a bit simpler and also greatly simplifies Gibbs sampling.

@MSSysRegression( options )

# list of dependent variables (if MODEL option isn't used)

# list of regressors (if MODEL and EQUATION options aren't used)

Parameters

depvar

dependent variable

Options

REGIMES=number of regimes [2]

Sets the number of regimes. (The older STATES option can be used instead, though REGIMES is preferred).

SWITCH=[C]/CH/H

This determines what switches among regimes. With SWITCH=C, the coefficients switch, but the error covariance matrix is the same in all regimes. With SWITCH=CH, both the coefficients and the covariance matrices switch (together). With SWITCH=H, all the coefficients are fixed and only the covariance matrices switch.

NFIX=number of fixed coefficients [0]

If the coefficients switch among regimes, you can use NFIX to allow a certain number of them to be fixed instead. The regressor list needs to be arranged to have the fixed coefficients first.
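As an illustration of the ordering requirement, the following sketch keeps the CONSTANT fixed across regimes while the other coefficients switch. The series names Y1, Y2, X1 and X2 are hypothetical placeholders, not names defined by the procedure.

```rats
* Hypothetical series: Y1, Y2 are dependent variables, X1, X2 are regressors.
* NFIX=1 fixes the coefficient on the first listed regressor (CONSTANT);
* the coefficients on X1 and X2 switch between the two regimes.
@MSSysRegression(regimes=2,switch=c,nfix=1)
# y1 y2
# constant x1 x2
```

Because the fixed coefficients are identified only by position, moving CONSTANT to the end of the list would fix the coefficient on X1 instead.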

MODEL=MODEL describing the system [not used]

This allows you to set up the model using (for instance) the SYSTEM instructions. Note that if you use the NFIX option, the variables in each equation in the MODEL have to have the fixed coefficients listed first, which may not be how the SYSTEM instructions order them. If that's the case, you may find it easier to use the EQUATION option for the regressors and use the first supplementary card for the list of dependent variables. (Note that, because the equations have to have an identical form, only the form of the first equation in the MODEL gets used.) Note also that you cannot use a MODEL with an error correction term (ECT).

EQUATION=EQUATION describing the regressors [not used]

This allows you to set up the model using (for instance) a list of dependent variables and a previously defined EQUATION. The EQUATION only provides the regressor list (thus replacing the second supplementary card). If you use the NFIX option, you need to make sure that the fixed coefficients are listed first.
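A minimal sketch of this combination follows, assuming hypothetical series Y1, Y2 and X1 and an illustrative equation name REGEQN. The dependent variable named on the EQUATION instruction is there only to complete that instruction; per the description above, only the regressor list is taken from it.

```rats
* Hypothetical names: REGEQN, Y1, Y2, X1.
* The EQUATION supplies only the regressor list, with the fixed
* coefficient (CONSTANT) listed first to match NFIX=1.
equation regeqn y1
# constant x1{1 to 2}
*
* Only the first supplementary card (dependent variables) is needed.
@MSSysRegression(regimes=2,switch=c,nfix=1,equation=regeqn)
# y1 y2
```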

Variables Defined

Everything defined by @MSSetup is also defined by @MSSysRegression, which includes it. These are the variables defined specifically by @MSSysRegression, for use in parameter sets for estimation. Not all of them will be in active use in a particular model. In particular, only one of SIGMA and SIGMAV will be used in a model: the first for a fixed covariance matrix, the second for regime-switching covariance matrices.

BETASYS	VECT[RECT] of the coefficients which switch among regimes. BETASYS(s) is the RECT of (switching) coefficients in regime s. Column j of BETASYS(s) gives the coefficients for equation j in regime s, that is, BETASYS(s)(i,j) is the coefficient on the ith switching variable in the jth equation in regime s.
GAMMASYS	RECT of fixed coefficients (will be zero dimension if there are none). GAMMASYS(i,j) is the ith fixed coefficient in the jth equation.
MSSYSREGEQN	EQUATION with the form of a typical equation in the model.
MSSYSREGMODEL	MODEL describing the equations and covariance matrix in a regime.
SIGMA	covariance matrix of residuals if the variance isn't switching (SYMMETRIC)
SIGMAV	VECTOR[SYMMETRIC] of regime-specific covariance matrices if the variance is switching.
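In equation form, the variables above fit together roughly as follows. This is a sketch of the model implied by the index conventions just described, assuming Gaussian errors; the symbols y_t (dependent variables), z_t (fixed-coefficient regressors), x_t (switching regressors) and S_t (regime at time t) are notation introduced here for illustration.

```latex
% \Gamma   = GAMMASYS    (fixed regressors  x equations)
% B(s)     = BETASYS(s)  (switching regressors x equations)
% \Sigma   = SIGMA       (SWITCH=C), or \Sigma(s) = SIGMAV(s) (SWITCH=CH or H)
\begin{aligned}
y_t' &= z_t'\,\Gamma + x_t'\,B(S_t) + u_t', \\
u_t \mid S_t = s &\sim N\bigl(0,\ \Sigma(s)\bigr),
\qquad \Sigma(s) \equiv \Sigma \ \text{ if the covariance matrix does not switch.}
\end{aligned}
```

With SWITCH=H there are no switching regressors, so all coefficients sit in the fixed block and only the covariance matrix depends on the regime.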

Examples (of Procedure Itself)

This sets up a Markov switching VECM with four variables. ECT1 is a series formed from a linear combination of the levels of the four series (LOGM1, LOGY, RD and RB), so each equation is a regression of differences on lagged differences plus the lagged error correction term. There are two regimes; the coefficients switch, the covariance matrix doesn't.

set dm1 = logm1-logm1{1}

set dy = logy-logy{1}

set drd = rd-rd{1}

set drb = rb-rb{1}

@MSSysRegression(regimes=2,switch=c)

# dm1 dy drd drb

# ect1{1} dm1{1} dy{1} drd{1} drb{1}

This is a VAR with three variables, three lags and CONSTANT, two regimes, with all coefficients and the covariance matrix switching.

system(model=varmodel)

variables logcutil logcpi logpoil

lags 1 to 3

det constant

end(system)

@mssysregression(model=varmodel,regimes=2,switch=ch)

Procedures and Functions

@MSSysRegInitial(guessregimes=SERIES[INTEGER]) start end

by default, computes a "standard" set of initial guess values for the parameters. Since it's not clear what you expect as the differences among regimes, everything is copied out from a common multivariate regression. You'll either have to adjust some parameters, or you can use the GUESSREGIMES option to input a SERIES[INTEGER] (with values 1, 2, ..., number of regimes) with "guesses" for which entries are in which regimes, and the guess values will be generated based upon systems regressions over the subsamples.

If you leave off the start and/or end parameters, the maximum available range given the series and lags involved will be used.

@MSSysRegParmset(parmset=PARMSET to define)

defines a PARMSET with the free parameters for the switching regression model (all except the transition probability parameters).

%MSSysRegFVec(time)

returns the vector of likelihoods (not in log form) for the (expanded) states at time.

%MSSysRegProb(time)

returns the likelihood (not logged) of the model at time for the current set of parameters. As a side effect, it computes pt_t1(time) and pt_t(time).

%MSSysRegInit()

does the calculations needed at the start of each function evaluation

%MSSysRegInitTransition()

for a model with fixed transitions, expands the transition matrix if required and scans for negative probabilities. Returns 1 if the probabilities are all non-negative, and 0 if not.

@MSSysRegStdResids start end resids

computes a VECT[SERIES] of one-step standardized residuals for diagnostic purposes.

@MSSysRegResids(regimes=SERIES[INT] of regime,specific=single regime) start end

computes a VECT[SERIES] of regime-specific residuals for computational purposes. (Note, these have no direct diagnostic value—use @MSSysRegStdResids for that). You can either get the residuals for a set of time-varying regime values, using the REGIMES option, which takes values 1, 2, ... number of regimes, or can get all residuals for one specific setting for the regime across the entire sample by using the SPECIFIC option (which again takes a single value from 1, 2, ... number of regimes).

@MSSysRegSetModel(regime=specific regime)

fills the MSSysRegModel MODEL variable with the coefficients and covariance matrix for the chosen regime (which takes values 1, 2, ..., number of regimes).
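One possible use, sketched here under assumptions, is computing responses conditional on staying in a single regime. The sketch assumes the model has already been set up and estimated, and that the IMPULSE instruction picks up the covariance matrix stored with the model; the 12-step horizon is arbitrary.

```rats
* Load the regime 1 coefficients and covariance matrix into MSSysRegModel,
* then compute impulse responses as if the process stayed in regime 1.
@MSSysRegSetModel(regime=1)
impulse(model=MSSysRegModel,steps=12)
*
* Repeat for regime 2 to compare.
@MSSysRegSetModel(regime=2)
impulse(model=MSSysRegModel,steps=12)
```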

For EM algorithm

@MSSysRegEMGeneralSetup

needs to be called before using EM to set up the work arrays

@MSSysRegEMStep gstart gend

@MSSysRegEStep( options ) gstart gend

@MSSysRegMStep( options ) gstart gend

@MSSysRegEMStep does the combined E and M steps, @MSSysRegEStep does only the E, and @MSSysRegMStep does only the M. @MSSysRegEMStep and @MSSysRegMStep both have options to restrict the set of parameters that are updated, so some can be fixed.
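If you want to do something between the two halves of an iteration, the combined step can be split. The sketch below assumes, based on the description above, that calling the E step and then the M step with default options is equivalent to one call to @MSSysRegEMStep; GSTART and GEND are assumed to hold the estimation range.

```rats
* Set up the EM work arrays once, then alternate E and M steps.
@MSSysRegEMGeneralSetup
do emits=1,50
   @MSSysRegEStep gstart gend
   * (inspect or adjust the results of the E step here if needed)
   @MSSysRegMStep gstart gend
end do emits
```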

For Gibbs sampling

@MSSysRegResids( options ) vresids gstart gend

returns a VECT[SERIES] of residuals for a specific regime. These are not useful for diagnostic purposes—use @MSSysRegStdResids for that.

@MSSysRegRelabel swaps

reorders all switching components based upon the index array swaps.

Example

This is from Ehrmann, Ellison and Valla (2003), which uses a three-variable VAR aimed at studying changes to oil price variances. @VARLagSelect with the Hannan-Quinn criterion selects 3 lags, so the @MSSysRegression is set up with two regimes, three lags of each variable plus a CONSTANT, with all coefficients plus covariances switching:

@varlagselect(Lags=6,crit=hq)

# logcutil logcpi logpoil

system(model=varmodel)

variables logcutil logcpi logpoil

lags 1 to 3

det constant

end(system)

@mssysregression(model=varmodel,regimes=2,switch=ch)

Note that while the intention is to get regimes to reflect differences in the variance of oil price shocks, there is nothing in the model to force that to be the case. There are 30 regression coefficients and 6 free parameters in the covariance matrix in each regime and changes to any combination of them could end up producing a likelihood-maximizer in a Markov Switching model. Fortunately, in practice, you're more likely to get changes in variance to be the global mode than coefficient changes, so starting out with the goal of finding variance switches is more promising than starting out hoping for interpretable coefficient switches.

This estimates the fixed VAR and sets the estimation range based upon it:

estimate(resids=varres)

compute gstart=%regstart(),gend=%regend()

In total, this model has 74 free parameters (30 regression coefficients and 6 covariance matrix parameters in each regime, plus 2 governing the transition probabilities). That's quite a few parameters to estimate using "variational" methods (such as BFGS) which basically treat the function like a "black box". A Markov switching systems regression in fact has quite a bit of structure, and the most efficient way to estimate it is to use EM (Expectation-Maximization). The implementation of EM for this type of model allows the vast majority of the parameters to be estimated using a probability-weighted linear systems regression. While this has to be re-done with each iteration, this is substantially more efficient than almost blindly trying to play with the individual parameters.
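The count can be checked directly. The symbols n (number of equations), p (number of lags), k (regressors per equation) and M (number of regimes) are notation introduced here for the arithmetic.

```latex
% n = 3 equations, p = 3 lags, M = 2 regimes
\begin{aligned}
k &= n p + 1 = 3 \cdot 3 + 1 = 10 && \text{regressors per equation (lags plus CONSTANT)} \\
n k &= 3 \cdot 10 = 30 && \text{regression coefficients per regime} \\
\tfrac{n(n+1)}{2} &= \tfrac{3 \cdot 4}{2} = 6 && \text{free covariance parameters per regime} \\
M(M-1) &= 2 \cdot 1 = 2 && \text{free transition probabilities} \\
\text{total} &= M\left(nk + \tfrac{n(n+1)}{2}\right) + M(M-1) = 2 \cdot 36 + 2 = 74
\end{aligned}
```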

This sets up the parameter set using the variables constructed by @MSSysRegression. BETASYS and SIGMAV are specific to the systems regression (BETASYS for the regression coefficients and SIGMAV for the covariance matrices), while P is standard to any of the Markov Switching procedures.

nonlin(parmset=regparms) betasys sigmav

nonlin(parmset=msparms) p

The next step "guesses" the regimes to be 1 and 2 based upon the values of the standardized residuals for oil (the 3rd variable) in the VAR and uses that to get guess values for the parameters. The entries with the smaller residuals are put into guess regime number 1, and those with the larger ones into guess regime number 2. This is only used for producing the initial values of the regression coefficients and (more important) covariance matrices. There's no guarantee that the optimum will have regimes that break based upon the oil price shock variances, but this gives you a good chance of finding the optimum if that does turn out to be the case. Note: MSRegime is a SERIES[INTEGER] created by all of the MS procedures, which is why a GSET is used rather than SET. The regimes need to be numbered starting at 1.

gset MSRegime = %if(abs(varres(3))/sqrt(%sigma(3,3))<1.0,1,2)

@MSSysRegInitial(guess=MSRegime) gstart gend

The estimation with EM is done fairly easily with:

@MSSysRegEMGeneralSetup

do emits=1,50

@MSSysRegEMStep gstart gend

disp "Iteration" emits "Log likelihood" %logl

end do emits

Unfortunately, while this rather quickly gets to an optimum (and an optimum with a large difference in the oil price shock variance between the two regimes), EM, by its nature, doesn't give standard error estimates for the parameters. The recommended procedure is to then "polish" the estimates with 10 iterations of BHHH. You don't want to use BFGS here because the parameters are already converged, so BFGS won't be able to get a good estimate of the curvature (to get standard errors).

The first step truncates the "P" matrix to remove the bottom row since that's the form used for ML estimation. (EM uses the whole matrix, since it just computes that in one part of a "maximization" step.) At each time period, %MSSysRegFVec computes the VECTOR of (non-logged) likelihoods, and %MSProb does all the Bayes' formula updates given that vector and returns the (again, non-logged) likelihood. The LOGL FRML then returns the log of that value as the function value for the entry.

The MAXIMIZE uses a START option to have %MSSysRegInit() compute the ergodic (stationary) probabilities and put those into PSTAR, which is used by the MS procedures to hold the current "filtered" probabilities of the regimes.

compute p=%xsubmat(p,1,nregimes-1,1,nregimes)

frml logl = f=%MSSysRegFVec(t),fpt=%MSProb(t,f),log(fpt)

maximize(start=(pstar=%MSSysRegInit()),parmset=regparms+msparms,$

method=bhhh,iters=10) logl gstart gend
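The recursion behind the LOGL formula can be written out as follows. This is a sketch of the standard filter, not a transcription of the procedure code; it assumes the convention that P(i,j) is the probability of moving to regime i from regime j (so columns sum to one, consistent with dropping the bottom row for estimation). Here f_t(i) is element i of the vector returned by %MSSysRegFVec(t), and the predicted and filtered probabilities correspond to pt_t1 and pt_t.

```latex
\begin{aligned}
p_{t|t-1}(i) &= \sum_{j=1}^{M} P(i,j)\, p_{t-1|t-1}(j)
  && \text{(prediction)} \\
L_t &= \sum_{i=1}^{M} f_t(i)\, p_{t|t-1}(i)
  && \text{(likelihood of entry } t\text{, non-logged)} \\
p_{t|t}(i) &= \frac{f_t(i)\, p_{t|t-1}(i)}{L_t}
  && \text{(Bayes update)} \\
\log L &= \sum_{t} \log L_t
  && \text{(function value summed by MAXIMIZE)}
\end{aligned}
```

The recursion is started from the ergodic probabilities that the START option stores in PSTAR.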

Output

This shows the output from the EM step (which just traces the log likelihoods) and from the BHHH final estimates. Note that the BHHH is intentionally stopped before it converges: there's no real advantage to iterating further since (unlike BFGS) the covariance matrix estimate won't really change much from iteration to iteration. Also note that the BHHH log likelihood is slightly better (in this case) than the EM log likelihood. EM actually computes the log likelihood slightly differently (because of how the initial values for the regime probabilities are done) so there will always be a very slight difference, which can go in either direction.

Iteration 1 Log likelihood -1260.92853

Iteration 2 Log likelihood -1198.13645

Iteration 3 Log likelihood -1169.06832

Iteration 4 Log likelihood -1159.42407

Iteration 5 Log likelihood -1154.24922

Iteration 6 Log likelihood -1153.38713

Iteration 7 Log likelihood -1153.31442

Iteration 8 Log likelihood -1153.31141

Iteration 9 Log likelihood -1153.31141

Iteration 10 Log likelihood -1153.31147

Iteration 11 Log likelihood -1153.31149

...

Iteration 49 Log likelihood -1153.31151

Iteration 50 Log likelihood -1153.31151

MAXIMIZE - Estimation by BHHH

NO CONVERGENCE IN 10 ITERATIONS

LAST CRITERION WAS 0.0000000

Monthly Data From 1973:04 To 2000:12

Usable Observations 333

Function Value -1153.2721

Variable Coeff Std Error T-Stat Signif

************************************************************************************

1. BETASYS(1)(1,1) 1.3025018 0.0966859 13.47147 0.00000000

2. BETASYS(1)(2,1) -0.1184801 0.1551684 -0.76356 0.44513065

3. BETASYS(1)(3,1) -0.2301416 0.0971064 -2.36999 0.01778834

4. BETASYS(1)(4,1) -0.0712861 0.4861219 -0.14664 0.88341426

5. BETASYS(1)(5,1) -0.3441576 0.7052440 -0.48800 0.62555132

6. BETASYS(1)(6,1) 0.4326661 0.4004606 1.08042 0.27995472

7. BETASYS(1)(7,1) 0.0322534 0.0712970 0.45238 0.65099485

8. BETASYS(1)(8,1) -0.0128712 0.1151107 -0.11182 0.91096926

9. BETASYS(1)(9,1) -0.0303329 0.0685360 -0.44258 0.65806736

10. BETASYS(1)(10,1) 16.1171711 12.1435052 1.32723 0.18443406

11. BETASYS(1)(1,2) -0.0042298 0.0381574 -0.11085 0.91173330

12. BETASYS(1)(2,2) 0.0567823 0.0561936 1.01048 0.31226685

13. BETASYS(1)(3,2) -0.0299722 0.0281244 -1.06570 0.28655914

14. BETASYS(1)(4,2) 1.1110915 0.1226603 9.05828 0.00000000

15. BETASYS(1)(5,2) 0.0076857 0.1463613 0.05251 0.95812071

16. BETASYS(1)(6,2) -0.1331041 0.1033310 -1.28813 0.19769981

17. BETASYS(1)(7,2) 0.0203414 0.0172547 1.17889 0.23844077

18. BETASYS(1)(8,2) -0.0069076 0.0273464 -0.25260 0.80058112

19. BETASYS(1)(9,2) -0.0056940 0.0152100 -0.37436 0.70813872

20. BETASYS(1)(10,2) -5.5079240 3.8261806 -1.43954 0.14999877

21. BETASYS(1)(1,3) -0.1359305 0.2128054 -0.63876 0.52298229

22. BETASYS(1)(2,3) 0.1406804 0.3341148 0.42105 0.67371568

23. BETASYS(1)(3,3) -0.0018366 0.2046396 -0.00897 0.99283914

24. BETASYS(1)(4,3) 1.9101624 0.9360367 2.04069 0.04128147

25. BETASYS(1)(5,3) -1.4219866 1.0299190 -1.38068 0.16737795

26. BETASYS(1)(6,3) -0.4268509 0.8357095 -0.51076 0.60951587

27. BETASYS(1)(7,3) 1.1247632 0.1221871 9.20525 0.00000000

28. BETASYS(1)(8,3) -0.0311582 0.1915569 -0.16266 0.87078804

29. BETASYS(1)(9,3) -0.1288451 0.1205220 -1.06906 0.28504318

30. BETASYS(1)(10,3) -18.4432940 29.8566941 -0.61773 0.53675513

31. BETASYS(2)(1,1) 1.0099719 0.0786500 12.84135 0.00000000

32. BETASYS(2)(2,1) 0.0757769 0.1184549 0.63971 0.52236023

33. BETASYS(2)(3,1) -0.1289917 0.0815790 -1.58119 0.11383525

34. BETASYS(2)(4,1) 0.4351118 0.4084667 1.06523 0.28677098

35. BETASYS(2)(5,1) -1.6477606 0.6487385 -2.53995 0.01108697

36. BETASYS(2)(6,1) 1.2104123 0.5067981 2.38835 0.01692412

37. BETASYS(2)(7,1) 0.0047449 0.0053218 0.89160 0.37260985

38. BETASYS(2)(8,1) -0.0151109 0.0080892 -1.86803 0.06175851

39. BETASYS(2)(9,1) 0.0078826 0.0049171 1.60308 0.10891595

40. BETASYS(2)(10,1) 21.0452145 9.6384022 2.18348 0.02900082

41. BETASYS(2)(1,2) -0.0058633 0.0153751 -0.38135 0.70294499

42. BETASYS(2)(2,2) -0.0109034 0.0199546 -0.54641 0.58478179

43. BETASYS(2)(3,2) 0.0260421 0.0131540 1.97979 0.04772704

44. BETASYS(2)(4,2) 1.0624433 0.0709570 14.97306 0.00000000

45. BETASYS(2)(5,2) -0.1426209 0.0974345 -1.46376 0.14325911

46. BETASYS(2)(6,2) 0.0752892 0.0548892 1.37166 0.17017027

47. BETASYS(2)(7,2) 0.0017384 0.0008791 1.97738 0.04799906

48. BETASYS(2)(8,2) -0.0018626 0.0013142 -1.41722 0.15641910

49. BETASYS(2)(9,2) 0.0008719 0.0007719 1.12957 0.25865920

50. BETASYS(2)(10,2) -1.5828601 1.1938910 -1.32580 0.18490611

51. BETASYS(2)(1,3) 1.5432515 1.5792307 0.97722 0.32846162

52. BETASYS(2)(2,3) -0.8768512 2.0595459 -0.42575 0.67029017

53. BETASYS(2)(3,3) -0.8624865 1.2961518 -0.66542 0.50578135

54. BETASYS(2)(4,3) -0.1045067 7.4597585 -0.01401 0.98882248

55. BETASYS(2)(5,3) 19.3910528 11.4624206 1.69171 0.09070193

56. BETASYS(2)(6,3) -19.1875197 6.9773537 -2.74997 0.00596006

57. BETASYS(2)(7,3) 1.0983810 0.0702989 15.62444 0.00000000

58. BETASYS(2)(8,3) -0.1761646 0.0917278 -1.92051 0.05479306

59. BETASYS(2)(9,3) -0.0439301 0.0641593 -0.68470 0.49353099

60. BETASYS(2)(10,3) 66.8447170 179.1300609 0.37316 0.70902711

61. SIGMAV(1)(1,1) 0.7747708 0.0905635 8.55500 0.00000000

62. SIGMAV(1)(2,1) 0.0335175 0.0315907 1.06099 0.28869338

63. SIGMAV(1)(2,2) 0.0539895 0.0070497 7.65846 0.00000000

64. SIGMAV(1)(3,1) 0.2189912 0.1601335 1.36755 0.17145185

65. SIGMAV(1)(3,2) 0.0585223 0.0498984 1.17283 0.24086406

66. SIGMAV(1)(3,3) 3.2731541 0.3523026 9.29075 0.00000000

67. SIGMAV(2)(1,1) 0.2857325 0.0351317 8.13317 0.00000000

68. SIGMAV(2)(2,1) -0.0011356 0.0041482 -0.27374 0.78428095

69. SIGMAV(2)(2,2) 0.0079367 0.0009470 8.38100 0.00000000

70. SIGMAV(2)(3,1) -0.3336294 0.4603887 -0.72467 0.46865512

71. SIGMAV(2)(3,2) 0.1333017 0.0694924 1.91822 0.05508309

72. SIGMAV(2)(3,3) 93.4450870 9.5924869 9.74149 0.00000000

73. P(1,1) 0.9732715 0.0163812 59.41402 0.00000000

74. P(1,2) 0.0176582 0.0190685 0.92604 0.35442354