Statistics and Algorithms / GARCH Models / GARCH Models (Multivariate) / MV GARCH VECH Models / BEKK, Diagonal BEKK (DBEKK), Triangular BEKK (TBEKK)
The BEKK formulation (Engle and Kroner, 1995) directly imposes positive definiteness on the variance matrix:
\begin{equation} {\bf{H}}_t = {\bf{C}}{\bf{C}}' + {\bf{A}}'{\bf{u}}_{t - 1} {\bf{u}}'_{t - 1} {\bf{A}} + {\bf{B}}'{\bf{H}}_{t - 1} {\bf{B}} \label{eq:garch_bekk} \end{equation}
Since each term is positive semi-definite by construction, \({\bf{H}}_t\) can never leave the space of positive semi-definite matrices, so the optimizer cannot stray into regions where the likelihood is undefined. However, while positive definiteness is assured, it comes at the cost of a poorly behaved likelihood function. Because all the parameters enter through quadratic forms, they aren't globally identified: changing the signs of all elements of \(\bf{C}\), \(\bf{A}\) or \(\bf{B}\) has no effect on the function value. The guess values used by RATS tend to force the diagonals towards positive values, but if a GARCH model doesn't fit the data particularly well, it's possible for them to "flip" signs at some point in the estimation.
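To make the recursion concrete, here is a minimal sketch in RATS with made-up 2x2 parameter values (none of these numbers come from an estimated model): it computes one step of \eqref{eq:garch_bekk}, then verifies that flipping the sign of every element of \(\bf{B}\) leaves \({\bf{H}}_t\) unchanged.
*
* One step of the BEKK recursion with illustrative (made-up) values
compute cc   = ||0.08,0.00|0.03,0.06||
compute aa   = ||0.36,0.10|0.04,0.40||
compute bb   = ||0.94,-0.03|-0.01,0.91||
compute ulag = ||0.5|-0.2||
compute hlag = ||1.0,0.3|0.3,1.0||
compute h1   = cc*tr(cc)+tr(aa)*ulag*tr(ulag)*aa+tr(bb)*hlag*bb
*
* Flip the signs of all elements of B; the result is identical
compute bneg = -1.0*bb
compute h2   = cc*tr(cc)+tr(aa)*ulag*tr(ulag)*aa+tr(bneg)*hlag*bneg
display h1 h2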
\(\bf{C}\) can have only \(n(n+1)/2\) free parameters, so GARCH parameterizes it to be lower triangular. This term is often written as the equivalent \({\bf{C'C}}\) where \(\bf{C}\) is upper triangular; the resulting product matrix is the same in either case.
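As a quick check (arbitrary numbers, for illustration only), the two parameterizations produce the identical matrix:
compute clow = ||0.08,0.00|0.03,0.06||
compute cup  = tr(clow)
compute p1   = clow*tr(clow)
compute p2   = tr(cup)*cup
display p1 p2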
We have chosen to retain the positioning of the transposes for \(\bf{A}\) and \(\bf{B}\) in \eqref{eq:garch_bekk} that was used in the original papers and is now standard in the literature. However, note that this switches the meaning of the subscripts in those matrices from what is common in most multivariate analysis (not just GARCH, but VARs as well): for the BEKK, A(i,j) gives the effect of residual i on variable j, so the target is in the column, not the row.
Note that, while in most cases the estimates will have positive values on the diagonals of \(\bf{A}\) and \(\bf{B}\), it is not unreasonable (and in fact not unexpected) for some off-diagonal elements of \({\bf{A}}\) (in particular) and \({\bf{B}}\) to be negative even where the diagonal elements are positive. This is most easily seen with \({\bf{A}}\): define \({{\bf{v}}_{t - 1}} = {\bf{A}}'{{\bf{u}}_{t - 1}}\), which is an \(n\) vector. The contribution of the "ARCH" term to the covariance matrix is then \({\bf{v}}_{t - 1} {\bf{v}}'_{t - 1}\), which means that the squares of the elements of \({\bf{v}}\) are the contributions to the variances themselves. To have "spillover" effects, so that shocks in one component affect the variance of another, \({\bf{v}}\) has to be a linear combination of the different components of \({\bf{u}}\). Negative coefficients in the off-diagonals of \({\bf{A}}\) mean that the variance is affected more when the shocks move in opposite directions than when they move in the same direction, which is not unreasonable in many situations. It's also possible (unlikely in practice, but still possible) for the diagonal elements in \({\bf{A}}\) (or, less likely, \({\bf{B}}\)) to have opposite signs. For instance, if the correlation between two components is near zero, the sign of any column of \({\bf{A}}\) (or \({\bf{B}}\)) has little effect on the likelihood. In most applications, the correlations among the variables tend to be high and positive, but increasingly GARCH models are being applied to series for which that is not the case.
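A small sketch (with a made-up \(\bf{A}\), not taken from the output below) illustrates this. With \(A(1,2)<0\), residual 1 enters \(v_2\) with a negative weight, so opposite-signed shocks produce a larger contribution to the second variance than same-signed shocks of equal size:
* A(1,2)<0: residual 1 gets a negative weight in the variance of variable 2
compute aa    = ||0.30,-0.10|0.05,0.25||
compute usame = ||1.0|1.0||
compute uopp  = ||1.0|-1.0||
* v = A'u; the ARCH contribution to H(t) is v*v'
compute vsame = tr(aa)*usame,vopp = tr(aa)*uopp
compute hsame = vsame*tr(vsame),hopp = vopp*tr(vopp)
display hsame hopp
Comparing the (2,2) elements (0.0225 versus 0.1225) shows the larger variance contribution when the shocks move in opposite directions.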
Another common question is how it's possible for the off-diagonals in the \({\bf{A}}\) and \({\bf{B}}\) matrices to be larger than the diagonals, since one would expect that the "own" effect would be dominant. However, the values of the coefficients are sensitive to the scales of the variables, since nothing in the recursion is standardized to a common variance. If you multiply component i by .01 relative to j, its residuals are also multiplied by a factor of .01, so the coefficient A(i,j), which applies residual i to the variance of j, has to go up by a factor of 100. Rescaling a variable keeps the diagonals of \({\bf{A}}\) and \({\bf{B}}\) the same, but forces a change in scale of the off-diagonals. Even without such asymmetrical scalings, the tendency will be for (relatively) higher variance series to have lower off-diagonal coefficients than lower variance series.
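One way to see this directly is to rescale one of the series and re-estimate. A sketch using the GARCHMV.RPF data (the name XJPN100 is ours): the diagonals of \(\bf{A}\) and \(\bf{B}\) should come out essentially unchanged, while the off-diagonals in the first row and column are scaled down or up by (roughly) a factor of 100.
* Rescale XJPN by 100 and re-estimate; compare with the output below
set xjpn100 = 100.0*xjpn
garch(p=1,q=1,mv=bekk,pmethod=simplex,piters=10) / xjpn100 xfra xsui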
Choose the BEKK model with MV=BEKK. DBEKK and TBEKK are restricted forms of this. DBEKK (diagonal BEKK, chosen with MV=DBEKK) makes \(\bf{A}\) and \(\bf{B}\) diagonal. TBEKK (triangular BEKK, chosen with MV=TBEKK) makes the pre-multiplying matrices \({\bf{A'}}\) and \({\bf{B'}}\) in \eqref{eq:garch_bekk} lower triangular. (The coefficients will actually be reported for the lower triangles). This creates a recursive ordering among the variables, and (unlike almost all other model types) makes the model order important. (TBEKK is sometimes defined with the pre-multiplying matrix being upper triangular, which produces a counter-intuitive situation where the first variable in the model is actually last in the ordering of effects.)
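The restricted forms use the same syntax with a different choice for the MV option; for instance, with the same three series:
garch(p=1,q=1,mv=dbekk,pmethod=simplex,piters=10) / xjpn xfra xsui
garch(p=1,q=1,mv=tbekk,pmethod=simplex,piters=10) / xjpn xfra xsui
With MV=TBEKK, remember that listing the series in a different order gives a different model.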
A BEKK model can also be difficult to fit, which is why the example uses a small number of preliminary simplex iterations. The example from GARCHMV.RPF is:
garch(p=1,q=1,mv=bekk,pmethod=simplex,piters=10) / xjpn xfra xsui
Because of the structure of the BEKK model, extensions to allow asymmetry or variance shifts are more complicated than they are for most other model forms.
BEKK is a special case of the VECH because, for instance, \({\bf{B'H}}_{t - 1} {\bf{B}}\) is a linear function of the elements of \({\bf{H}}_{t - 1} \), with the coefficients being a complicated quadratic in \(\bf{B}\). (Similarly for the "ARCH" term). It is possible to use SUMMARIZE to compute approximate standard errors for the VECH representation coefficients, but there is very little useful information in those.
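To see the mapping explicitly, apply the standard identity \(\mathrm{vec}\left( {\bf{ABC}} \right) = \left( {\bf{C}}' \otimes {\bf{A}} \right)\mathrm{vec}\left( {\bf{B}} \right)\) to the GARCH term:
\begin{equation*} \mathrm{vec}\left( {\bf{B}}'{\bf{H}}_{t - 1} {\bf{B}} \right) = \left( {\bf{B}} \otimes {\bf{B}} \right)^\prime \mathrm{vec}\left( {\bf{H}}_{t - 1} \right) \end{equation*}
so each coefficient in the VECH representation on \({\bf{H}}_{t - 1}\) is a product of two elements of \(\bf{B}\), and similarly for the "ARCH" term with \(\bf{A}\).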
Output
MV-GARCH, BEKK - Estimation by BFGS
Convergence in 86 Iterations. Final criterion was 0.0000086 <= 0.0000100
Usable Observations 6236
Log Likelihood -11821.7457
Variable Coeff Std Error T-Stat Signif
************************************************************************************
1. Mean(XJPN) 0.005284238 0.005779110 0.91437 0.36052308
2. Mean(XFRA) -0.002360430 0.004150748 -0.56868 0.56957616
3. Mean(XSUI) -0.002505826 0.004919824 -0.50933 0.61051925
4. C(1,1) 0.082827551 0.005082889 16.29537 0.00000000
5. C(2,1) 0.029966933 0.006774462 4.42352 0.00000971
6. C(2,2) 0.055802023 0.004748283 11.75204 0.00000000
7. C(3,1) 0.037995437 0.007654196 4.96400 0.00000069
8. C(3,2) -0.004017902 0.006806450 -0.59031 0.55498413
9. C(3,3) 0.058506480 0.006245748 9.36741 0.00000000
10. A(1,1) 0.359535262 0.012077169 29.76983 0.00000000
11. A(1,2) 0.102691494 0.009048917 11.34848 0.00000000
12. A(1,3) 0.111082248 0.011812194 9.40403 0.00000000
13. A(2,1) 0.038123247 0.014041254 2.71509 0.00662580
14. A(2,2) 0.403444341 0.016261862 24.80923 0.00000000
15. A(2,3) -0.066355330 0.013370391 -4.96286 0.00000069
16. A(3,1) -0.047522551 0.010449514 -4.54782 0.00000542
17. A(3,2) -0.125553482 0.012149506 -10.33404 0.00000000
18. A(3,3) 0.291344292 0.010526939 27.67607 0.00000000
19. B(1,1) 0.935272064 0.003791535 246.67373 0.00000000
20. B(1,2) -0.026717483 0.003105677 -8.60279 0.00000000
21. B(1,3) -0.028574502 0.004086792 -6.99192 0.00000000
22. B(2,1) -0.012475081 0.005562790 -2.24259 0.02492300
23. B(2,2) 0.909746081 0.006295125 144.51597 0.00000000
24. B(2,3) 0.029269416 0.005403727 5.41652 0.00000006
25. B(3,1) 0.016548608 0.004489013 3.68647 0.00022739
26. B(3,2) 0.048830173 0.004991829 9.78202 0.00000000
27. B(3,3) 0.946852761 0.004722801 200.48544 0.00000000