Statistics and Algorithms / GARCH Models / GARCH Models (Multivariate) / MV GARCH VECH Models / BEKK, Diagonal BEKK (DBEKK), Triangular BEKK (TBEKK)
The BEKK formulation (Engle and Kroner, 1995) directly imposes positive definiteness on the variance matrix:
\begin{equation} {\bf{H}}_t = {\bf{C}}{\bf{C}}' + {\bf{A}}'{\bf{u}}_{t - 1} {\bf{u}}'_{t - 1} {\bf{A}} + {\bf{B}}'{\bf{H}}_{t - 1} {\bf{B}} \label{eq:garch_bekk} \end{equation}
Since each term is positive semi-definite by construction, \({\bf{H}}_t\) can never leave the space of positive semi-definite matrices, so the optimizer cannot stray into regions where the likelihood is undefined. However, while positive definiteness is assured, it comes at the cost of a poorly behaved likelihood function. Because all the parameters enter through quadratic forms, they aren't globally identified: changing the signs of all elements of \(\bf{C}\), \(\bf{A}\) or \(\bf{B}\) has no effect on the function value. The guess values used by RATS tend to force the diagonals towards positive values, but if a GARCH model doesn't fit the data particularly well, it's possible for them to "flip" signs at some point in the estimation.
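To make the recursion concrete, here is a minimal sketch in RATS with made-up 2x2 parameter values (none of these numbers come from an estimated model): it computes one step of \eqref{eq:garch_bekk}, then verifies that flipping the sign of every element of \(\bf{B}\) leaves \({\bf{H}}_t\) unchanged.
*
* One step of the BEKK recursion with illustrative (made-up) values
compute cc   = ||0.08,0.00|0.03,0.06||
compute aa   = ||0.36,0.10|0.04,0.40||
compute bb   = ||0.94,-0.03|-0.01,0.91||
compute ulag = ||0.5|-0.2||
compute hlag = ||1.0,0.3|0.3,1.0||
compute h1   = cc*tr(cc)+tr(aa)*ulag*tr(ulag)*aa+tr(bb)*hlag*bb
*
* Flip the signs of all elements of B; the result is identical
compute bneg = -1.0*bb
compute h2   = cc*tr(cc)+tr(aa)*ulag*tr(ulag)*aa+tr(bneg)*hlag*bneg
display h1 h2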
\(\bf{C}\) can have only \(n(n+1)/2\) free parameters, so GARCH parameterizes it to be lower triangular. This term is often written as the equivalent \({\bf{C'C}}\) where \(\bf{C}\) is upper triangular; the resulting product matrix is the same in either case.
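As a quick check (arbitrary numbers, for illustration only), the two parameterizations produce the identical matrix:
compute clow = ||0.08,0.00|0.03,0.06||
compute cup  = tr(clow)
compute p1   = clow*tr(clow)
compute p2   = tr(cup)*cup
display p1 p2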
We have chosen to retain the positioning of the transposes for \(\bf{A}\) and \(\bf{B}\) in \eqref{eq:garch_bekk} that was used in the original papers and is now standard in the literature. However, note that this switches the meaning of the subscripts in those matrices from what is common in most multivariate analysis (not just GARCH, but VARs as well): for the BEKK, A(i,j) gives the effect of residual i on variable j, so the target is in the column, not the row.
Note that, while in most cases the estimates will have positive values on the diagonals of \(\bf{A}\) and \(\bf{B}\), it is not unreasonable (and in fact not unexpected) for some off-diagonal elements of \({\bf{A}}\) (in particular) and \({\bf{B}}\) to be negative even where the diagonal elements are positive. This is most easily seen with \({\bf{A}}\): define \({{\bf{v}}_{t - 1}} = {\bf{A}}'{{\bf{u}}_{t - 1}}\), which is an \(n\) vector. The contribution of the "ARCH" term to the covariance matrix is then \({\bf{v}}_{t - 1} {\bf{v}}'_{t - 1}\), which means that the squares of the elements of \({\bf{v}}\) are the contributions to the variances themselves. To have "spillover" effects, so that shocks in one component affect the variance of another, \({\bf{v}}\) has to be a linear combination of the different components of \({\bf{u}}\). Negative coefficients in the off-diagonals of \({\bf{A}}\) mean that the variance is affected more when the shocks move in opposite directions than when they move in the same direction, which is not unreasonable in many situations. It's also possible (unlikely in practice, but still possible) for the diagonal elements in \({\bf{A}}\) (or, less likely, \({\bf{B}}\)) to have opposite signs. For instance, if the correlation between two components is near zero, the sign of any column of \({\bf{A}}\) (or \({\bf{B}}\)) has little effect on the likelihood. In most applications, the correlations among the variables tend to be high and positive, but increasingly GARCH models are being applied to series for which that is not the case.
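A small sketch (with a made-up \(\bf{A}\), not taken from the output below) illustrates this. With \(A(1,2)<0\), residual 1 enters \(v_2\) with a negative weight, so opposite-signed shocks produce a larger contribution to the second variance than same-signed shocks of equal size:
* A(1,2)<0: residual 1 gets a negative weight in the variance of variable 2
compute aa    = ||0.30,-0.10|0.05,0.25||
compute usame = ||1.0|1.0||
compute uopp  = ||1.0|-1.0||
* v = A'u; the ARCH contribution to H(t) is v*v'
compute vsame = tr(aa)*usame,vopp = tr(aa)*uopp
compute hsame = vsame*tr(vsame),hopp = vopp*tr(vopp)
display hsame hopp
Comparing the (2,2) elements (0.0225 versus 0.1225) shows the larger variance contribution when the shocks move in opposite directions.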
Another common question is how it's possible for the off-diagonals in the \({\bf{A}}\) and \({\bf{B}}\) matrices to be larger than the diagonals, since one would expect that the "own" effect would be dominant. However, the values of the coefficients are sensitive to the scales of the variables, since nothing in the recursion is standardized to a common variance. If you multiply component i by .01 relative to j, its residuals are also multiplied by a factor of .01, so the coefficient A(i,j), which applies residual i to the variance of j, has to go up by a factor of 100. Rescaling a variable keeps the diagonals of \({\bf{A}}\) and \({\bf{B}}\) the same, but forces a change in scale of the off-diagonals. Even without such asymmetrical scalings, the tendency will be for (relatively) higher variance series to have lower off-diagonal coefficients than lower variance series.
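One way to see this directly is to rescale one of the series and re-estimate. A sketch using the GARCHMV.RPF data (the name XJPN100 is ours): the diagonals of \(\bf{A}\) and \(\bf{B}\) should come out essentially unchanged, while the off-diagonals in the first row and column are scaled down or up by (roughly) a factor of 100.
* Rescale XJPN by 100 and re-estimate; compare with the output below
set xjpn100 = 100.0*xjpn
garch(p=1,q=1,mv=bekk,pmethod=simplex,piters=10) / xjpn100 xfra xsui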
Choose the BEKK model with MV=BEKK. DBEKK and TBEKK are restricted forms of this. DBEKK (diagonal BEKK, chosen with MV=DBEKK) makes \(\bf{A}\) and \(\bf{B}\) diagonal. TBEKK (triangular BEKK, chosen with MV=TBEKK) makes the pre-multiplying matrices \({\bf{A'}}\) and \({\bf{B'}}\) in \eqref{eq:garch_bekk} lower triangular. (The coefficients will actually be reported for the lower triangles). This creates a recursive ordering among the variables, and (unlike almost all other model types) makes the model order important. (TBEKK is sometimes defined with the pre-multiplying matrix being upper triangular, which produces a counter-intuitive situation where the first variable in the model is actually last in the ordering of effects.)
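The restricted forms use the same syntax with a different choice for the MV option; for instance, with the same three series:
garch(p=1,q=1,mv=dbekk,pmethod=simplex,piters=10) / xjpn xfra xsui
garch(p=1,q=1,mv=tbekk,pmethod=simplex,piters=10) / xjpn xfra xsui
With MV=TBEKK, remember that listing the series in a different order gives a different model.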
A BEKK model can also be difficult to fit, which is why the example uses a small number of preliminary simplex iterations. The example from GARCHMV.RPF is:
garch(p=1,q=1,mv=bekk,pmethod=simplex,piters=10) / xjpn xfra xsui
Because of the structure of the BEKK model, extensions to allow asymmetry or variance shifts are more complicated than they are for most other model forms.
BEKK is a special case of the VECH because, for instance, \({\bf{B'H}}_{t - 1} {\bf{B}}\) is a linear function of the elements of \({\bf{H}}_{t - 1} \), with the coefficients being a complicated quadratic in \(\bf{B}\). (Similarly for the "ARCH" term). It is possible to use SUMMARIZE to compute approximate standard errors for the VECH representation coefficients, but there is very little useful information in those.
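To see the mapping explicitly, apply the standard identity \(\mathrm{vec}\left( {\bf{ABC}} \right) = \left( {\bf{C}}' \otimes {\bf{A}} \right)\mathrm{vec}\left( {\bf{B}} \right)\) to the GARCH term:
\begin{equation*} \mathrm{vec}\left( {\bf{B}}'{\bf{H}}_{t - 1} {\bf{B}} \right) = \left( {\bf{B}} \otimes {\bf{B}} \right)^\prime \mathrm{vec}\left( {\bf{H}}_{t - 1} \right) \end{equation*}
so each coefficient in the VECH representation on \({\bf{H}}_{t - 1}\) is a product of two elements of \(\bf{B}\), and similarly for the "ARCH" term with \(\bf{A}\).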
Output
MV-GARCH, BEKK - Estimation by BFGS
Convergence in 86 Iterations. Final criterion was 0.0000086 <= 0.0000100
Usable Observations 6236
Log Likelihood -11821.7457
Variable Coeff Std Error T-Stat Signif
************************************************************************************
1. Mean(XJPN) 0.005284238 0.005779110 0.91437 0.36052308
2. Mean(XFRA) -0.002360430 0.004150748 -0.56868 0.56957616
3. Mean(XSUI) -0.002505826 0.004919824 -0.50933 0.61051925
4. C(1,1) 0.082827551 0.005082889 16.29537 0.00000000
5. C(2,1) 0.029966933 0.006774462 4.42352 0.00000971
6. C(2,2) 0.055802023 0.004748283 11.75204 0.00000000
7. C(3,1) 0.037995437 0.007654196 4.96400 0.00000069
8. C(3,2) -0.004017902 0.006806450 -0.59031 0.55498413
9. C(3,3) 0.058506480 0.006245748 9.36741 0.00000000
10. A(1,1) 0.359535262 0.012077169 29.76983 0.00000000
11. A(1,2) 0.102691494 0.009048917 11.34848 0.00000000
12. A(1,3) 0.111082248 0.011812194 9.40403 0.00000000
13. A(2,1) 0.038123247 0.014041254 2.71509 0.00662580
14. A(2,2) 0.403444341 0.016261862 24.80923 0.00000000
15. A(2,3) -0.066355330 0.013370391 -4.96286 0.00000069
16. A(3,1) -0.047522551 0.010449514 -4.54782 0.00000542
17. A(3,2) -0.125553482 0.012149506 -10.33404 0.00000000
18. A(3,3) 0.291344292 0.010526939 27.67607 0.00000000
19. B(1,1) 0.935272064 0.003791535 246.67373 0.00000000
20. B(1,2) -0.026717483 0.003105677 -8.60279 0.00000000
21. B(1,3) -0.028574502 0.004086792 -6.99192 0.00000000
22. B(2,1) -0.012475081 0.005562790 -2.24259 0.02492300
23. B(2,2) 0.909746081 0.006295125 144.51597 0.00000000
24. B(2,3) 0.029269416 0.005403727 5.41652 0.00000006
25. B(3,1) 0.016548608 0.004489013 3.68647 0.00022739
26. B(3,2) 0.048830173 0.004991829 9.78202 0.00000000
27. B(3,3) 0.946852761 0.004722801 200.48544 0.00000000