I'm wondering if anyone has worked on implementing the fast-MCD algorithm (or the newer det-MCD, from the 2010 working paper by Hubert, Rousseeuw, and Verdonck) for robust estimation of covariance matrices. (This would be an input, for example, into a solid multivariate outlier detection routine, or into a canonical correlation analysis that is not distorted by outliers, and those are just two applications.)
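To make the outlier-detection use case concrete, here is a minimal Python sketch. It assumes scikit-learn's `MinCovDet` (which, as far as I know, implements fast-MCD) together with SciPy. The helper name `mcd_outliers`, the 97.5% chi-square cutoff, and the simulated data are my own choices for illustration only.

```python
import numpy as np
from scipy.stats import chi2
from sklearn.covariance import MinCovDet


def mcd_outliers(X, quantile=0.975, random_state=0):
    """Flag rows whose robust squared Mahalanobis distance exceeds a chi-square cutoff.

    Hypothetical helper: the cutoff rule is a common convention, not part of MCD itself.
    """
    mcd = MinCovDet(random_state=random_state).fit(X)
    d2 = mcd.mahalanobis(X)  # squared distances w.r.t. the robust location/scatter
    cutoff = chi2.ppf(quantile, df=X.shape[1])
    return d2 > cutoff, mcd


# Simulated data (an assumption for the demo): 200 clean points plus 20 shifted ones.
rng = np.random.default_rng(0)
clean = rng.standard_normal((200, 3))
shifted = rng.standard_normal((20, 3)) + 8.0
X = np.vstack([clean, shifted])

flags, mcd = mcd_outliers(X)
print("flagged among shifted rows:", int(flags[200:].sum()), "of 20")
print("robust location:", mcd.location_)
```

The fitted object also exposes `covariance_`, which is the robust covariance matrix one would pass on to other procedures.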
There is MATLAB code here (http://www.mathworks.com/matlabcentral/ ... /fastmcd.m), and the algorithm is described informally here (https://tr8dr.wordpress.com/2010/09/24/ ... rmination/).
Eric Blankmeyer coded up an MCD routine in mcd.src, which is in the BC software repository (http://econpapers.repec.org/software/bo ... 931601.htm). At first glance it appears to implement fast-MCD, but I find that this proc takes forever once the number of variables exceeds 3 or 4.
By the way, det-MCD is much faster, and is supposedly available in MATLAB as part of LIBRA. (There are also supposedly implementations of both algorithms in R.) Coding det-MCD will take more work, since it explicitly starts from 7 distinct initial estimates of the covariance matrix before the iteration steps begin (which, I believe, are then carried out the same way for each of the 7).
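As I understand it, the iteration step shared by fast-MCD and det-MCD is the "concentration step" (C-step): fit a mean and covariance to the current subset of h points, then replace the subset with the h points that have the smallest Mahalanobis distances under that fit, and repeat until the subset stops changing. Below is a rough NumPy sketch of just that step. The function names are mine, and `median_start` is a deliberately simple stand-in for a starting subset. It is not one of the initial estimators from the det-MCD paper, nor the random starts used by fast-MCD.

```python
import numpy as np


def c_step(X, subset, h):
    """One concentration step: refit on `subset`, return the h rows closest to that fit."""
    mu = X[subset].mean(axis=0)
    cov = np.cov(X[subset], rowvar=False)
    diff = X - mu
    # Squared Mahalanobis distance of every row under the subset's fit.
    d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    return np.sort(np.argsort(d2)[:h])


def concentrate(X, start, h, max_iter=100):
    """Iterate C-steps from `start` until the subset no longer changes."""
    subset = np.sort(np.asarray(start))
    for _ in range(max_iter):
        new = c_step(X, subset, h)
        if np.array_equal(new, subset):
            break
        subset = new
    return X[subset].mean(axis=0), np.cov(X[subset], rowvar=False), subset


def median_start(X, h):
    """Hypothetical start: the h rows closest to the coordinatewise median (MAD-scaled)."""
    med = np.median(X, axis=0)
    mad = np.median(np.abs(X - med), axis=0)
    z = (X - med) / mad
    return np.sort(np.argsort((z ** 2).sum(axis=1))[:h])


# Simulated data (an assumption for the demo): 100 clean rows, 20 shifted rows.
rng = np.random.default_rng(1)
X = np.vstack([rng.standard_normal((100, 2)), rng.standard_normal((20, 2)) + 10.0])
h = (X.shape[0] + X.shape[1] + 1) // 2
start = median_start(X, h)
mu, cov, subset = concentrate(X, start, h)
print("raw location estimate:", mu)
```

A full implementation would run this from several starts, keep the result with the smallest covariance determinant, and then apply consistency and reweighting corrections, none of which are shown here.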
PS. I am thinking that robust canonical correlation might be the basis for a robust multivariate Granger causality test. Gelper and Croux (2007?), cited in Forni and Gambetti's JME paper, got me thinking along these lines. (But *their* canonical correlation test does not seem to be standard.)
PPS. I just read an article in a stats journal (Zhang, Olive and Ye, "Robust Covariance Matrix Estimation with Canonical Correlation Analysis", 2012). Evidently there is no large-sample theory for the fast-MCD algorithm, only for ordinary MCD, which is not practical to compute. Moreover, in this study fast-MCD failed spectacularly on two of the data sets examined. One of their conclusions: "F-MCD does not work well as a robust CCA technique." The study supports the "RMVN" estimator, which is based upon the "FCH" estimator. They assert that this algorithm is not only more accurate but also two orders of magnitude faster than fast-MCD.
There is R code at (http://lagrange.math.siu.edu/Olive/mpack.txt), though this page consists of a ton of R code, not just code for the RMVN estimator.
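For what it's worth, once any covariance estimate of the stacked variables is in hand (classical, MCD, or otherwise), the "plug-in" canonical correlations are just the square roots of the eigenvalues of Sxx^-1 Sxy Syy^-1 Syx. Here is a small NumPy sketch of that calculation. The function name and the partitioning convention (first `p` variables versus the rest) are my own assumptions, and this is not the RMVN or FCH code from the page above.

```python
import numpy as np


def canonical_correlations(cov, p):
    """Canonical correlations between the first p variables and the remaining ones.

    `cov` is any joint covariance estimate of the stacked variables; a robust
    estimate can be plugged in here in place of the classical one.
    """
    sxx, sxy, syy = cov[:p, :p], cov[:p, p:], cov[p:, p:]
    # Eigenvalues of Sxx^-1 Sxy Syy^-1 Syx are the squared canonical correlations.
    m = np.linalg.solve(sxx, sxy) @ np.linalg.solve(syy, sxy.T)
    eig = np.sort(np.linalg.eigvals(m).real)[::-1]
    k = min(p, cov.shape[0] - p)
    return np.sqrt(np.clip(eig[:k], 0.0, 1.0))


# Two variables with correlation 0.5: the single canonical correlation is 0.5.
toy = np.array([[1.0, 0.5], [0.5, 1.0]])
rho = canonical_correlations(toy, p=1)
print(rho)
```

With real data one would pass in `np.cov(Z, rowvar=False)` for the classical version, or a robust covariance matrix of the same stacked data `Z` for a robust version.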
MCD and fast-MCD algorithm (or det-MCD or FCH)
randal_verbrugge