TCA: High Dimensional Principal Component Analysis for non-Gaussian Data

This video was recorded at the 26th Annual Conference on Neural Information Processing Systems (NIPS), Lake Tahoe 2012. We propose a high dimensional, semiparametric, scale-invariant principal component analysis, named TCA, by utilizing the natural connection between the elliptical distribution family and principal component analysis. The elliptical distribution family includes many well-known multivariate distributions, such as the multivariate t and logistic, and was extended to the meta-elliptical family by Fang (2002) using copula techniques. In this paper we extend the meta-elliptical distribution family to an even larger family, called transelliptical. We prove that TCA can obtain a near-optimal s(log d/n)^{1/2} estimation consistency rate in the transelliptical distribution family, even if the distributions are very heavy-tailed, have infinite second moments, do not have densities, and possess arbitrary continuous marginal distributions. A feature selection result with an explicit rate is also provided. TCA is also applied to both numerical simulations and large-scale stock data to illustrate its empirical performance. Both theory and experiments confirm that TCA can achieve model flexibility, estimation accuracy, and robustness at almost no cost.
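The abstract does not spell out the estimator, so the following is only a minimal sketch of one way a scale-invariant, rank-based sparse PCA could look. It assumes the correlation matrix is estimated from pairwise Kendall's tau through the transform sin(pi/2 * tau), and that a sparse leading direction is found by a truncated power iteration. The function names, the toy data, and the sparsity level s = 2 are all hypothetical choices made for illustration, not details taken from the talk.

```python
import math
import random


def kendall_tau(x, y):
    """Kendall's tau-a for two equal-length sequences (O(n^2), no tie correction)."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return 2.0 * score / (n * (n - 1))


def rank_correlation_matrix(columns):
    """Correlation estimate sin(pi/2 * tau); depends only on the ranks of each column."""
    d = len(columns)
    R = [[1.0] * d for _ in range(d)]
    for j in range(d):
        for k in range(j + 1, d):
            r = math.sin(math.pi / 2 * kendall_tau(columns[j], columns[k]))
            R[j][k] = R[k][j] = r
    return R


def truncated_power_method(R, s, iters=100):
    """Power iteration that keeps only the s largest-magnitude entries at each step."""
    d = len(R)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(R[i][j] * v[j] for j in range(d)) for i in range(d)]
        keep = set(sorted(range(d), key=lambda i: abs(w[i]), reverse=True)[:s])
        w = [w[i] if i in keep else 0.0 for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def sample_heavy_tailed(n, rng):
    """Toy multivariate Cauchy-like data: 5 variables, only the first two correlated (0.8)."""
    rows = []
    for _ in range(n):
        z = [rng.gauss(0, 1) for _ in range(5)]
        z[1] = 0.8 * z[0] + 0.6 * z[1]
        scale = abs(rng.gauss(0, 1))  # shared random scale gives infinite variance
        rows.append([zi / scale for zi in z])
    return [list(col) for col in zip(*rows)]


rng = random.Random(0)
columns = sample_heavy_tailed(300, rng)
R = rank_correlation_matrix(columns)
v = truncated_power_method(R, s=2)

# Strictly increasing marginal transforms leave the ranks, and hence R, unchanged.
transformed = [[x ** 3 for x in col] for col in columns]
R_t = rank_correlation_matrix(transformed)

print("estimated correlation of variables 0 and 1:", round(R[0][1], 3))
print("sparse leading direction:", [round(x, 3) for x in v])
```

Because the estimate uses only the ordering of the observations, it is defined even though the toy data have no finite second moments, and it is unaffected by the cubic transform of each marginal. A rank-based matrix of this kind is not guaranteed to be positive semidefinite, which a fuller implementation would need to handle.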
