Robust PCA and Collaborative Filtering: Rejecting Outliers, Identifying Manipulators

This video was recorded at NIPS Workshops, Whistler 2010. Principal Component Analysis (PCA) is one of the most widely used techniques for dimensionality reduction. Nevertheless, it is plagued by sensitivity to outliers; finding robust analogs, particularly for high-dimensional data, is critical. We discuss the challenges posed by the high-dimensional setting, where the dimensionality is of the same order as, or greater than, the number of samples. We detail why existing techniques fail (indeed, no known algorithm can provide provable bounds for any constant fraction of outliers) and then present two very different algorithms for High Dimensional Robust PCA. Our first algorithm achieves a breakdown point of 50%, the best possible for any algorithm and a stark improvement over the previous best-known result of 0%. Our second algorithm is based on ideas from convex optimization and, in addition to recovering the principal components, is also able to identify the corrupted points. We extend this to the partially observed setting, significantly extending matrix completion results to the setting of corrupted rows or columns.
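The sensitivity to outliers mentioned above can be illustrated with a small sketch. It is not either of the algorithms from the talk; it only shows, on synthetic 2-D data chosen for illustration, how a single corrupted point can redirect the leading principal component of ordinary PCA. The helper name `top_component`, the sample size, and the outlier location are all assumptions made for this example.

```python
import numpy as np

def top_component(X):
    """Return the leading principal direction (unit vector) of the rows of X."""
    Xc = X - X.mean(axis=0)  # center the data
    # The right singular vectors of the centered data are the principal directions.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[0]

rng = np.random.default_rng(0)
n = 200

# Synthetic "clean" data (assumption): most variance lies along the x axis.
clean = np.column_stack([rng.normal(0.0, 3.0, n), rng.normal(0.0, 0.3, n)])

# One hypothetical corrupted sample placed far away along the y axis.
outlier = np.array([[0.0, 500.0]])
corrupted = np.vstack([clean, outlier])

pc_clean = top_component(clean)
pc_corrupted = top_component(corrupted)

print("leading direction, clean data:    ", np.round(pc_clean, 3))
print("leading direction, with 1 outlier:", np.round(pc_corrupted, 3))
```

On the clean data the leading direction is close to the x axis; after adding one point out of 201 it is close to the y axis. This matches the sense in which standard PCA has a breakdown point of 0%: an arbitrarily small fraction of corrupted samples can change the output arbitrarily.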
