A Scalable Framework for Discovering Coherent Co-clusters in Noisy Data

This video was recorded at the 26th International Conference on Machine Learning (ICML), Montreal 2009. Clustering problems often involve datasets where only a part of the data is relevant to the problem, e.g., in microarray data analysis only a subset of the genes show cohesive expressions within a subset of the conditions/features. The existence of a large number of non-informative data points and features makes it challenging to hunt for coherent and meaningful clusters in such datasets. Additionally, since clusters could exist in different subspaces of the feature space, a co-clustering algorithm that simultaneously clusters objects and features is often more suitable than one that is restricted to traditional "one-sided" clustering. We propose Robust Overlapping Co-Clustering (ROCC), a scalable and very versatile framework that addresses the problem of efficiently mining dense, arbitrarily positioned, possibly overlapping co-clusters from large, noisy datasets. ROCC has several desirable properties that make it extremely well suited to a number of real-life applications.
