Material Detail

Large-Scale Euclidean MST and Hierarchical Clustering

Large-Scale Euclidean MST and Hierarchical Clustering

This video was recorded at NIPS Workshop on Efficient Machine Learning, Whistler 2007. We present new fast algorithms for performing the single-linkage hierarchical clustering method, a classical data mining method used heavily in bioinformatics and astronomy, given similarities which are metrics. We present experimental results that demonstrate significant speedup over previous algorithms on both synthetic and real data, including a dataset of 3 million astronomical observations and a dataset of protein folding trajectories. Additionally, our algorithms use considerably less storage than previous methods. More generally, our algorithm appears to be the fastest practical solution to the well-known Euclidean Minimum Spanning Tree problem.


  • User Rating
  • Comments
  • Learning Exercises
  • Bookmark Collections
  • Course ePortfolios
  • Accessibility Info

More about this material


Log in to participate in the discussions or sign up if you are not already a MERLOT member.