Material Detail

A note of caution regarding distances on graphs

A note of caution regarding distances on graphs

This video was recorded at 27th International Conference on Machine Learning (ICML), Haifa 2010. Non-geometric data is often represented in form of a graph where edges represent similarity or local relationships between instances. One elegant way to exploit the global structure of the graph is implemented by the commute distance (also known as resistance distance). Supposedly it has the property that vertices from the same cluster are "close" to each other whereas vertices from different clusters are "far" from each other. We study the behavior of the commute distance as the size of the underlying graph increases. We prove that the commute distance converges to an expression that does not take into account the structure of the graph and that is completely meaningless as a distance function on the graph. Consequently, the use of the raw commute distance for machine learning purposes is strongly discouraged for large graphs and in high dimensions. We suspect that a similar behavior holds for several other distances on graphs.


  • User Rating
  • Comments
  • Learning Exercises
  • Bookmark Collections
  • Course ePortfolios
  • Accessibility Info

More about this material


Log in to participate in the discussions or sign up if you are not already a MERLOT member.