Fuzzy Clustering of Documents

This video was recorded at the Slovenian KDD Conference on Data Mining and Data Warehouses (SiKDD), Ljubljana, 2008. This paper presents a short overview of fuzzy clustering methods and states the desired properties of an optimal fuzzy document clustering algorithm. Based on these criteria we chose one of the most prominent fuzzy clustering methods, c-means, or more precisely probabilistic c-means. This algorithm is presented in more detail, along with some empirical results from clustering 2-dimensional points and documents. For document clustering we implemented fuzzy c-means in the TextGarden environment. We describe a few difficulties with the implementation and their possible solutions. In conclusion, we propose further work needed to fully exploit the power of fuzzy document clustering in TextGarden.
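To make the probabilistic c-means idea concrete, here is a minimal sketch of the textbook update rules applied to 2-dimensional points. It is not the TextGarden implementation: the function name fuzzy_c_means, its parameters (fuzzifier m, iteration limit, tolerance, seed) and the use of Euclidean distance are all assumptions made for illustration. Each point receives a membership in every cluster, and the memberships of a point sum to 1.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Illustrative probabilistic fuzzy c-means (hypothetical helper).

    points: list of equal-length tuples; c: number of clusters;
    m > 1: fuzzifier. Returns (centers, u), where u[j][i] is the
    membership of point j in cluster i and each row of u sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    exponent = 2.0 / (m - 1.0)
    for _ in range(iters):
        # Centre update: mean of all points weighted by u^m.
        centers = []
        for i in range(c):
            w = [u[j][i] ** m for j in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[j] * points[j][d] for j in range(n)) / total
                for d in range(dim)))

        # Membership update: u_ji is proportional to d_ji^(-2/(m-1)).
        shift = 0.0
        for j in range(n):
            dist = [math.dist(points[j], ctr) for ctr in centers]
            if min(dist) == 0.0:
                # Point sits exactly on a centre: share membership
                # among the zero-distance centres only.
                zeros = [i for i, dv in enumerate(dist) if dv == 0.0]
                new = [1.0 / len(zeros) if i in zeros else 0.0
                       for i in range(c)]
            else:
                inv = [dv ** -exponent for dv in dist]
                total = sum(inv)
                new = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[j])))
            u[j] = new

        # Stop once no membership changed by more than tol.
        if shift < tol:
            break
    return centers, u
```

Calling fuzzy_c_means(points, 2) on a list of 2-dimensional tuples returns the cluster centres and one membership row per point. For documents, the points would typically be replaced by term-weight vectors and the distance by a measure suited to text, such as cosine distance; that substitution is not shown here.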
