Clustering Distributed Sensor Data Streams

This video was recorded at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), Antwerp 2008. In this work we study the problem of continuously maintaining a cluster structure over the data points generated by a sensor network. We propose DGClust, a new distributed algorithm that reduces both the dimensionality and the communication burden by allowing each local sensor to keep an online discretization of its data stream. Each new data point triggers a cell in this univariate grid, reflecting the current state of the data stream at the local site. Whenever a local site changes its state, it notifies the central server of its new state. The central site keeps a small list of counters of the most frequent global states. A simple adaptive partitional clustering algorithm is applied to the central points of the frequent states, providing an anytime definition of the cluster centers. The approach is evaluated in the context of distributed sensor networks, presenting empirical and theoretical evidence of its advantages.
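
The data flow described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names `LocalSite` and `CentralSite`, the fixed equal-width grid, the per-step `tick` call, and the naive least-frequent eviction are all assumptions made for illustration, and the final partitional clustering step is omitted.

```python
from collections import Counter


class LocalSite:
    """Hypothetical local sensor holding a fixed, equal-width univariate grid."""

    def __init__(self, low, high, n_cells):
        self.low, self.high, self.n_cells = low, high, n_cells
        self.width = (high - low) / n_cells
        self.state = None  # index of the cell triggered by the latest reading

    def cell_center(self, cell):
        return self.low + (cell + 0.5) * self.width

    def observe(self, value):
        """Return the new cell index if the local state changed, else None."""
        cell = int((value - self.low) / self.width)
        cell = min(max(cell, 0), self.n_cells - 1)  # clamp out-of-range readings
        if cell != self.state:
            self.state = cell
            return cell
        return None


class CentralSite:
    """Hypothetical server counting the most frequent global states."""

    def __init__(self, n_sites, max_states):
        self.current = [None] * n_sites  # last reported cell of each site
        self.counts = Counter()          # global state -> occurrence count
        self.max_states = max_states
        self.messages = 0                # notifications received so far

    def notify(self, site_id, cell):
        self.current[site_id] = cell
        self.messages += 1

    def tick(self):
        """Count the current global state once per time step."""
        if None in self.current:
            return  # wait until every site has reported at least once
        self.counts[tuple(self.current)] += 1
        if len(self.counts) > self.max_states:
            # Naive stand-in for a bounded frequent-items summary (assumption).
            del self.counts[min(self.counts, key=self.counts.get)]


# Two sensors, four time steps; only state changes are transmitted.
sites = [LocalSite(0.0, 10.0, 5), LocalSite(0.0, 10.0, 5)]
server = CentralSite(n_sites=2, max_states=10)
readings = [(1.0, 9.0), (1.5, 9.5), (1.2, 8.7), (7.0, 3.0)]
for row in readings:
    for site_id, value in enumerate(row):
        cell = sites[site_id].observe(value)
        if cell is not None:
            server.notify(site_id, cell)
    server.tick()

top_state, top_count = server.counts.most_common(1)[0]
top_center = tuple(sites[i].cell_center(c) for i, c in enumerate(top_state))
print(server.messages, top_state, top_count, top_center)
```

In this toy run the eight raw readings produce only four notifications, because the second and third readings fall into the same cells as the first. The central points of the frequent states (here `top_center`) are what a partitional clustering algorithm would then take as input.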
