Distributed Data Mining

This video was recorded at the Advanced Course on AI (ACAI), Ljubljana 2005. Data mining is the automated analysis of large volumes of data in search of relationships and knowledge implicit in that data. Data mining and knowledge discovery on large data sets can benefit from parallel and distributed computing environments, which improve both performance and the quality of data selection. The goal of this tutorial is to give researchers and practitioners an introduction to mining large data sets with techniques from high-performance parallel and distributed computing.

The tutorial is organized in two parts. The first part introduces high-performance parallel and distributed computing and analyzes the different forms of parallelism that data mining techniques and algorithms can exploit. The second part reviews distributed data mining approaches: for each data mining technique, different approaches to parallel implementation are presented and discussed, along with parallel and distributed data mining systems and algorithms. Finally, current research issues and perspectives in high-performance data mining are outlined.
