Material Detail

CASS: A distributed network clustering algorithm based on structure similarity for large-scale network

CASS: A distributed network clustering algorithm based on structure similarity for large-scale network

As the size of networks increases, it is becoming important to analyze large-scale network data. A network clustering algorithm is useful for analysis of network data. Conventional network clustering algorithms in a single machine environment rather than a parallel machine environment are actively being researched. However, these algorithms cannot analyze large-scale network data because of memory size issues. As a solution, we propose a network clustering algorithm for large-scale network data analysis using Apache Spark by changing the paradigm of the conventional clustering algorithm to improve its efficiency in the Apache Spark environment. We also apply optimization approaches such as Bloom filter and shuffle selection to reduce memory usage and execution time. By evaluating our... Show More
Rate

Quality

  • User Rating
  • Comments
  • Learning Exercises
  • Bookmark Collections
  • Course ePortfolios
  • Accessibility Info

More about this material

Browse...

Disciplines with similar materials as CASS: A distributed network clustering algorithm based on structure similarity for large-scale network

Comments

Log in to participate in the discussions or sign up if you are not already a MERLOT member.