Material Detail

TopicFlow Model: Unsupervised Learning of Topic-specific Influences of Hyperlinked Documents

TopicFlow Model: Unsupervised Learning of Topic-specific Influences of Hyperlinked Documents

This video was recorded at 14th International Conference on Artificial Intelligence and Statistics (AISTATS), Ft. Lauderdale 2011. Popular algorithms for modeling the influence of entities in networked data, such as PageRank, work by analyzing the hyperlink structure, but ignore the contents of documents. However, often times, influence is topic dependent, e.g., a web page of high influence in politics may be an unknown entity in sports. We design a new model called TopicFlow, which combines ideas from network flow and topic modeling, to learn this notion of topic specific influences of hyperlinked documents in a completely unsupervised fashion. On the task of citation recommendation, which is an instance of capturing influence, the TopicFlow model, when combined with TF-IDF based cosine similarity, outperforms several competitive baselines by as much as 11.8%. Our empirical study of the model's output on ACL corpus demonstrates its ability to identify topically influential documents. The Topic- Flow model is also competitive with the state-of-theart Relational Topic Models in predicting the likelihood of unseen text on two different data sets. Due to its ability to learn topic-specific flows across each hyperlink, the TopicFlow model can be a powerful visualization tool to track the diffusion of topics across a citation network.

Quality

  • User Rating
  • Comments
  • Learning Exercises
  • Bookmark Collections
  • Course ePortfolios
  • Accessibility Info

More about this material

Comments

Log in to participate in the discussions or sign up if you are not already a MERLOT member.