Material Detail

Identifying Temporal Patterns and Key Players in Document Collections

This video was recorded at Machine Learning Summer School (MLSS), Taipei 2006. We consider the problem of analyzing the development of a document collection over time without requiring meaningful citation data. Given a collection of timestamped documents, we formulate and explore the following two questions. First, what are the main topics and how do these topics develop over time? Second, to gain insight into the dynamics driving this development, what are the documents and who are the authors that are most influential in this process? Unlike prior work in citation analysis, we propose methods addressing these questions without requiring the availability of citation data. The methods use only the text of the documents as input. Consequentially, they are applicable to a much wider range of document collections (email, blogs, etc.), most of which lack meaningful citation data. We evaluate our methods on the proceedings of the Neural Information Processing Systems (NIPS) conference. Even with the preliminary methods that we implemented, the results show that the methods are effective and that addressing the questions based on the text alone is feasible. In fact, the text-based methods sometimes even identify influential papers that are missed by citation analysis.

Keywords:: videolectures, ocwc, oec

Disciplines:

Science and Technology / Computer Science / Programming & Programming Languages

Go to Material

Bookmark / Add to Course ePortfolio

Create a Learning Exercise

Add Accessibility Information

Rate

Add a Comment

Quality

User Rating
Comments
Learning Exercises
Bookmark Collections
Course ePortfolios
Accessibility Info

Report Broken Link
Report as Inappropriate

More about this material

Material Type:: Presentation
Date Added to MERLOT:: February 10, 2015
Date Modified in MERLOT:: February 10, 2015
Author:: Benyah Shaparenko, Cornell University
Submitter:: The Open Education Consortium
Primary Audience:: College General Ed, College Lower Division, College Upper Division
Technical Format:: Video

Mobile Compatibility:: Not specified at this time
Language:: English
Cost Involved:: No
Source Code Available:: No
Creative Commons:: This work is licensed under a Attribution-NonCommercial-NoDerivs 3.0 United States