Topic Modeling

Introduction and Background

As our collective knowledge continues to be digitized and stored in online documents, we need new tools for organizing and annotating it. Topic modeling uses a suite of machine learning algorithms that analyze texts to provide new ways of navigating digitized information. With topic models, we can search and explore a collection of documents based on the themes that run through it. We can "zoom in" and "zoom out" to find specific or broader themes; we can look at how those themes have changed over time; and we can see how themes are connected to each other. Topic models enable us to organize and summarize electronic archives at a scale that would be impossible with human annotation.
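To make the idea of themes running through a collection concrete, here is a minimal sketch of one common topic model, latent Dirichlet allocation, fit with collapsed Gibbs sampling. It is an illustration under stated assumptions and not the implementation behind this project: the function names `fit_lda` and `top_words`, the toy corpus, and the hyperparameter values are all hypothetical.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Toy LDA via collapsed Gibbs sampling (illustrative, not optimized).

    docs is a list of token lists. Returns (doc_topic, topic_word, vocab),
    where doc_topic[d][k] and topic_word[k][w] are token assignment counts.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Start by giving every token a random topic.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove this token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized probability of each topic for this token,
                # given every other token's current assignment.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word, vocab


def top_words(topic_word, n=3):
    """Return up to n most frequently assigned words for each topic."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]


# Hypothetical four-document corpus, already tokenized.
docs = [
    "election vote party vote".split(),
    "party election campaign vote".split(),
    "court law judge law".split(),
    "judge court ruling law".split(),
]
doc_topic, topic_word, vocab = fit_lda(docs, num_topics=2)
themes = top_words(topic_word)
```

Each row of `doc_topic` shows how strongly a document draws on each topic, and `themes` lists the words that characterize each topic. On a real archive one would more likely reach for an established library and a much larger vocabulary; the sampler here only shows the shape of the computation.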

This work has been developed in collaboration with David Blei, an assistant professor in the Computer Science department at Princeton University. His research interests include:

  • Probabilistic graphical models and approximate posterior inference
  • Topic models, information retrieval, and text processing
  • Nonparametric Bayesian statistics

 

Description

This is a small example of the relationships created via topic modeling for the contents of the JSTOR Political Science discipline. You can explore the topics and relationships via a simple search interface or word index.

 

Contact Information

David Blei
Email:

Websites

http://showcase.jstor.org/blei/

http://www.cs.princeton.edu/~blei