Portfolio

Efficiently tuning topic models

In natural language processing, topic models are used to extract meaningful, human-interpretable topics from a corpus. However, tuning topic models on large corpora can be time-consuming and computationally expensive. By monitoring topic coherence as a function of corpus size, we can determine how much of the corpus is needed to build a high-quality topic model efficiently. In this project, we demonstrate this technique using the English Wikipedia corpus.
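To make the idea concrete, here is a minimal sketch of tracking coherence against corpus size. It is not the project's actual pipeline. It assumes UMass coherence (averaged over word pairs instead of summed) as the coherence measure, scores each model on the same subset it was trained on, and uses a simple "improvement below `tol`" rule to detect a plateau. The names `umass_coherence`, `coherence_curve`, `smallest_sufficient_size`, `train_fn`, and `tol` are all hypothetical. In practice `train_fn` would wrap a real topic model such as LDA; the `toy_train_fn` below is only a stand-in so the sketch runs on its own.

```python
import math
from collections import Counter
from itertools import combinations


def umass_coherence(topic_words, docs):
    """Mean UMass score over word pairs: log((D(w_hi, w_lo) + 1) / D(w_hi)).

    topic_words is ordered from highest- to lowest-ranked word;
    docs is a list of tokenized documents (lists of strings).
    """
    doc_sets = [set(d) for d in docs]

    def doc_freq(*words):
        # Number of documents containing all of the given words.
        return sum(1 for s in doc_sets if all(w in s for w in words))

    scores = []
    for w_hi, w_lo in combinations(topic_words, 2):
        d_hi = doc_freq(w_hi)
        if d_hi == 0:
            continue  # skip pairs whose conditioning word never occurs
        scores.append(math.log((doc_freq(w_hi, w_lo) + 1) / d_hi))
    return sum(scores) / len(scores) if scores else float("nan")


def coherence_curve(docs, sizes, train_fn):
    """Train on growing prefixes of docs and record mean topic coherence."""
    curve = []
    for n in sizes:
        subset = docs[:n]
        topics = train_fn(subset)  # expected: list of ranked word lists
        per_topic = [umass_coherence(t, subset) for t in topics]
        curve.append((n, sum(per_topic) / len(per_topic)))
    return curve


def smallest_sufficient_size(curve, tol=0.01):
    """Return the first size after which coherence gains drop below tol."""
    for (n_prev, c_prev), (_, c_next) in zip(curve, curve[1:]):
        if c_next - c_prev < tol:
            return n_prev
    return curve[-1][0]  # no plateau observed within the tested sizes


def toy_train_fn(docs, top_n=3):
    """Stand-in for a real topic model: one 'topic' of the most common words."""
    counts = Counter(w for d in docs for w in set(d))
    return [[w for w, _ in counts.most_common(top_n)]]


if __name__ == "__main__":
    corpus = [["wiki", "article", "history"], ["wiki", "article", "science"],
              ["wiki", "history", "war"], ["article", "science", "physics"]] * 5
    curve = coherence_curve(corpus, sizes=[4, 8, 12, 16, 20], train_fn=toy_train_fn)
    print(curve)
    print(smallest_sufficient_size(curve))
```

Passing the trainer in as `train_fn` keeps the curve logic independent of any particular topic-modeling library, so the same loop could be reused with different models or hyperparameters.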