Scalable and Robust Construction of Topical Hierarchies


Abstract in English

Automated generation of high-quality topical hierarchies for a text collection is a dream problem in knowledge engineering with many valuable applications. In this paper a scalable and robust algorithm is proposed for constructing a hierarchy of topics from a text collection. We divide and conquer the problem using a top-down recursive framework, based on a tensor orthogonal decomposition technique. We solve a critical challenge to perform scalable inference for our newly designed hierarchical topic model. Experiments with various real-world datasets illustrate its ability to generate robust, high-quality hierarchies efficiently. Our method reduces the time of construction by several orders of magnitude, and its robust feature renders it possible for users to interactively revise the hierarchy.

Download