TextLuas: Tracking and Visualizing Document and Term Clusters in Dynamic Text Data


Abstract in English

For large volumes of text data collected over time, a key knowledge discovery task is identifying and tracking clusters. These clusters may correspond to emerging themes, popular topics, or breaking news stories in a corpus. Therefore, recently there has been increased interest in the problem of clustering dynamic data. However, there exists little support for the interactive exploration of the output of these analysis techniques, particularly in cases where researchers wish to simultaneously explore both the change in cluster structure over time and the change in the textual content associated with clusters. In this paper, we propose a model for tracking dynamic clusters characterized by the evolutionary events of each cluster. Motivated by this model, the TextLuas system provides an implementation for tracking these dynamic clusters and visualizing their evolution using a metro map metaphor. To provide overviews of cluster content, we adapt the tag cloud representation to the dynamic clustering scenario. We demonstrate the TextLuas system on two different text corpora, where they are shown to elucidate the evolution of key themes. We also describe how TextLuas was applied to a problem in bibliographic network research.

Download