
We introduce new applications for Dynamic Factor Graphs (DFGs): topic modeling, text classification, and information retrieval. DFGs are tailored here to sequences of time-stamped documents. Based on the auto-encoder architecture, our nonlinear multi-layer model is trained stage-wise to produce increasingly compact representations of bags-of-words at the document or paragraph level, thus performing a semantic analysis. It also incorporates simple temporal dynamics on the latent representations, to take advantage of the inherent (hierarchical) structure of sequences of documents, and can simultaneously perform supervised classification or regression on document labels, which makes our approach unique. The model is learned by maximizing the joint likelihood of the encoding, decoding, dynamical, and supervised modules, using approximate, gradient-based maximum-a-posteriori inference. We demonstrate that by minimizing a weighted cross-entropy loss between histograms of word occurrences and their reconstructions, we directly minimize the topic model perplexity, and show that our topic model obtains lower perplexity than Latent Dirichlet Allocation on the NIPS and State of the Union datasets. We illustrate how the dynamical constraints aid learning while enabling visualization of topic trajectories.
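The stated link between the reconstruction loss and perplexity can be spelled out; the notation below is ours, not taken from the abstract. For a document with word-count histogram $n$ over the vocabulary, $N = \sum_w n_w$ total tokens, and reconstructed word distribution $\hat{p}$,

$$\mathrm{Perplexity} = \exp\!\Big(-\frac{1}{N}\sum_{w} n_w \log \hat{p}_w\Big),$$

so minimizing the count-weighted cross-entropy $-\sum_w n_w \log \hat{p}_w$ minimizes perplexity directly, which is the claim made above.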
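As a rough illustration of how the encoding, decoding, and dynamical modules might fit together, here is a minimal PyTorch sketch. It is not the authors' implementation: all names, sizes, and hyperparameters are placeholders, the MAP inference over latent codes is collapsed into a single feed-forward encoder pass, and the supervised module is omitted.

```python
# Minimal sketch of a one-layer dynamic auto-encoder over bag-of-words
# histograms. Assumed placeholders: VOCAB, LATENT, alpha, the linear
# dynamics module, and the toy data at the bottom.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, LATENT = 2000, 50

encoder = nn.Sequential(nn.Linear(VOCAB, LATENT), nn.Tanh())
decoder = nn.Linear(LATENT, VOCAB)    # logits over the vocabulary
dynamics = nn.Linear(LATENT, LATENT)  # simple linear latent dynamics
params = (list(encoder.parameters()) + list(decoder.parameters())
          + list(dynamics.parameters()))
opt = torch.optim.Adam(params, lr=1e-3)

def loss_fn(counts, alpha=1.0):
    """counts: (T, VOCAB) word-count histograms of T consecutive documents."""
    z = encoder(counts)                        # one latent code per time step
    log_p = F.log_softmax(decoder(z), dim=-1)  # reconstructed word distributions
    # Count-weighted cross-entropy: minimizing it minimizes perplexity.
    recon = -(counts * log_p).sum()
    # Dynamical penalty: each code should be predictable from its predecessor.
    dyn = F.mse_loss(dynamics(z[:-1]), z[1:], reduction='sum')
    return recon + alpha * dyn

# One training step on a toy sequence of 10 documents.
counts = torch.randint(0, 5, (10, VOCAB)).float()
opt.zero_grad()
loss = loss_fn(counts)
loss.backward()
opt.step()
```

In the full model described above, deeper layers would be stacked stage-wise on the latent codes, and a supervised head on the codes would contribute an additional term to the joint likelihood.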
