
Application to Topic Modeling of Time-Stamped Documents and Its Application to Algorithms

Publication date: 2014
Research language: Arabic
Created by Shamra Editor





We introduce new applications of Dynamic Factor Graphs (DFGs): topic modeling, text classification, and information retrieval. DFGs are tailored here to sequences of time-stamped documents. Based on the auto-encoder architecture, our nonlinear multi-layer model is trained stage-wise to produce increasingly compact representations of bags-of-words at the document or paragraph level, thus performing a semantic analysis. It also incorporates simple temporal dynamics on the latent representations, to take advantage of the inherent (hierarchical) structure of sequences of documents, and can simultaneously perform supervised classification or regression on document labels, which makes our approach unique. The model is learned by maximizing the joint likelihood of the encoding, decoding, dynamical, and supervised modules, using approximate, gradient-based maximum-a-posteriori inference. We demonstrate that by minimizing a weighted cross-entropy loss between histograms of word occurrences and their reconstruction, we directly minimize the topic model perplexity, and we show that our topic model obtains lower perplexity than Latent Dirichlet Allocation on the NIPS and State of the Union datasets. We illustrate how the dynamical constraints help learning while making it possible to visualize the topic trajectory.
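The link between the count-weighted cross-entropy and perplexity can be illustrated with a small sketch. This is not the authors' implementation: the toy counts, the `decoder_scores` values, and the function names are assumptions made up for illustration, and a plain softmax stands in for whatever decoder the model actually uses. It only shows that perplexity is the exponential of the total count-weighted cross-entropy divided by the number of word tokens, so lowering one lowers the other.

```python
import math

def softmax(scores):
    """Turn a list of real-valued scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_cross_entropy(counts, probs):
    """Cross-entropy between a word-count histogram and a predicted word
    distribution, weighted by the raw (unnormalised) counts."""
    return -sum(n * math.log(p) for n, p in zip(counts, probs) if n > 0)

def perplexity(docs_counts, docs_probs):
    """exp(total weighted cross-entropy / total number of word tokens)."""
    total_loss = sum(
        weighted_cross_entropy(c, p) for c, p in zip(docs_counts, docs_probs)
    )
    total_tokens = sum(sum(c) for c in docs_counts)
    return math.exp(total_loss / total_tokens)

# Toy corpus (made-up counts): two documents over a 4-word vocabulary.
docs_counts = [[3, 1, 0, 0], [0, 0, 2, 2]]

# Hypothetical decoder outputs (pre-softmax scores), one row per document.
decoder_scores = [[2.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.5, 1.5]]
docs_probs = [softmax(s) for s in decoder_scores]

# Baseline: a model that predicts every word with equal probability.
uniform = [[0.25] * 4 for _ in docs_counts]

print("reconstruction perplexity:", perplexity(docs_counts, docs_probs))
print("uniform perplexity:", perplexity(docs_counts, uniform))
```

On a 4-word vocabulary the uniform baseline has perplexity 4 (the vocabulary size), and any reconstruction that puts more mass on the observed words scores below it.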

References used
Deerwester, S., Dumais, S., Furnas, G., Landauer, T. and Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science 41, 391–407.
Kolenda, T. and Hansen, L. K. (2000). Independent components in text. In Advances in Independent Component Analysis.
Gehler, P., Holub, A. and Welling, M. (2006). The rate adapting Poisson model for information retrieval and object recognition. In ICML.
Salakhutdinov, R. and Hinton, G. (2009). Replicated softmax. In ICML.
Blei, D., Ng, A. and Jordan, M. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993–1022.

Read More

The research mainly aims to study benchmarking as a means of continuous quality improvement and the possibility of using it in Syrian banks, and to identify any obstacles to its application so that suitable solutions can be found.
This paper applies topic modeling to understand maternal health topics, concerns, and questions expressed in online communities on social networking sites. We examine Latent Dirichlet Allocation (LDA) and two state-of-the-art methods, a neural topic model with knowledge distillation (KD) and the Embedded Topic Model (ETM), on maternal health texts collected from Reddit. The models are evaluated on topic quality and topic inference, using both automatic evaluation metrics and human assessment. We analyze a disconnect between automatic metrics and human evaluations. While LDA performs best overall on the automatic metrics NPMI and Coherence, the neural topic model with knowledge distillation is favored in expert evaluation. We also create a new partially expert annotated gold-standard maternal health topic
Given a heterogeneous social network, can we forecast its future? Can we predict who will start using a given hashtag on Twitter? Can we leverage side information, such as who retweets or follows whom, to improve our membership forecasts? We present TENSORCAST, a novel method that forecasts time-evolving networks more accurately than current state-of-the-art methods by incorporating multiple data sources in coupled tensors. TENSORCAST is (a) scalable, being linearithmic in the number of connections; (b) effective, achieving over 20% improved precision on top-1000 forecasts of community members; (c) general, being applicable to data sources with a different structure. We run our method on multiple real-world networks, including DBLP and a Twitter temporal network with over 310 million nonzeros, where we predict the evolution of the use of political hashtags.
Broader disclosive transparency (truth and clarity in communication regarding the function of AI systems) is widely considered desirable. Unfortunately, it is a nebulous concept, difficult to both define and quantify. This is problematic, as previous work has demonstrated possible trade-offs and negative consequences of disclosive transparency, such as a confusion effect, where "too much information" clouds a reader's understanding of what a system description means. Disclosive transparency's subjective nature has made deep study of these problems and their remedies difficult. To improve this state of affairs, we introduce neural language model-based probabilistic metrics to directly model disclosive transparency, and demonstrate that they correlate with user and expert opinions of system transparency, making them a valid objective proxy. Finally, we demonstrate the use of these metrics in a pilot study quantifying the relationships between transparency, confusion, and user perceptions in a corpus of real NLP system descriptions.
Time-offset interaction applications (TOIAs) simulate conversations with people who have previously recorded relevant video utterances, which are played in response to the interacting user. TOIAs have great potential for preserving cross-generational and cross-cultural histories, online teaching, simulated interviews, etc. Current TOIAs exist in niche contexts involving high production costs. Democratizing TOIAs presents several challenges: creating appropriate pre-recordings, designing different user stories, and creating simple online interfaces for experimentation. We open-source TOIA 2.0, a user-centered time-offset interaction application, and make it available for everyone who wants to interact with people's pre-recordings or create their own.