Thinking Globally, Acting Locally: Distantly Supervised Global-to-Local Knowledge Selection for Background Based Conversation

100 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Pengjie Ren

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Pengjie Ren - Zhumin Chen - Christof Monz

الحساب واللغة الذكاء الاصطناعي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Background Based Conversations (BBCs) have been introduced to help conversational systems avoid generating overly generic responses. In a BBC, the conversation is grounded in a knowledge source. A key challenge in BBCs is Knowledge Selection (KS): given a conversational context, try to find the appropriate background knowledge (a text fragment containing related facts or comments, etc.) based on which to generate the next response. Previous work addresses KS by employing attention and/or pointer mechanisms. These mechanisms use a local perspective, i.e., they select a token at a time based solely on the current decoding state. We argue for the adoption of a global perspective, i.e., pre-selecting some text fragments from the background knowledge that could help determine the topic of the next response. We enhance KS in BBCs by introducing a Global-to-Local Knowledge Selection (GLKS) mechanism. Given a conversational context and background knowledge, we first learn a topic transition vector to encode the most likely text fragments to be used in the next response, which is then used to guide the local KS at each decoding timestamp. In order to effectively learn the topic transition vector, we propose a distantly supervised learning schema. Experimental results show that the GLKS model significantly outperforms state-of-the-art methods in terms of both automatic and human evaluation. More importantly, GLKS achieves this without requiring any extra annotations, which demonstrates its high degree of scalability.

قيم البحث

231 - B. L. Higgins , A. C. Doherty , S. D. Bartlett 2010

We theoretically investigate schemes to discriminate between two nonorthogonal quantum states given multiple copies. We consider a number of state discrimination schemes as applied to nonorthogonal, mixed states of a qubit. In particular, we examine the difference that local and global optimization of local measurements makes to the probability of obtaining an erroneous result, in the regime of finite numbers of copies $N$, and in the asymptotic limit as $N rightarrow infty$. Five schemes are considered: optimal collective measurements over all copies, locally optimal local measurements in a fixed single-qubit measurement basis, globally optimal fixed local measurements, locally optimal adaptive local measurements, and globally optimal adaptive local measurements. Here, adaptive measurements are those for which the measurement basis can depend on prior measurement results. For each of these measurement schemes we determine the probability of error (for finite $N$) and scaling of this error in the asymptotic limit. In the asymptotic limit, adaptive schemes have no advantage over the optimal fixed local scheme, and except for states with less than 2% mixture, the most naive scheme (locally optimal fixed local measurements) is as good as any noncollective scheme. For finite $N$, however, the most sophisticated local scheme (globally optimal adaptive local measurements) is better than any other noncollective scheme, for any degree of mixture.

فيزياء الكم

Working Locally Thinking Globally: Theoretical Guarantees for Convolutional Sparse Coding

118 - Vardan Papyan , Jeremias Sulam , Michael Elad 2017

The celebrated sparse representation model has led to remarkable results in various signal processing tasks in the last decade. However, despite its initial purpose of serving as a global prior for entire signals, it has been commonly used for modeli ng low dimensional patches due to the computational constraints it entails when deployed with learned dictionaries. A way around this problem has been recently proposed, adopting a convolutional sparse representation model. This approach assumes that the global dictionary is a concatenation of banded Circulant matrices. While several works have presented algorithmic solutions to the global pursuit problem under this new model, very few truly-effective guarantees are known for the success of such methods. In this work, we address the theoretical aspects of the convolutional sparse model providing the first meaningful answers to questions of uniqueness of solutions and success of pursuit algorithms, both greedy and convex relaxations, in ideal and noisy regimes. To this end, we generalize mathematical quantities, such as the $ell_0$ norm, mutual coherence, Spark and RIP to their counterparts in the convolutional setting, intrinsically capturing local measures of the global model. On the algorithmic side, we demonstrate how to solve the global pursuit problem by using simple local processing, thus offering a first of its kind bridge between global modeling of signals and their patch-based local treatment.

نظرية المعلومات نظرية المعلومات

DESCGEN: A Distantly Supervised Dataset for Generating Abstractive Entity Descriptions

114 - Weijia Shi , Mandar Joshi , Luke Zettlemoyer 2021

Short textual descriptions of entities provide summaries of their key attributes and have been shown to be useful sources of background knowledge for tasks such as entity linking and question answering. However, generating entity descriptions, especi ally for new and long-tail entities, can be challenging since relevant information is often scattered across multiple sources with varied content and style. We introduce DESCGEN: given mentions spread over multiple documents, the goal is to generate an entity summary description. DESCGEN consists of 37K entity descriptions from Wikipedia and Fandom, each paired with nine evidence documents on average. The documents were collected using a combination of entity linking and hyperlinks to the Wikipedia and Fandom entity pages, which together provide high-quality distant supervision. The resulting summaries are more abstractive than those found in existing datasets and provide a better proxy for the challenge of describing new and emerging entities. We also propose a two-stage extract-then-generate baseline and show that there exists a large gap (19.9% in ROUGE-L) between state-of-the-art models and human performance, suggesting that the data will support significant future work.

الحساب واللغة الذكاء الاصطناعي

Populating Web Scale Knowledge Graphs using Distantly Supervised Relation Extraction and Validation

102 - Sarthak Dash , Michael R. Glass , Alfio Gliozzo 2019

In this paper, we propose a fully automated system to extend knowledge graphs using external information from web-scale corpora. The designed system leverages a deep learning based technology for relation extraction that can be trained by a distantly supervised approach. In addition to that, the system uses a deep learning approach for knowledge base completion by utilizing the global structure information of the induced KG to further refine the confidence of the newly discovered relations. The designed system does not require any effort for adaptation to new languages and domains as it does not use any hand-labeled data, NLP analytics and inference rules. Our experiments, performed on a popular academic benchmark demonstrate that the suggested system boosts the performance of relation extraction by a wide margin, reporting error reductions of 50%, resulting in relative improvement of up to 100%. Also, a web-scale experiment conducted to extend DBPedia with knowledge from Common Crawl shows that our system is not only scalable but also does not require any adaptation cost, while yielding substantial accuracy gain.

الحساب واللغة

Working Locally Thinking Globally - Part I: Theoretical Guarantees for Convolutional Sparse Coding

261 - Vardan Papyan , Jeremias Sulam , Michael Elad 2016

The celebrated sparse representation model has led to remarkable results in various signal processing tasks in the last decade. However, despite its initial purpose of serving as a global prior for entire signals, it has been commonly used for modeli ng low dimensional patches due to the computational constraints it entails when deployed with learned dictionaries. A way around this problem has been proposed recently, adopting a convolutional sparse representation model. This approach assumes that the global dictionary is a concatenation of banded Circulant matrices. Although several works have presented algorithmic solutions to the global pursuit problem under this new model, very few truly-effective guarantees are known for the success of such methods. In the first of this two-part work, we address the theoretical aspects of the sparse convolutional model, providing the first meaningful answers to corresponding questions of uniqueness of solutions and success of pursuit algorithms. To this end, we generalize mathematical quantities, such as the $ell_0$ norm, the mutual coherence and the Spark, to their counterparts in the convolutional setting, which intrinsically capture local measures of the global model. In a companion paper, we extend the analysis to a noisy regime, addressing the stability of the sparsest solutions and pursuit algorithms, and demonstrate practical approaches for solving the global pursuit problem via simple local processing.

نظرية المعلومات نظرية المعلومات