
Negative Statements Considered Useful

Posted by Hiba Arnaout
Publication date: 2020
Research field: Informatics Engineering
Paper language: English





Knowledge bases (KBs) about notable entities and their properties are an important asset in applications such as search, question answering and dialogue. All popular KBs capture virtually only positive statements, and abstain from taking any stance on statements not stored in the KB. This paper makes the case for explicitly stating salient statements that do not hold. Negative statements are useful to overcome limitations of question answering systems that are mainly geared towards positive questions; they can also contribute to informative summaries of entities. Due to the abundance of such invalid statements, any effort to compile them needs to address ranking by saliency. We present a statistical inference method for compiling and ranking negative statements, based on expectations from positive statements of related entities in peer groups. Experimental results, with a variety of datasets, show that the method can effectively discover notable negative statements, and extrinsic studies underline their usefulness for entity summarization. Datasets and code are released as resources for further research.
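The peer-based inference described in the abstract can be pictured with a minimal sketch. The function name, the frequency-based saliency score and the toy data below are illustrative assumptions, not the paper's exact implementation: statements held by many peers but absent for the target entity become candidate negations, ranked by how common they are among peers.

from collections import Counter

def candidate_negations(target_stmts, peer_stmts_list, k=10):
    # Count how often each (predicate, object) statement holds among peers.
    counts = Counter()
    for peer_stmts in peer_stmts_list:
        counts.update(peer_stmts)
    # A statement common among peers but absent for the target entity is a
    # candidate negative statement; peer frequency is a simple saliency proxy.
    scored = {s: c / len(peer_stmts_list)
              for s, c in counts.items() if s not in target_stmts}
    return sorted(scored.items(), key=lambda x: -x[1])[:k]

# Toy usage with hand-picked peer physicists (illustrative data only):
target = {("member_of", "Royal Society")}
peers = [
    {("award", "Nobel Prize in Physics"), ("member_of", "Royal Society")},
    {("award", "Nobel Prize in Physics")},
    {("award", "Nobel Prize in Physics"), ("award", "Copley Medal")},
]
print(candidate_negations(target, peers))
# Top candidate: ("award", "Nobel Prize in Physics") with peer frequency 1.0.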




Read also

131 - Sebastiano Vigna 2007
This note argues about the validity of web-graph data used in the literature.
Triplet loss is an extremely common approach to distance metric learning. Representations of images from the same class are optimized to be mapped closer together in an embedding space than representations of images from different classes. Much work on triplet losses focuses on selecting the most useful triplets of images to consider, with strategies that select dissimilar examples from the same class or similar examples from different classes. The consensus of previous research is that optimizing with the hardest negative examples leads to bad training behavior. That's a problem: these hardest negatives are literally the cases where the distance metric fails to capture semantic similarity. In this paper, we characterize the space of triplets and derive why hard negatives make triplet loss training fail. We offer a simple fix to the loss function and show that, with this fix, optimizing with hard negative examples becomes feasible. This leads to more generalizable features, and image retrieval results that outperform the state of the art for datasets with high intra-class variance.
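A minimal NumPy sketch of the batch-hard triplet loss discussed above, to make the "hardest negative" notion concrete. The margin value and function name are illustrative; this is the standard formulation, not the paper's proposed fix.

import numpy as np

def batch_hard_triplet_loss(embeddings, labels, margin=0.2):
    # Pairwise squared Euclidean distances between all embeddings in the batch.
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = (diff ** 2).sum(-1)
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    n = len(labels)
    losses = []
    for i in range(n):
        pos = dist[i][same[i] & (np.arange(n) != i)]   # same class, not itself
        neg = dist[i][~same[i]]                        # different class
        if pos.size == 0 or neg.size == 0:
            continue
        # Hardest positive = farthest same-class example,
        # hardest negative = closest different-class example.
        losses.append(max(0.0, pos.max() - neg.min() + margin))
    return float(np.mean(losses)) if losses else 0.0

# Toy usage: four 2-d embeddings from two classes.
emb = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.1]])
print(batch_hard_triplet_loss(emb, [0, 0, 1, 1]))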
Fine-grained Named Entity Recognition is a task whereby we detect and classify entity mentions into a large set of types. These types can span diverse domains such as finance, healthcare, and politics. We observe that when the type set spans several domains, the accuracy of entity detection becomes a limitation for supervised learning models. The primary reason is the lack of datasets where entity boundaries are properly annotated while covering a large spectrum of entity types. Furthermore, many named entity systems struggle with the categorization of fine-grained entity types. Our work attempts to address these issues, in part, by combining state-of-the-art deep learning models (ELMo) with an expansive knowledge base (Wikidata). Using our framework, we cross-validate our model on the 112 fine-grained entity types based on the hierarchy given by the Wiki(gold) dataset.
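A small sketch of the knowledge-base side of such a pipeline: looking up fine-grained types for a detected mention on Wikidata via its public SPARQL endpoint. This is a hypothetical helper for illustration only; the ELMo-based mention detection and the paper's actual type mapping are not shown.

import requests

WIKIDATA_SPARQL = "https://query.wikidata.org/sparql"

def fine_grained_types(entity_label):
    # Retrieve "instance of" (P31) types for an English entity label.
    query = """
    SELECT ?typeLabel WHERE {
      ?e rdfs:label "%s"@en ;
         wdt:P31 ?type .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    } LIMIT 10
    """ % entity_label
    r = requests.get(WIKIDATA_SPARQL,
                     params={"query": query, "format": "json"},
                     headers={"User-Agent": "type-lookup-sketch"})
    return [b["typeLabel"]["value"] for b in r.json()["results"]["bindings"]]

print(fine_grained_types("Angela Merkel"))  # e.g. ['human']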
59 - V. Lorini, Ispra 2019
This paper describes a prototype system that integrates social media analysis into the European Flood Awareness System (EFAS). This integration allows the collection of social media data to be automatically triggered by flood risk warnings determined by a hydro-meteorological model. Then, we adopt a multi-lingual approach to find flood-related messages by employing two state-of-the-art methodologies: language-agnostic word embeddings and language-aligned word embeddings. Both approaches can be used to bootstrap a classifier of social media messages for a new language with little or no labeled data. Finally, we describe a method for selecting relevant and representative messages and displaying them back in the interface of EFAS.
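The bootstrapping idea behind the multi-lingual classifier can be sketched as follows, assuming cross-lingually aligned word vectors (e.g. aligned fastText embeddings) are available as a dictionary. The tiny 3-dimensional toy vectors and names below are purely illustrative, not the EFAS prototype's code.

import numpy as np
from sklearn.linear_model import LogisticRegression

def message_vector(tokens, aligned_vectors, dim=3):
    # Average the aligned word vectors of a message; all languages share
    # the same embedding space, so one classifier can serve them all.
    vecs = [aligned_vectors[t] for t in tokens if t in aligned_vectors]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def bootstrap_flood_classifier(messages, labels, aligned_vectors):
    # Train on labeled messages in a source language (e.g. English).
    X = np.stack([message_vector(m.lower().split(), aligned_vectors)
                  for m in messages])
    return LogisticRegression(max_iter=1000).fit(X, labels)

# Toy aligned vectors; real systems would load pretrained aligned embeddings.
toy_vecs = {"flood": np.array([1.0, 0.0, 0.0]),
            "alluvione": np.array([0.95, 0.05, 0.0]),  # Italian "flood"
            "sunny": np.array([0.0, 1.0, 0.0]),
            "sole": np.array([0.05, 0.95, 0.0])}
clf = bootstrap_flood_classifier(["flood warning", "sunny day"], [1, 0], toy_vecs)
# Score an unlabeled Italian message with the English-trained classifier.
print(clf.predict_proba([message_vector(["alluvione"], toy_vecs)])[0][1])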
Answering natural language questions over tables is usually seen as a semantic parsing task. To alleviate the collection cost of full logical forms, one popular approach focuses on weak supervision consisting of denotations instead of logical forms. However, training semantic parsers from weak supervision poses difficulties, and in addition, the generated logical forms are only used as an intermediate step prior to retrieving the denotation. In this paper, we present TAPAS, an approach to question answering over tables without generating logical forms. TAPAS trains from weak supervision, and predicts the denotation by selecting table cells and optionally applying a corresponding aggregation operator to such selection. TAPAS extends BERTs architecture to encode tables as input, initializes from an effective joint pre-training of text segments and tables crawled from Wikipedia, and is trained end-to-end. We experiment with three different semantic parsing datasets, and find that TAPAS outperforms or rivals semantic parsing models by improving state-of-the-art accuracy on SQA from 55.1 to 67.2 and performing on par with the state-of-the-art on WIKISQL and WIKITQ, but with a simpler model architecture. We additionally find that transfer learning, which is trivial in our setting, from WIKISQL to WIKITQ, yields 48.7 accuracy, 4.2 points above the state-of-the-art.
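The cell-selection-plus-aggregation step can be pictured with a short sketch of how a predicted denotation is assembled from per-cell selection probabilities and a predicted aggregation operator. The scoring model itself is not shown, and the function and threshold below are illustrative assumptions rather than the TAPAS implementation.

def assemble_denotation(cells, cell_probs, agg_op, threshold=0.5):
    # cells: cell values as strings; cell_probs: per-cell selection probability
    # predicted by the model; agg_op: one of "NONE", "COUNT", "SUM", "AVERAGE".
    selected = [c for c, p in zip(cells, cell_probs) if p > threshold]
    if agg_op == "NONE":
        return selected                      # the answer is the cells themselves
    if agg_op == "COUNT":
        return len(selected)
    values = [float(c) for c in selected]    # numeric aggregation
    if agg_op == "SUM":
        return sum(values)
    if agg_op == "AVERAGE":
        return sum(values) / len(values) if values else None
    raise ValueError("unknown aggregation operator: %s" % agg_op)

# Toy usage: "How many medals in total?" over a column of medal counts.
print(assemble_denotation(["3", "5", "2"], [0.9, 0.8, 0.7], "SUM"))  # 10.0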
