
COIL: Revisit Exact Lexical Match in Information Retrieval with Contextualized Inverted List



Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Keywords: revisit exact lexical, revisit exact, exact lexical match


Classical information retrieval systems such as BM25 rely on exact lexical match and can carry out search efficiently with an inverted list index. Recent neural IR models shift towards soft matching of all query-document terms, but they lose the computational efficiency of exact match systems. This paper presents COIL, a contextualized exact match retrieval architecture in which scoring is based on the contextualized representations of overlapping query and document tokens. The new architecture stores contextualized token representations in inverted lists, bringing together the efficiency of exact match and the representation power of deep language models. Our experimental results show that COIL outperforms classical lexical retrievers and state-of-the-art deep LM retrievers with similar or smaller latency.
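The idea in the abstract can be illustrated with a minimal sketch. It assumes that each token occurrence already has a contextualized vector (the two-dimensional vectors below are hand-written toy values, not outputs of a language model), and that a query token's contribution to a document is the largest dot product over that token's occurrences in the document. The function names `build_index` and `score` are hypothetical and the aggregation rule is an assumption of this sketch, not something the abstract specifies.

```python
from collections import defaultdict


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def build_index(docs):
    """Build token -> [(doc_id, vector), ...] inverted lists.

    docs maps a doc_id to a list of (token, vector) pairs,
    one pair per token occurrence in that document.
    """
    index = defaultdict(list)
    for doc_id, tokens in docs.items():
        for token, vec in tokens:
            index[token].append((doc_id, vec))
    return index


def score(query, index):
    """Score documents using only tokens that literally occur in the query.

    Assumption: for each query token, keep the best-matching occurrence
    per document (max dot product), then sum over query tokens.
    """
    scores = defaultdict(float)
    for q_token, q_vec in query:
        best = {}
        # Only the inverted list of the exact query token is visited.
        for doc_id, d_vec in index.get(q_token, []):
            s = dot(q_vec, d_vec)
            if doc_id not in best or s > best[doc_id]:
                best[doc_id] = s
        for doc_id, s in best.items():
            scores[doc_id] += s
    return dict(scores)


# Toy data: "bank" appears in both documents with different context vectors.
docs = {
    "d1": [("river", [0.9, 0.1]), ("bank", [0.8, 0.2])],
    "d2": [("bank", [0.1, 0.9]), ("loan", [0.2, 0.8])],
}
index = build_index(docs)
query = [("bank", [1.0, 0.0])]  # a "bank" whose context resembles d1's usage
scores = score(query, index)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Both documents match the query term exactly, so both are retrieved through the same inverted list, but the context vectors decide which one ranks higher. Documents that do not contain the query token are never touched, which is where the efficiency of exact match comes from.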

References used

https://aclanthology.org/


Revisiting the Uniform Information Density Hypothesis

Association for Computational Linguistics, 2021 (article)

The uniform information density (UID) hypothesis posits a preference among language users for utterances structured such that information is distributed uniformly across a signal. While its implications for language production have been well explored, the hypothesis potentially makes predictions about language comprehension and linguistic acceptability as well. Further, it is unclear how uniformity in a linguistic signal, or lack thereof, should be measured, and over which linguistic unit, e.g., the sentence or language level, this uniformity should hold. Here we investigate these facets of the UID hypothesis using reading time and acceptability data. While our reading time results are generally consistent with previous work, they are also consistent with a weakly super-linear effect of surprisal, which would be compatible with UID's predictions. For acceptability judgments, we find clearer evidence that non-uniformity in information density is predictive of lower acceptability. We then explore multiple operationalizations of UID, motivated by different interpretations of the original hypothesis, and analyze the scope over which the pressure towards uniformity is exerted. The explanatory power of a subset of the proposed operationalizations suggests that the strongest trend may be a regression towards a mean surprisal across the language, rather than the phrase, sentence, or document. This finding supports a typical interpretation of UID, namely that it is the byproduct of language users maximizing the use of a (hypothetical) communication channel.
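To make the question of scope concrete, the sketch below computes per-word surprisal and two candidate non-uniformity measures: variance around the sentence's own mean, and mean squared deviation from a language-wide mean. This is only one plausible way to operationalize the contrast described in the abstract, not necessarily the paper's definitions. The word probabilities and the value of `language_mean` are made-up toy numbers, and the function names are hypothetical.

```python
import math


def surprisals(probs):
    """Surprisal in bits for each word probability: -log2 p."""
    return [-math.log2(p) for p in probs]


def local_variance(s):
    """Non-uniformity measured against the sentence's own mean surprisal."""
    mean = sum(s) / len(s)
    return sum((x - mean) ** 2 for x in s) / len(s)


def deviation_from_mean(s, language_mean):
    """Non-uniformity measured against a fixed language-wide mean surprisal."""
    return sum((x - language_mean) ** 2 for x in s) / len(s)


# Toy sentences: one perfectly even, one with a surprising final word.
uniform = surprisals([0.25, 0.25, 0.25, 0.25])
uneven = surprisals([0.5, 0.5, 0.125, 0.03125])

language_mean = 3.0  # assumed value, for illustration only

uniform_local = local_variance(uniform)
uneven_local = local_variance(uneven)
uniform_global = deviation_from_mean(uniform, language_mean)
```

The even sentence has zero variance around its own mean, yet it still deviates from the assumed language-wide mean. The two scopes can therefore disagree about how uniform a sentence is, which is why the choice of linguistic unit matters.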

Keywords: uniform information density, information density hypothesis, information density

Knowledge Discovery in Semantic Web (Information Retrieval from Knowledge Bases)

Tishreen University, 2015 (research paper)

The Semantic Web is a new revolution in the world of the Web, in which information and data become amenable to logical processing by computer programs and are transformed into a meaningful network of data. Although the Semantic Web is considered the future of the World Wide Web, Arabic research and studies in this field are still relatively rare. This paper therefore gives a reference study of the Semantic Web and of the different methods for exploring knowledge and discovering useful information in the vast amount of data provided by the web. It gives a programming example that applies some of the techniques provided by the Semantic Web and some methods of knowledge discovery. This simplified example provides services related to Syrian government higher education, such as information about the Syrian public universities (Syrian Virtual University, Tishreen, Aleppo, Damascus, and Al Baath): the name of each university, its address, its web site, its number of students, and a summary of the university. This helps intelligent agents find those services dynamically.

Keywords: Ontology, Semantic Web, Web Mining, Knowledge Discovery

Integrating Lexical Information into Entity Neighbourhood Representations for Relation Prediction

Association for Computational Linguistics, 2021 (article)

Relation prediction informed by a combination of text corpora and curated knowledge bases, combining knowledge graph completion with relation extraction, is a relatively little-studied task. A system that can perform this task has the ability to extend an arbitrary set of relational database tables with information extracted from a document corpus. OpenKi[1] addresses this task by extracting named entities and predicates with OpenIE tools, then learning relation embeddings from the resulting entity-relation graph for relation prediction, outperforming previous approaches. We present an extension of OpenKi that incorporates embeddings of text-based representations of the entities and the relations. We demonstrate that this results in a substantial performance increase over a system without this information.

Keywords: entity neighbourhood representations, integrating lexical information, entity neighbourhood

The Match-Extend serialization algorithm in Multiprecedence

Association for Computational Linguistics, 2021 (article)

Raimy (1999; 2000a; 2000b) proposed a graphical formalism for modeling reduplication, originally mostly focused on phonological overapplication in a derivational framework. This framework is now known as Precedence-based phonology or Multiprecedence phonology. Raimy's idea is that the segments at the input to the phonology are not totally ordered by precedence. This paper tackles a challenge that arose with Raimy's work: the development of a deterministic serialization algorithm as part of the derivation of surface forms. The Match-Extend algorithm introduced here requires fewer assumptions and adheres more closely to the attested typology. Unlike previous proposals, the algorithm contains no parameter or constraint specific to individual graphs or topologies. Match-Extend requires nothing except knowing the last added set of links.

Keywords: multiprecedence phonology, multiprecedence, serialization algorithm

Introducing Information Retrieval for Biomedical Informatics Students

Association for Computational Linguistics, 2021 (article)

Introducing biomedical informatics (BMI) students to natural language processing (NLP) requires balancing technical depth with practical know-how to address application-focused needs. We developed a set of three activities introducing introductory BMI students to information retrieval with NLP, covering document representation strategies and language models from TF-IDF to BERT. These activities provide students with hands-on experience targeted towards common use cases, and introduce fundamental components of NLP workflows for a wide variety of applications.

Keywords: biomedical informatics students, biomedical informatics, introducing biomedical informatics
