Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Is that really a question? Going beyond factoid questions in NLP

هل هذا هو حقا سؤال؟الذهاب وراء الأسئلة العفاهية في NLP

624 0 0 0.0 ( 0 )

Download Cite

Added by Association for Computation Linguistics مقالة

Publication date 2021

fields Artificial Intelligence

and research's language is English

Created by Shamra Editor

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Research in NLP has mainly focused on factoid questions, with the goal of finding quick and reliable ways of matching a query to an answer. However, human discourse involves more than that: it contains non-canonical questions deployed to achieve specific communicative goals. In this paper, we investigate this under-studied aspect of NLP by introducing a targeted task, creating an appropriate corpus for the task and providing baseline models of diverse nature. With this, we are also able to generate useful insights on the task and open the way for future research in this direction.

References used

https://aclanthology.org/

rate research

Are NLP Models really able to Solve Simple Math Word Problems?

871 - Association for Computation Linguistics 2021 مقالة

The problem of designing NLP solvers for math word problems (MWP) has seen sustained research activity and steady gains in the test accuracy. Since existing solvers achieve high performance on the benchmark datasets for elementary level MWPs containi ng one-unknown arithmetic word problems, such problems are often considered solved'' with the bulk of research attention moving to more complex MWPs. In this paper, we restrict our attention to English MWPs taught in grades four and lower. We provide strong evidence that the existing MWP solvers rely on shallow heuristics to achieve high performance on the benchmark datasets. To this end, we show that MWP solvers that do not have access to the question asked in the MWP can still solve a large fraction of MWPs. Similarly, models that treat MWPs as bag-of-words can also achieve surprisingly high accuracy. Further, we introduce a challenge dataset, SVAMP, created by applying carefully chosen variations over examples sampled from existing datasets. The best accuracy achieved by state-of-the-art models is substantially lower on SVAMP, thus showing that much remains to be done even for the simplest of the MWPs.

simple math word solve simple math كلمة الرياضيات البسيطة حل الرياضيات البسيطة صناعة حمض الفوسفور

What is SemEval evaluating? A Systematic Analysis of Evaluation Campaigns in NLP

790 - Association for Computation Linguistics 2021 مقالة

SemEval is the primary venue in the NLP community for the proposal of new challenges and for the systematic empirical evaluation of NLP systems. This paper provides a systematic quantitative analysis of SemEval aiming to evidence the patterns of the contributions behind SemEval. By understanding the distribution of task types, metrics, architectures, participation and citations over time we aim to answer the question on what is being evaluated by SemEval.

evaluation campaigns systematic empirical evaluation حملات التقييم التقييم التجريبي المنهجي صناعة حمض الفوسفور

Do We Know What We Don't Know? Studying Unanswerable Questions beyond SQuAD 2.0

587 - Association for Computation Linguistics 2021 مقالة

Understanding when a text snippet does not provide a sought after information is an essential part of natural language utnderstanding. Recent work (SQuAD 2.0; Rajpurkar et al., 2018) has attempted to make some progress in this direction by enriching the SQuAD dataset for the Extractive QA task with unanswerable questions. However, as we show, the performance of a top system trained on SQuAD 2.0 drops considerably in out-of-domain scenarios, limiting its use in practical situations. In order to study this we build an out-of-domain corpus, focusing on simple event-based questions and distinguish between two types of IDK questions: competitive questions, where the context includes an entity of the same type as the expected answer, and simpler, non-competitive questions where there is no entity of the same type in the context. We find that SQuAD 2.0-based models fail even in the case of the simpler questions. We then analyze the similarities and differences between the IDK phenomenon in Extractive QA and the Recognizing Textual Entailments task (RTE; Dagan et al., 2013) and investigate the extent to which the latter can be used to improve the performance.

studying unanswerable questions unanswerable questions دراسة أسئلة لا يمكن إجراؤها أسئلة لا يمكن إجراؤها صناعة حمض الفوسفور

Looking Beyond Sentence-Level Natural Language Inference for Question Answering and Text Summarization

903 - Association for Computation Linguistics 2021 مقالة

Natural Language Inference (NLI) has garnered significant attention in recent years; however, the promise of applying NLI breakthroughs to other downstream NLP tasks has remained unfulfilled. In this work, we use the multiple-choice reading comprehen sion (MCRC) and checking factual correctness of textual summarization (CFCS) tasks to investigate potential reasons for this. Our findings show that: (1) the relatively shorter length of premises in traditional NLI datasets is the primary challenge prohibiting usage in downstream applications (which do better with longer contexts); (2) this challenge can be addressed by automatically converting resource-rich reading comprehension datasets into longer-premise NLI datasets; and (3) models trained on the converted, longer-premise datasets outperform those trained using short-premise traditional NLI datasets on downstream tasks primarily due to the difference in premise lengths.

ترجمة اللغة الطبيعية sentence-level natural language اللغة الطبيعية على مستوى الجملة صناعة حمض الفوسفور

Applying Occam's Razor to Transformer-Based Dependency Parsing: What Works, What Doesn't, and What is Really Necessary

723 - Association for Computation Linguistics 2021 مقالة

The introduction of pre-trained transformer-based contextualized word embeddings has led to considerable improvements in the accuracy of graph-based parsers for frameworks such as Universal Dependencies (UD). However, previous works differ in various dimensions, including their choice of pre-trained language models and whether they use LSTM layers. With the aims of disentangling the effects of these choices and identifying a simple yet widely applicable architecture, we introduce STEPS, a new modular graph-based dependency parser. Using STEPS, we perform a series of analyses on the UD corpora of a diverse set of languages. We find that the choice of pre-trained embeddings has by far the greatest impact on parser performance and identify XLM-R as a robust choice across the languages in our study. Adding LSTM layers provides no benefits when using transformer-based embeddings. A multi-task training setup outputting additional UD features may contort results. Taking these insights together, we propose a simple but widely applicable parser architecture and configuration, achieving new state-of-the-art results (in terms of LAS) for 10 out of 12 diverse languages.

applying occam razor occam razor transformer-based dependency parsing تطبيق Occam Razor. accam الحلاقة تحليل التبعية القائمة على المحولات صناعة حمض الفوسفور المزيد..

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Is that really a question? Going beyond factoid questions in NLP

هل هذا هو حقا سؤال؟الذهاب وراء الأسئلة العفاهية في NLP

Ask ChatGPT about the research

Read More

suggested questions