ترغب بنشر مسار تعليمي؟ اضغط هنا

GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning

111   0   0.0 ( 0 )
 نشر من قبل Jiaqi Chen
 تاريخ النشر 2021
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Automatic math problem solving has recently attracted increasing attention as a long-standing AI benchmark. In this paper, we focus on solving geometric problems, which requires a comprehensive understanding of textual descriptions, visual diagrams, and theorem knowledge. However, the existing methods were highly dependent on handcraft rules and were merely evaluated on small-scale datasets. Therefore, we propose a Geometric Question Answering dataset GeoQA, containing 5,010 geometric problems with corresponding annotated programs, which illustrate the solving process of the given problems. Compared with another publicly available dataset GeoS, GeoQA is 25 times larger, in which the program annotations can provide a practical testbed for future research on explicit and explainable numerical reasoning. Moreover, we introduce a Neural Geometric Solver (NGS) to address geometric problems by comprehensively parsing multimodal information and generating interpretable programs. We further add multiple self-supervised auxiliary tasks on NGS to enhance cross-modal semantic representation. Extensive experiments on GeoQA validate the effectiveness of our proposed NGS and auxiliary tasks. However, the results are still significantly lower than human performance, which leaves large room for future research. Our benchmark and code are released at https://github.com/chen-judge/GeoQA .

قيم البحث

اقرأ أيضاً

142 - Kwwabena Nuamah 2021
An important aspect of artificial intelligence (AI) is the ability to reason in a step-by-step algorithmic manner that can be inspected and verified for its correctness. This is especially important in the domain of question answering (QA). We argue that the challenge of algorithmic reasoning in QA can be effectively tackled with a systems approach to AI which features a hybrid use of symbolic and sub-symbolic methods including deep neural networks. Additionally, we argue that while neural network models with end-to-end training pipelines perform well in narrow applications such as image classification and language modelling, they cannot, on their own, successfully perform algorithmic reasoning, especially if the task spans multiple domains. We discuss a few notable exceptions and point out how they are still limited when the QA problem is widened to include other intelligence-requiring tasks. However, deep learning, and machine learning in general, do play important roles as components in the reasoning process. We propose an approach to algorithm reasoning for QA, Deep Algorithmic Question Answering (DAQA), based on three desirable properties: interpretability, generalizability and robustness which such an AI system should possess and conclude that they are best achieved with a combination of hybrid and compositional AI.
This paper proposes a question-answering (QA) benchmark for spatial reasoning on natural language text which contains more realistic spatial phenomena not covered by prior work and is challenging for state-of-the-art language models (LM). We propose a distant supervision method to improve on this task. Specifically, we design grammar and reasoning rules to automatically generate a spatial description of visual scenes and corresponding QA pairs. Experiments show that further pretraining LMs on these automatically generated data significantly improves LMs capability on spatial understanding, which in turn helps to better solve two external datasets, bAbI, and boolQ. We hope that this work can foster investigations into more sophisticated models for spatial reasoning over text.
A key limitation in current datasets for multi-hop reasoning is that the required steps for answering the question are mentioned in it explicitly. In this work, we introduce StrategyQA, a question answering (QA) benchmark where the required reasoning steps are implicit in the question, and should be inferred using a strategy. A fundamental challenge in this setup is how to elicit such creative questions from crowdsourcing workers, while covering a broad range of potential strategies. We propose a data collection procedure that combines term-based priming to inspire annotators, careful control over the annotator population, and adversarial filtering for eliminating reasoning shortcuts. Moreover, we annotate each question with (1) a decomposition into reasoning steps for answering it, and (2) Wikipedia paragraphs that contain the answers to each step. Overall, StrategyQA includes 2,780 examples, each consisting of a strategy question, its decomposition, and evidence paragraphs. Analysis shows that questions in StrategyQA are short, topic-diverse, and cover a wide range of strategies. Empirically, we show that humans perform well (87%) on this task, while our best baseline reaches an accuracy of $sim$66%.
In the question answering(QA) task, multi-hop reasoning framework has been extensively studied in recent years to perform more efficient and interpretable answer reasoning on the Knowledge Graph(KG). However, multi-hop reasoning is inapplicable for a nswering n-ary fact questions due to its linear reasoning nature. We discover that there are two feasible improvements: 1) upgrade the basic reasoning unit from entity or relation to fact; and 2) upgrade the reasoning structure from chain to tree. Based on these, we propose a novel fact-tree reasoning framework, through transforming the question into a fact tree and performing iterative fact reasoning on it to predict the correct answer. Through a comprehensive evaluation on the n-ary fact KGQA dataset introduced by this work, we demonstrate that the proposed fact-tree reasoning framework has the desired advantage of high answer prediction accuracy. In addition, we also evaluate the fact-tree reasoning framework on two binary KGQA datasets and show that our approach also has a strong reasoning ability compared with several excellent baselines. This work has direct implications for exploring complex reasoning scenarios and provides a preliminary baseline approach.
Numerical reasoning over texts, such as addition, subtraction, sorting and counting, is a challenging machine reading comprehension task, since it requires both natural language understanding and arithmetic computation. To address this challenge, we propose a heterogeneous graph representation for the context of the passage and question needed for such reasoning, and design a question directed graph attention network to drive multi-step numerical reasoning over this context graph.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا