Despite significant interest in developing general-purpose fact checking models, it is challenging to construct a large-scale fact verification dataset with realistic claims of the kind that occur in the real world. Existing claims are either authored by crowdworkers, thereby introducing subtle biases that are difficult to control for, or manually verified by professional fact checkers, making them expensive and limited in scale. In this paper, we construct a challenging, realistic, and large-scale fact verification dataset called FaVIQ, using information-seeking questions posed by real users who do not know the answer. The ambiguity in information-seeking questions enables automatically constructing true and false claims that reflect confusions that arise among real users (e.g., the year a movie was filmed vs. the year it was released). Our claims are verified to be natural, contain little lexical bias, and require a complete understanding of the evidence for verification. Our experiments show that state-of-the-art models are far from solving our new task. Moreover, training on our data helps professional fact-checking, outperforming models trained on the most widely used dataset, FEVER, or on in-domain data by up to 17% absolute. Altogether, our data will serve as a challenging benchmark for natural language understanding and support future progress in professional fact checking.
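To make the claim-construction idea concrete, here is a minimal sketch. It assumes AmbigQA-style records in which one ambiguous question has several disambiguated readings, each with its own answer; the abstract itself only says the questions come from real users, so that structure, along with the names AmbigQARecord, make_claim, and construct_claims, is illustrative rather than the authors' actual pipeline. A true claim pairs a disambiguated question with its own answer, while a false claim pairs it with the answer to a different reading, mirroring a plausible user confusion.

```python
# A minimal sketch of the claim-construction idea described in the abstract.
# AmbigQA-style records and all names here are assumptions for illustration,
# not the authors' pipeline.
from dataclasses import dataclass


@dataclass
class AmbigQARecord:
    # Disambiguated readings of one ambiguous question, aligned with their
    # answers (e.g., the "filmed in" vs. "released in" readings).
    disambiguated_questions: list[str]
    answers: list[str]


def make_claim(question: str, answer: str) -> str:
    # Naive placeholder for a question-to-claim converter; a real pipeline
    # would use templates or a trained seq2seq model to produce fluent claims.
    return f"{question.rstrip('?')} is {answer}."


def construct_claims(record: AmbigQARecord) -> list[tuple[str, str]]:
    claims = []
    # True claim: a disambiguated question paired with its own answer.
    for q, a in zip(record.disambiguated_questions, record.answers):
        claims.append((make_claim(q, a), "TRUE"))
    # False claim: the same question paired with the answer to a *different*
    # reading, which mirrors a realistic user confusion.
    for i, q in enumerate(record.disambiguated_questions):
        for j, a in enumerate(record.answers):
            if i != j:
                claims.append((make_claim(q, a), "FALSE"))
    return claims


if __name__ == "__main__":
    record = AmbigQARecord(
        disambiguated_questions=[
            "When was the movie Titanic filmed",
            "When was the movie Titanic released",
        ],
        answers=["1996", "1997"],
    )
    for claim, label in construct_claims(record):
        print(label, claim)
```

The cross-pairing step is what makes the false claims topically and lexically close to the true ones, which is consistent with the abstract's observation that the resulting claims contain little lexical bias and require a full understanding of the evidence to verify.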