أوراق بحثية, رسائل ماجستير ودكتوراه منشورة من قبل Erfan Sadeqi Azer

Not All Claims are Created Equal: Choosing the Right Statistical Approach to Assess Hypotheses

219 - Erfan Sadeqi Azer , Daniel Khashabi , Ashish Sabharwal 2019

Empirical research in Natural Language Processing (NLP) has adopted a narrow set of principles for assessing hypotheses, relying mainly on p-value computation, which suffers from several known issues. While alternative proposals have been well-debate d and adopted in other fields, they remain rarely discussed or used within the NLP community. We address this gap by contrasting various hypothesis assessment techniques, especially those not commonly used in the field (such as evaluations based on Bayesian inference). Since these statistical techniques differ in the hypotheses they can support, we argue that practitioners should first decide their target hypothesis before choosing an assessment method. This is crucial because common fallacies, misconceptions, and misinterpretation surrounding hypothesis assessment methods often stem from a discrepancy between what one would like to claim versus what the method used actually assesses. Our survey reveals that these issues are omnipresent in the NLP research community. As a step forward, we provide best practices and guidelines tailored to NLP research, as well as an easy-to-use package called HyBayes for Bayesian assessment of hypotheses, complementing existing tools.

الحساب واللغة الذكاء الاصطناعي

On the Possibilities and Limitations of Multi-hop Reasoning Under Linguistic Imperfections

296 - Daniel Khashabi , Erfan Sadeqi Azer , Tushar Khot 2019

Systems for language understanding have become remarkably strong at overcoming linguistic imperfections in tasks involving phrase matching or simple reasoning. Yet, their accuracy drops dramatically as the number of reasoning steps increases. We pres ent the first formal framework to study such empirical observations. It allows one to quantify the amount and effect of ambiguity, redundancy, incompleteness, and inaccuracy that the use of language introduces when representing a hidden conceptual space. The idea is to consider two interrelated spaces: a conceptual meaning space that is unambiguous and complete but hidden, and a linguistic space that captures a noisy grounding of the meaning space in the words of a language---the level at which all systems, whether neural or symbolic, operate. Applying this framework to a special class of multi-hop reasoning, namely the connectivity problem in graphs of relationships between concepts, we derive rigorous intuitions and impossibility results even under this simplified setting. For instance, if a query requires a moderately large (logarithmic) number of hops in the meaning graph, no reasoning system operating over a noisy graph grounded in language is likely to correctly answer it. This highlights a fundamental barrier that extends to a broader class of reasoning problems and systems, and suggests an alternative path forward: focusing on aligning the two spaces via richer representations, before investing in reasoning with many hops.

الحساب واللغة الذكاء الاصطناعي

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد