Spurious Correlations in Cross-Topic Argument Mining



Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Keywords: cross-topic argument mining, argument mining, cross-topic argument


Abstract

Recent work in cross-topic argument mining attempts to learn models that generalise across topics rather than merely relying on within-topic spurious correlations. We examine the effectiveness of this approach by analysing the output of single-task and multi-task models for cross-topic argument mining, through a combination of linear approximations of their decision boundaries, manual feature grouping, challenge examples, and ablations across the input vocabulary. Surprisingly, we show that cross-topic models still rely mostly on spurious correlations and only generalise within closely related topics, e.g., a model trained only on closed-class words and a few common open-class words outperforms a state-of-the-art cross-topic model on distant target topics.
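The vocabulary ablation mentioned in the abstract can be illustrated with a minimal sketch. It masks every token outside a small allowed vocabulary, so that a downstream classifier would only see closed-class words. The word list, the `ablate_vocabulary` helper, and the `<unk>` mask symbol are assumptions for illustration; the abstract does not specify the vocabulary or the masking scheme the authors used.

```python
# Hypothetical closed-class word list (assumption: the paper's actual
# vocabulary is not given in the abstract).
CLOSED_CLASS = {
    "the", "a", "an", "of", "in", "on", "to", "and", "or", "but",
    "not", "is", "are", "it", "this", "that", "because", "should", "would",
}


def ablate_vocabulary(tokens, keep, mask="<unk>"):
    """Replace every token whose lowercase form is not in `keep` with `mask`."""
    return [tok if tok.lower() in keep else mask for tok in tokens]


# Topic-specific words are masked; only function words survive.
sentence = "Nuclear energy is dangerous because it produces waste".split()
ablated = ablate_vocabulary(sentence, CLOSED_CLASS)
print(" ".join(ablated))
```

A classifier trained on such ablated input has no access to topic-specific content words, which is what makes it a useful probe for how much a model depends on them.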

References used

https://aclanthology.org/


Influence Tuning: Demoting Spurious Correlations via Instance Attribution and Instance-Driven Updates

220 - Association for Computational Linguistics 2021 (article)

Among the most critical limitations of deep learning NLP models are their lack of interpretability and their reliance on spurious correlations. Prior work proposed various approaches to interpreting the black-box models to unveil the spurious correlations, but the research was primarily used in human-computer interaction scenarios. It remains underexplored whether or how such model interpretations can be used to automatically "unlearn" confounding features. In this work, we propose influence tuning, a procedure that leverages model interpretations to update the model parameters towards a plausible interpretation (rather than an interpretation that relies on spurious patterns in the data) in addition to learning to predict the task labels. We show that in a controlled setup, influence tuning can help deconfound the model from spurious patterns in data, significantly outperforming baseline methods that use adversarial training.

Keywords: demoting spurious correlations, instance attribution, attribution and instance-driven

SpanAlign: Efficient Sequence Tagging Annotation Projection into Translated Data applied to Cross-Lingual Opinion Mining

342 - Association for Computational Linguistics 2021 (article)

Following the increasing performance of neural machine translation systems, the paradigm of using automatically translated data for cross-lingual adaptation is now studied in several applicative domains. The capacity to accurately project annotations remains however an issue for sequence tagging tasks where annotation must be projected with correct spans. Additionally, when the task implies noisy user-generated text, the quality of translation and annotation projection can be affected. In this paper we propose to tackle multilingual sequence tagging with a new span alignment method and apply it to opinion target extraction from customer reviews. We show that provided suitable heuristics, translated data with automatic span-level annotation projection can yield improvements both for cross-lingual adaptation compared to zero-shot transfer, and data augmentation compared to a multilingual baseline.

Keywords: efficient sequence tagging, cross-lingual opinion mining, translated data applied

Generalisability of Topic Models in Cross-corpora Abusive Language Detection

430 - Association for Computational Linguistics 2021 (article)

Rapidly changing social media content calls for robust and generalisable abuse detection models. However, the state-of-the-art supervised models display degraded performance when they are evaluated on abusive comments that differ from the training corpus. We investigate if the performance of supervised models for cross-corpora abuse detection can be improved by incorporating additional information from topic models, as the latter can infer the latent topic mixtures from unseen samples. In particular, we combine topical information with representations from a model tuned for classifying abusive comments. Our performance analysis reveals that topic models are able to capture abuse-related topics that can transfer across corpora, and result in improved generalisability.


Active Learning for Argument Strength Estimation

402 - Association for Computational Linguistics 2021 (article)

High-quality arguments are an essential part of decision-making. Automatically predicting the quality of an argument is a complex task that has recently received much attention in argument mining. However, the annotation effort for this task is exceptionally high. Therefore, we test uncertainty-based active learning (AL) methods on two popular argument-strength data sets to estimate whether sample-efficient learning can be enabled. Our extensive empirical evaluation shows that uncertainty-based acquisition functions cannot surpass the accuracy reached with random acquisition on these data sets.
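The contrast between uncertainty-based and random acquisition can be sketched as follows. This is a minimal illustration, assuming predictive entropy as the uncertainty measure; the abstract does not name the acquisition functions the authors tested, and the helper names and toy probabilities below are hypothetical.

```python
import math
import random


def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_uncertain(pool_probs, k):
    """Indices of the k unlabeled items with the highest predictive entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


def select_random(pool_size, k, seed=0):
    """Baseline: k distinct indices drawn uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(range(pool_size), k)


# Toy model outputs for four unlabeled arguments (binary task).
pool_probs = [[0.95, 0.05], [0.55, 0.45], [0.70, 0.30], [0.50, 0.50]]
print(select_uncertain(pool_probs, 2))  # → [3, 1]
```

In an active learning loop, the selected items would be sent for annotation and added to the training set before the model is retrained; the abstract's finding is that, on the data sets studied, the uncertainty-based choice did not beat the random baseline.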

Keywords: argument strength estimation, strength estimation, argument strength

Event Coreference Data (Almost) for Free: Mining Hyperlinks from Online News

415 - Association for Computational Linguistics 2021 (article)

Cross-document event coreference resolution (CDCR) is the task of identifying which event mentions refer to the same events throughout a collection of documents. Annotating CDCR data is an arduous and expensive process, explaining why existing corpora are small and lack domain coverage. To overcome this bottleneck, we automatically extract event coreference data from hyperlinks in online news: when referring to a significant real-world event, writers often add a hyperlink to another article covering this event. We demonstrate that collecting hyperlinks which point to the same article(s) produces extensive and high-quality CDCR data, and we create a corpus of 2M documents and 2.7M silver-standard event mentions called HyperCoref. We evaluate a state-of-the-art system on three CDCR corpora and find that models trained on small subsets of HyperCoref are highly competitive, with performance similar to models trained on gold-standard data. With our work, we free CDCR research from depending on costly human-annotated training data and open up possibilities for research beyond English CDCR, as our data extraction approach can be easily adapted to other languages.

Keywords: mining hyperlinks, event coreference data
