
Not All Negatives are Equal: Label-Aware Contrastive Loss for Fine-grained Text Classification



Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor


Abstract

Fine-grained classification involves dealing with datasets with a larger number of classes with subtle differences between them. Guiding the model to focus on differentiating dimensions between these commonly confusable classes is key to improving performance on fine-grained tasks. In this work, we analyse the contrastive fine-tuning of pre-trained language models on two fine-grained text classification tasks, emotion classification and sentiment analysis. We adaptively embed class relationships into a contrastive objective function to weigh the positives and negatives differently, in particular weighting closely confusable negatives more than less similar negative examples. We find that Label-aware Contrastive Loss outperforms previous contrastive methods in the presence of a larger number of classes and/or more confusable classes, and helps models to produce output distributions that are more differentiated.
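The weighting idea in the abstract can be illustrated with a small NumPy sketch. This is not the paper's exact formulation: the function name `label_aware_contrastive_loss`, the `class_weights` matrix (one row of strictly positive per-class weights for each anchor, for example from an auxiliary classifier's predicted probabilities) and the temperature value are all assumptions made for illustration. Each pair's contribution to a supervised contrastive loss is scaled by the weight the anchor assigns to the other sample's class, so a more confusable negative class takes up a larger share of the denominator.

```python
import numpy as np

def label_aware_contrastive_loss(embeddings, labels, class_weights, temperature=0.1):
    """Weighted supervised contrastive loss (illustrative sketch only).

    embeddings:    (n, d) array of sample representations
    labels:        (n,) integer class ids
    class_weights: (n, c) strictly positive weights; row i holds the weight
                   anchor i gives to each class (hypothetical input)
    """
    labels = np.asarray(labels)
    n = len(labels)

    # Cosine similarities scaled by the temperature.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    sim = sim - sim.max(axis=1, keepdims=True)  # per-row shift for numerical stability

    not_self = ~np.eye(n, dtype=bool)
    # w[i, k] = weight that anchor i assigns to the class of sample k.
    w = class_weights[np.arange(n)[:, None], labels[None, :]]

    # Weighted softmax over all other samples in the batch.
    weighted_exp = w * np.exp(sim) * not_self
    log_denominator = np.log(weighted_exp.sum(axis=1, keepdims=True))
    log_prob = np.log(w) + sim - log_denominator

    # Average the log-probability over each anchor's positives.
    positives = (labels[:, None] == labels[None, :]) & not_self
    positive_count = positives.sum(axis=1)
    has_positive = positive_count > 0
    summed = np.where(positives, log_prob, 0.0).sum(axis=1)
    per_anchor = -summed[has_positive] / positive_count[has_positive]
    return per_anchor.mean()
```

With uniform weights this reduces to an ordinary supervised contrastive loss. Raising the weight that class-0 anchors give to class 1 increases the loss, which means the gradient pushes harder on separating those two classes.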

References used

https://aclanthology.org/


Not All Comments Are Equal: Insights into Comment Moderation from a Topic-Aware Model

Association for Computational Linguistics, 2021 (article)

Moderation of reader comments is a significant problem for online news platforms. Here, we experiment with models for automatic moderation, using a dataset of comments from a popular Croatian newspaper. Our analysis shows that while comments that violate the moderation rules mostly share common linguistic and thematic features, their content varies across the different sections of the newspaper. We therefore make our models topic-aware, incorporating semantic features from a topic model into the classification decision. Our results show that topic information improves the performance of the model, increases its confidence in correct outputs, and helps us understand the model's outputs.


Not All Linearizations Are Equally Data-Hungry in Sequence Labeling Parsing

Association for Computational Linguistics, 2021 (article)

Different linearizations have been proposed to cast dependency parsing as sequence labeling and solve the task as: (i) a head selection problem, (ii) finding a representation of the token arcs as bracket strings, or (iii) associating partial transition sequences of a transition-based parser to words. Yet, there is little understanding about how these linearizations behave in low-resource setups. Here, we first study their data efficiency, simulating data-restricted setups from a diverse set of rich-resource treebanks. Second, we test whether such differences manifest in truly low-resource setups. The results show that head selection encodings are more data-efficient and perform better in an ideal (gold) framework, but that such advantage greatly vanishes in favour of bracketing formats when the running setup resembles a real-world low-resource configuration.
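To make the head selection linearization (i) concrete, here is a minimal sketch that turns a dependency tree into one label per token and back. It uses a plain relative-offset scheme chosen for illustration, which is not necessarily the exact encoding evaluated in the paper; the function names and the `ROOT` label are hypothetical.

```python
def encode_relative_heads(heads):
    """Turn 1-based head indices (0 marks the root) into per-token labels.

    Each label is the signed distance from the token to its head, so a
    sequence labeler can predict the tree one token at a time.
    """
    labels = []
    for position, head in enumerate(heads, start=1):
        labels.append("ROOT" if head == 0 else f"{head - position:+d}")
    return labels


def decode_relative_heads(labels):
    """Invert encode_relative_heads, recovering the 1-based head indices."""
    heads = []
    for position, label in enumerate(labels, start=1):
        heads.append(0 if label == "ROOT" else position + int(label))
    return heads


# "She reads books": "reads" is the root and heads both other tokens.
print(encode_relative_heads([2, 0, 2]))  # ['+1', 'ROOT', '-1']
```

Dependency relation labels are left out to keep the sketch short; a full encoding would pair each offset with its relation.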


Large-scale text pre-training helps with dialogue act recognition, but not without fine-tuning

Association for Computational Linguistics, 2021 (article)

We use dialogue act recognition (DAR) to investigate how well BERT represents utterances in dialogue, and how fine-tuning and large-scale pre-training contribute to its performance. We find that while both the standard BERT pre-training and pretraining on dialogue-like data are useful, task-specific fine-tuning is essential for good performance.


CoPHE: A Count-Preserving Hierarchical Evaluation Metric in Large-Scale Multi-Label Text Classification

Association for Computational Linguistics, 2021 (article)

Large-Scale Multi-Label Text Classification (LMTC) includes tasks with hierarchical label spaces, such as automatic assignment of ICD-9 codes to discharge summaries. Performance of models in prior art is evaluated with standard precision, recall, and F1 measures without regard for the rich hierarchical structure. In this work we argue for hierarchical evaluation of the predictions of neural LMTC models. With the example of the ICD-9 ontology we describe a structural issue in the representation of the structured label space in prior art, and propose an alternative representation based on the depth of the ontology. We propose a set of metrics for hierarchical evaluation using the depth-based representation. We compare the evaluation scores from the proposed metrics with previously used metrics on prior art LMTC models for ICD-9 coding in MIMIC-III. We also propose further avenues of research involving the proposed ontological representation.


Coarse2Fine: Fine-grained Text Classification on Coarsely-grained Annotated Data

Association for Computational Linguistics, 2021 (article)

Existing text classification methods mainly focus on a fixed label set, whereas many real-world applications require extending to new fine-grained classes as the number of samples per label increases. To accommodate such requirements, we introduce a new problem called coarse-to-fine grained classification, which aims to perform fine-grained classification on coarsely annotated data. Instead of asking for new fine-grained human annotations, we opt to leverage label surface names as the only human guidance and weave rich pre-trained generative language models into the iterative weak supervision strategy. Specifically, we first propose a label-conditioned fine-tuning formulation to attune these generators for our task. Furthermore, we devise a regularization objective based on the coarse-fine label constraints derived from our problem setting, giving us even further improvements over the prior formulation. Our framework uses the fine-tuned generative models to sample pseudo-training data for training the classifier, and bootstraps on real unlabeled data for model refinement. Extensive experiments and case studies on two real-world datasets demonstrate superior performance over SOTA zero-shot classification baselines.

