What is SemEval evaluating? A Systematic Analysis of Evaluation Campaigns in NLP

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

SemEval is the primary venue in the NLP community for the proposal of new challenges and for the systematic empirical evaluation of NLP systems. This paper provides a systematic quantitative analysis of SemEval, aiming to evidence the patterns of the contributions behind SemEval. By understanding the distribution of task types, metrics, architectures, participation and citations over time, we aim to answer the question of what is being evaluated by SemEval.

References used

https://aclanthology.org/

What is Multimodality?

Association for Computational Linguistics, 2021 (article)

The last years have shown rapid developments in the field of multimodal machine learning, combining e.g., vision, text or speech. In this position paper we explain how the field uses outdated definitions of multimodality that prove unfit for the machine learning era. We propose a new task-relative definition of (multi)modality in the context of multimodal machine learning that focuses on representations and information that are relevant for a given machine learning task. With our new definition of multimodality we aim to provide a missing foundation for multimodal research, an important component of language grounding and a crucial milestone towards NLU.

What is on Social Media that is not in WordNet? A Preliminary Analysis on the TwitterAAE Corpus

Association for Computational Linguistics, 2021 (article)

Natural Language Processing tools and resources have been so far mainly created and trained for standard varieties of language. Nowadays, with the use of large amounts of data gathered from social media, other varieties and registers need to be processed, which may present other challenges and difficulties. In this work, we focus on English and we present a preliminary analysis by comparing the TwitterAAE corpus, which is annotated for ethnicity, and WordNet by quantifying and explaining the online language that WordNet misses.

Apples to Apples: A Systematic Evaluation of Topic Models

Association for Computational Linguistics, 2021 (article)

From statistical to neural models, a wide variety of topic modelling algorithms have been proposed in the literature. However, because of the diversity of datasets and metrics, there have not been many efforts to systematically compare their performance on the same benchmarks and under the same conditions. In this paper, we present a selection of 9 topic modelling techniques from the state of the art reflecting a diversity of approaches to the task, an overview of the different metrics used to compare their performance, and the challenges of conducting such a comparison. We empirically evaluate the performance of these models on different settings reflecting a variety of real-life conditions in terms of dataset size, number of topics, and distribution of topics, following identical preprocessing and evaluation processes. Using both metrics that rely on the intrinsic characteristics of the dataset (different coherence metrics), as well as external knowledge (word embeddings and ground-truth topic labels), our experiments reveal several shortcomings regarding the common practices in topic models evaluation.

S-NLP at SemEval-2021 Task 5: An Analysis of Dual Networks for Sequence Tagging

Association for Computational Linguistics, 2021 (article)

The SemEval 2021 task 5: Toxic Spans Detection is a task of identifying considered-toxic spans in text, which provides a valuable, automatic tool for moderating online contents. This paper presents the second-place method for the task, an ensemble of two approaches. One approach relies on combining different embedding methods to extract diverse semantic and syntactic representations of words in context, while the other uses extra data with a slightly customized self-training, a semi-supervised learning technique, for sequence tagging problems. Both of our architectures take advantage of a strong language model, which was fine-tuned on a toxic classification task. Although experimental evidence indicates higher effectiveness of the first approach than the second one, combining them leads to our best results of 70.77 F1-score on the test dataset.

Applying Occam's Razor to Transformer-Based Dependency Parsing: What Works, What Doesn't, and What is Really Necessary

Association for Computational Linguistics, 2021 (article)

The introduction of pre-trained transformer-based contextualized word embeddings has led to considerable improvements in the accuracy of graph-based parsers for frameworks such as Universal Dependencies (UD). However, previous works differ in various dimensions, including their choice of pre-trained language models and whether they use LSTM layers. With the aims of disentangling the effects of these choices and identifying a simple yet widely applicable architecture, we introduce STEPS, a new modular graph-based dependency parser. Using STEPS, we perform a series of analyses on the UD corpora of a diverse set of languages. We find that the choice of pre-trained embeddings has by far the greatest impact on parser performance and identify XLM-R as a robust choice across the languages in our study. Adding LSTM layers provides no benefits when using transformer-based embeddings. A multi-task training setup outputting additional UD features may contort results. Taking these insights together, we propose a simple but widely applicable parser architecture and configuration, achieving new state-of-the-art results (in terms of LAS) for 10 out of 12 diverse languages.
