Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

CIRCE at SemEval-2020 Task 1: Ensembling Context-Free and Context-Dependent Word Representations

74 0 0.0 ( 0 )

Download Cite

Added by Martin P\\\"omsl

Publication date 2020

fields Informatics Engineering

and research's language is English

Authors Martin Pomsl

Computation and Language Machine Learning

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

This paper describes the winning contribution to SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection (Subtask 2) handed in by team UG Student Intern. We present an ensemble model that makes predictions based on context-free and context-dependent word representations. The key findings are that (1) context-free word representations are a powerful and robust baseline, (2) a sentence classification objective can be used to obtain useful context-dependent word representations, and (3) combining those representations increases performance on some datasets while decreasing performance on others.

rate research

TransWiC at SemEval-2021 Task 2: Transformer-based Multilingual and Cross-lingual Word-in-Context Disambiguation

72 - Hansi Hettiarachchi , Tharindu Ranasinghe 2021

Identifying whether a word carries the same meaning or different meaning in two contexts is an important research area in natural language processing which plays a significant role in many applications such as question answering, document summarisation, information retrieval and information extraction. Most of the previous work in this area rely on language-specific resources making it difficult to generalise across languages. Considering this limitation, our approach to SemEval-2021 Task 2 is based only on pretrained transformer models and does not use any language-specific processing and resources. Despite that, our best model achieves 0.90 accuracy for English-English subtask which is very compatible compared to the best result of the subtask; 0.93 accuracy. Our approach also achieves satisfactory results in other monolingual and cross-lingual language pairs as well.

Computation and Language Artificial Intelligence Machine Learning

CS-NET at SemEval-2020 Task 4: Siamese BERT for ComVE

82 - Soumya Ranjan Dash , Sandeep Routray , Prateek Varshney 2020

In this paper, we describe our system for Task 4 of SemEval 2020, which involves differentiating between natural language statements that confirm to common sense and those that do not. The organizers propose three subtasks - first, selecting between two sentences, the one which is against common sense. Second, identifying the most crucial reason why a statement does not make sense. Third, generating novel reasons for explaining the against common sense statement. Out of the three subtasks, this paper reports the system description of subtask A and subtask B. This paper proposes a model based on transformer neural network architecture for addressing the subtasks. The novelty in work lies in the architecture design, which handles the logical implication of contradicting statements and simultaneous information extraction from both sentences. We use a parallel instance of transformers, which is responsible for a boost in the performance. We achieved an accuracy of 94.8% in subtask A and 89% in subtask B on the test set.

Computation and Language Machine Learning

Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for Offensive Language Detection

286 - Wenliang Dai , Tiezheng Yu , Zihan Liu 2020

Nowadays, offensive content in social media has become a serious problem, and automatically detecting offensive language is an essential task. In this paper, we build an offensive language detection system, which combines multi-task learning with BERT-based models. Using a pre-trained language model such as BERT, we can effectively learn the representations for noisy text in social media. Besides, to boost the performance of offensive language detection, we leverage the supervision signals from other related tasks. In the OffensEval-2020 competition, our model achieves 91.51% F1 score in English Sub-task A, which is comparable to the first place (92.23%F1). An empirical analysis is provided to explain the effectiveness of our approaches.

Computation and Language Machine Learning

PALI at SemEval-2021 Task 2: Fine-Tune XLM-RoBERTa for Word in Context Disambiguation

339 - Shuyi Xie , Jian Ma , Haiqin Yang 2021

This paper presents the PALI teams winning system for SemEval-2021 Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation. We fine-tune XLM-RoBERTa model to solve the task of word in context disambiguation, i.e., to determine whether the target word in the two contexts contains the same meaning or not. In the implementation, we first specifically design an input tag to emphasize the target word in the contexts. Second, we construct a new vector on the fine-tuned embeddings from XLM-RoBERTa and feed it to a fully-connected network to output the probability of whether the target word in the context has the same meaning or not. The new vector is attained by concatenating the embedding of the [CLS] token and the embeddings of the target word in the contexts. In training, we explore several tricks, such as the Ranger optimizer, data augmentation, and adversarial training, to improve the model prediction. Consequently, we attain first place in all four cross-lingual tasks.

Artificial Intelligence Computation and Language

NTUA-SLP at SemEval-2018 Task 2: Predicting Emojis using RNNs with Context-aware Attention

141 - Christos Baziotis , Nikos Athanasiou , Georgios Paraskevopoulos 2018

In this paper we present a deep-learning model that competed at SemEval-2018 Task 2 Multilingual Emoji Prediction. We participated in subtask A, in which we are called to predict the most likely associated emoji in English tweets. The proposed architecture relies on a Long Short-Term Memory network, augmented with an attention mechanism, that conditions the weight of each word, on a context vector which is taken as the aggregation of a tweets meaning. Moreover, we initialize the embedding layer of our model, with word2vec word embeddings, pretrained on a dataset of 550 million English tweets. Finally, our model does not rely on hand-crafted features or lexicons and is trained end-to-end with back-propagation. We ranked 2nd out of 48 teams.

Computation and Language

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

CIRCE at SemEval-2020 Task 1: Ensembling Context-Free and Context-Dependent Word Representations

Ask ChatGPT about the research

No Arabic abstract

Read More

suggested questions