New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Word Discriminations for Vocabulary Inventory Prediction

تمييز الكلمات لتنبؤ المخزون المفردات

266 0 0 0.0 ( 0 )

Download Cite

Added by Association for Computation Linguistics مقالة

Publication date 2021

fields Artificial Intelligence

and research's language is English

Created by Shamra Editor

vocabulary inventory prediction inventory prediction vocabulary inventory التنبؤ بمفردات المخزون التنبؤ الجرد مخزون المفردات صناعة حمض الفوسفور

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

The aim of vocabulary inventory prediction is to predict a learner's whole vocabulary based on a limited sample of query words. This paper approaches the problem starting from the 2-parameter Item Response Theory (IRT) model, giving each word in the vocabulary a difficulty and discrimination parameter. The discrimination parameter is evaluated on the sub-problem of question item selection, familiar from the fields of Computerised Adaptive Testing (CAT) and active learning. Next, the effect of the discrimination parameter on prediction performance is examined, both in a binary classification setting, and in an information retrieval setting. Performance is compared with baselines based on word frequency. A number of different generalisation scenarios are examined, including generalising word difficulty and discrimination using word embeddings with a predictor network and testing on out-of-dataset data.

References used

https://aclanthology.org/

rate research

AVocaDo: Strategy for Adapting Vocabulary to Downstream Domain

279 - Association for Computation Linguistics 2021 مقالة

During the fine-tuning phase of transfer learning, the pretrained vocabulary remains unchanged, while model parameters are updated. The vocabulary generated based on the pretrained data is suboptimal for downstream data when domain discrepancy exists . We propose to consider the vocabulary as an optimizable parameter, allowing us to update the vocabulary by expanding it with domain specific vocabulary based on a tokenization statistic. Furthermore, we preserve the embeddings of the added words from overfitting to downstream data by utilizing knowledge learned from a pretrained language model with a regularization term. Our method achieved consistent performance improvements on diverse domains (i.e., biomedical, computer science, news, and reviews).

strategy for adapting adapting vocabulary vocabulary استراتيجية التكيف تكييف المفردات صناعة حمض الفوسفور

Towards Continual Learning for Multilingual Machine Translation via Vocabulary Substitution

362 - Association for Computation Linguistics 2021 مقالة

We propose a straightforward vocabulary adaptation scheme to extend the language capacity of multilingual machine translation models, paving the way towards efficient continual learning for multilingual machine translation. Our approach is suitable f or large-scale datasets, applies to distant languages with unseen scripts, incurs only minor degradation on the translation performance for the original language pairs and provides competitive performance even in the case where we only possess monolingual data for the new languages.

حالة غريبة صناعة حمض الفوسفور

ReCAM@IITK at SemEval-2021 Task 4: BERT and ALBERT based Ensemble for Abstract Word Prediction

376 - Association for Computation Linguistics 2021 مقالة

This paper describes our system for Task 4 of SemEval-2021: Reading Comprehension of Abstract Meaning (ReCAM). We participated in all subtasks where the main goal was to predict an abstract word missing from a statement. We fine-tuned the pre-trained masked language models namely BERT and ALBERT and used an Ensemble of these as our submitted system on Subtask 1 (ReCAM-Imperceptibility) and Subtask 2 (ReCAM-Nonspecificity). For Subtask 3 (ReCAM-Intersection), we submitted the ALBERT model as it gives the best results. We tried multiple approaches and found that Masked Language Modeling(MLM) based approach works the best.

abstract word prediction word prediction abstract word نبض كلمة مجردة كلمة التنبؤ كلمة مجردة صناعة حمض الفوسفور المزيد..

jurBERT: A Romanian BERT Model for Legal Judgement Prediction

348 - Association for Computation Linguistics 2021 مقالة

Transformer-based models have become the de facto standard in the field of Natural Language Processing (NLP). By leveraging large unlabeled text corpora, they enable efficient transfer learning leading to state-of-the-art results on numerous NLP task s. Nevertheless, for low resource languages and highly specialized tasks, transformer models tend to lag behind more classical approaches (e.g. SVM, LSTM) due to the lack of aforementioned corpora. In this paper we focus on the legal domain and we introduce a Romanian BERT model pre-trained on a large specialized corpus. Our model outperforms several strong baselines for legal judgement prediction on two different corpora consisting of cases from trials involving banks in Romania.

romanian bert model الرومانية بيرت نموذج صناعة حمض الفوسفور

A Simple Approach for Handling Out-of-Vocabulary Identifiers in Deep Learning for Source Code

301 - Association for Computation Linguistics 2021 مقالة

There is an emerging interest in the application of natural language processing models to source code processing tasks. One of the major problems in applying deep learning to software engineering is that source code often contains a lot of rare ident ifiers, resulting in huge vocabularies. We propose a simple, yet effective method, based on identifier anonymization, to handle out-of-vocabulary (OOV) identifiers. Our method can be treated as a preprocessing step and, therefore, allows for easy implementation. We show that the proposed OOV anonymization method significantly improves the performance of the Transformer in two code processing tasks: code completion and bug fixing.

approach for handling simple approach source code نهج التعامل نهج بسيط مصدر الرمز صناعة حمض الفوسفور المزيد..

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Word Discriminations for Vocabulary Inventory Prediction

تمييز الكلمات لتنبؤ المخزون المفردات

Ask ChatGPT about the research

Read More

suggested questions