Research papers, master and doctoral theses about Specific

Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics

813 - Association for Computation Linguistics 2021 مقالة

Much of recent progress in NLU was shown to be due to models' learning dataset-specific heuristics. We conduct a case study of generalization in NLI (from MNLI to the adversarially constructed HANS dataset) in a range of BERT-based architectures (ada pters, Siamese Transformers, HEX debiasing), as well as with subsampling the data and increasing the model size. We report 2 successful and 3 unsuccessful strategies, all providing insights into how Transformer-based models learn to generalize.

simple heuristics generalization in nli learning dataset-specific heuristics الاستدلال البسيطة التعميم في NLI. التعلم الاستدلال محددات البيانات صناعة حمض الفوسفور المزيد..

A Thorough Evaluation of Task-Specific Pretraining for Summarization

472 - Association for Computation Linguistics 2021 مقالة

Task-agnostic pretraining objectives like masked language models or corrupted span prediction are applicable to a wide range of NLP downstream tasks (Raffel et al.,2019), but are outperformed by task-specific pretraining objectives like predicting ex tracted gap sentences on summarization (Zhang et al.,2020). We compare three summarization specific pretraining objectives with the task agnostic corrupted span prediction pretraining in controlled study. We also extend our study to a low resource and zero shot setup, to understand how many training examples are needed in order to ablate the task-specific pretraining without quality loss. Our results show that task-agnostic pretraining is sufficient for most cases which hopefully reduces the need for costly task-specific pretraining. We also report new state-of-the-art number for two summarization task using a T5 model with 11 billion parameters and an optimal beam search length penalty.

task-specific pretraining pretraining objectives احتجاج المهام الخاصة أهداف الاحتجاج صناعة حمض الفوسفور

TEBNER: Domain Specific Named Entity Recognition with Type Expanded Boundary-aware Network

571 - Association for Computation Linguistics 2021 مقالة

To alleviate label scarcity in Named Entity Recognition (NER) task, distantly supervised NER methods are widely applied to automatically label data and identify entities. Although the human effort is reduced, the generated incomplete and noisy annota tions pose new challenges for learning effective neural models. In this paper, we propose a novel dictionary extension method which extracts new entities through the type expanded model. Moreover, we design a multi-granularity boundary-aware network which detects entity boundaries from both local and global perspectives. We conduct experiments on different types of datasets, the results show that our model outperforms previous state-of-the-art distantly supervised systems and even surpasses the supervised models.

domain specific named specific named entity المجال المحدد اسمه كيان اسمي محدد صناعة حمض الفوسفور

Domain-Specific Japanese ELECTRA Model Using a Small Corpus

622 - Association for Computation Linguistics 2021 مقالة

Recently, domain shift, which affects accuracy due to differences in data between source and target domains, has become a serious issue when using machine learning methods to solve natural language processing tasks. With additional pretraining and fi ne-tuning using a target domain corpus, pretraining models such as BERT (Bidirectional Encoder Representations from Transformers) can address this issue. However, the additional pretraining of the BERT model is difficult because it requires significant computing resources. The efficiently learning an encoder that classifies token replacements accurately (ELECTRA) pretraining model replaces the BERT pretraining method's masked language modeling with a method called replaced token detection, which improves the computational efficiency and allows the additional pretraining of the model to a practical extent. Herein, we propose a method for addressing the computational efficiency of pretraining models in domain shift by constructing an ELECTRA pretraining model on a Japanese dataset and additional pretraining this model in a downstream task using a corpus from the target domain. We constructed a pretraining model for ELECTRA in Japanese and conducted experiments on a document classification task using data from Japanese news articles. Results show that even a model smaller than the pretrained model performs equally well.

domain-specific japanese electra small corpus bidirectional encoder representations إلكترا اليابانية خاصة بالمجال كوربوس صغيرة تمثيل التشفير ثنائي الاتجاه صناعة حمض الفوسفور المزيد..

Domain Adaptation for Hindi-Telugu Machine Translation Using Domain Specific Back Translation

680 - Association for Computation Linguistics 2021 مقالة

In this paper, we present a novel approachfor domain adaptation in Neural MachineTranslation which aims to improve thetranslation quality over a new domain.Adapting new domains is a highly challeng-ing task for Neural Machine Translation onlimited da ta, it becomes even more diffi-cult for technical domains such as Chem-istry and Artificial Intelligence due to spe-cific terminology, etc. We propose DomainSpecific Back Translation method whichuses available monolingual data and gen-erates synthetic data in a different way.This approach uses Out Of Domain words.The approach is very generic and can beapplied to any language pair for any domain. We conduct our experiments onChemistry and Artificial Intelligence do-mains for Hindi and Telugu in both direc-tions. It has been observed that the usageof synthetic data created by the proposedalgorithm improves the BLEU scores significantly.

specific back translation domain specific back hindi-telugu machine translation ترجمة محددة المجال الخاص مرة أخرى ترجمة آلة الهندية التيلجو صناعة حمض الفوسفور المزيد..

Exploiting Domain-Specific Knowledge for Judgment Prediction Is No Panacea

561 - Association for Computation Linguistics 2021 مقالة

Legal judgment prediction (LJP) usually consists in a text classification task aimed at predicting the verdict on the basis of the fact description. The literature shows that the use of articles as input features helps improve the classification perf ormance. In this work, we designed a verdict prediction task based on landlord-tenant disputes and we applied BERT-based models to which we fed different article-based features. Although the results obtained are consistent with the literature, the improvements with the articles are mostly obtained with the most frequent labels, suggesting that pre-trained and fine-tuned transformer-based models are not scalable as is for legal reasoning in real life scenarios as they would only excel in accurately predicting the most recurrent verdicts to the detriment of other legal outcomes.

exploiting domain-specific knowledge domain-specific knowledge legal judgment prediction استغلال المعرفة الخاصة بالمجال المعرفة الخاصة بالمجال التنبؤ الحكم القانوني صناعة حمض الفوسفور المزيد..

C3SL at SemEval-2021 Task 1: Predicting Lexical Complexity of Words in Specific Contexts with Sentence Embeddings

628 - Association for Computation Linguistics 2021 مقالة

We present our approach to predicting lexical complexity of words in specific contexts, as entered LCP Shared Task 1 at SemEval 2021. The approach consists of separating sentences into smaller chunks, embedding them with Sent2Vec, and reducing the em beddings into a simpler vector used as input to a neural network, the latter for predicting the complexity of words and expressions. Results show that the pre-trained sentence embeddings are not able to capture lexical complexity from the language when applied in cross-domain applications.

lcp shared task specific contexts المهمة المشتركة LCP سياقات محددة صناعة حمض الفوسفور

Schema-Guided Paradigm for Zero-Shot Dialog

602 - Association for Computation Linguistics 2021 مقالة

Developing mechanisms that flexibly adapt dialog systems to unseen tasks and domains is a major challenge in dialog research. Neural models implicitly memorize task-specific dialog policies from the training data. We posit that this implicit memoriza tion has precluded zero-shot transfer learning. To this end, we leverage the schema-guided paradigm, wherein the task-specific dialog policy is explicitly provided to the model. We introduce the Schema Attention Model (SAM) and improved schema representations for the STAR corpus. SAM obtains significant improvement in zero-shot settings, with a +22 F1 score improvement over prior work. These results validate the feasibility of zero-shot generalizability in dialog. Ablation experiments are also presented to demonstrate the efficacy of SAM.

dialog schema-guided paradigm task-specific dialog النموذج الموجه المخطط مربع حوار خاص صناعة حمض الفوسفور

Modality-specific Distillation

282 - Association for Computation Linguistics 2021 مقالة

Large neural networks are impractical to deploy on mobile devices due to their heavy computational cost and slow inference. Knowledge distillation (KD) is a technique to reduce the model size while retaining performance by transferring knowledge from a large teacher'' model to a smaller student'' model. However, KD on multimodal datasets such as vision-language datasets is relatively unexplored and digesting such multimodal information is challenging since different modalities present different types of information. In this paper, we propose modality-specific distillation (MSD) to effectively transfer knowledge from a teacher on multimodal datasets. Existing KD approaches can be applied to multimodal setup, but a student doesn't have access to modality-specific predictions. Our idea aims at mimicking a teacher's modality-specific predictions by introducing an auxiliary loss term for each modality. Because each modality has different importance for predictions, we also propose weighting approaches for the auxiliary losses; a meta-learning approach to learn the optimal weights on these loss terms. In our experiments, we demonstrate the effectiveness of our MSD and the weighting scheme and show that it achieves better performance than KD.

modality-specific distillation التقطير الخاص بالطريقة صناعة حمض الفوسفور

EstBERT: A Pretrained Language-Specific BERT for Estonian

574 - Association for Computation Linguistics 2021 مقالة

This paper presents EstBERT, a large pretrained transformer-based language-specific BERT model for Estonian. Recent work has evaluated multilingual BERT models on Estonian tasks and found them to outperform the baselines. Still, based on existing stu dies on other languages, a language-specific BERT model is expected to improve over the multilingual ones. We first describe the EstBERT pretraining process and then present the models' results based on the finetuned EstBERT for multiple NLP tasks, including POS and morphological tagging, dependency parsing, named entity recognition and text classification. The evaluation results show that the models based on EstBERT outperform multilingual BERT models on five tasks out of seven, providing further evidence towards a view that training language-specific BERT models are still useful, even when multilingual models are available.

language-specific bert pretrained language-specific bert language-specific bert model بيرت محددة باللغة برت لغة محددة مسبقا لغة بيرت الخاصة باللغة صناعة حمض الفوسفور المزيد..

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد