Recent progress in language modeling has been driven not only by advances in neural architectures but also by improvements in hardware and optimization. In this paper, we revisit the neural probabilistic language model (NPLM) of Bengio et al. (2003), which simply concatenates word embeddings within a fixed window and passes the result through a feed-forward network to predict the next word. When scaled up to modern hardware, this model (despite its many limitations) performs much better than expected on word-level language model benchmarks. Our analysis reveals that the NPLM achieves lower perplexity than a baseline Transformer with short input contexts but struggles to handle long-term dependencies. Inspired by this result, we modify the Transformer by replacing its first self-attention layer with the NPLM's local concatenation layer, which results in small but consistent perplexity decreases across three word-level language modeling datasets.
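To make the recipe above concrete, here is a minimal sketch of the window-concatenation NPLM the abstract describes: embed a fixed window of preceding words, concatenate the embeddings, and score the next word with a feed-forward network. The class name and all sizes (window, d_emb, d_hid) are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch of a Bengio-style NPLM: concatenate a fixed window of
# word embeddings, then predict the next word with a feed-forward net.
import torch
import torch.nn as nn

class NPLM(nn.Module):
    def __init__(self, vocab_size: int, window: int = 4,
                 d_emb: int = 128, d_hid: int = 512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_emb)
        self.ff = nn.Sequential(
            nn.Linear(window * d_emb, d_hid),  # operates on the concatenation
            nn.ReLU(),
            nn.Linear(d_hid, vocab_size),      # next-word logits
        )

    def forward(self, context: torch.Tensor) -> torch.Tensor:
        # context: (batch, window) indices of the preceding words
        e = self.embed(context)          # (batch, window, d_emb)
        flat = e.flatten(start_dim=1)    # concatenate embeddings in the window
        return self.ff(flat)             # (batch, vocab_size)

# Usage: score the next word given a 4-word context.
model = NPLM(vocab_size=10_000)
logits = model(torch.randint(0, 10_000, (2, 4)))
print(logits.shape)  # torch.Size([2, 10000])
```

The hybrid model in the abstract swaps a Transformer's first self-attention layer for this kind of local concatenation; the sketch shows only the concatenation-and-feed-forward step itself.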
In this work, we propose a new language modeling paradigm that can perform both prediction and moderation of information flow at multiple granularities: neural lattice language models. These models construct a lattice of possible paths.
Domain adaptation has been well studied in supervised neural machine translation (SNMT). However, it has not been well studied for unsupervised neural machine translation (UNMT), although UNMT has recently achieved remarkable results in several domains.
Previous studies show that pre-trained masked language models (MLMs) such as BERT can achieve competitive factual knowledge extraction performance on some datasets, indicating that MLMs can potentially be a reliable knowledge source.
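As a concrete illustration of using an MLM as a knowledge source, the sketch below queries BERT through the Hugging Face fill-mask pipeline with a cloze-style prompt and reads off the top predictions for the masked slot. The prompt and model choice are illustrative assumptions, not the evaluation setup of any particular paper.

```python
# Minimal sketch of cloze-style factual knowledge extraction from an MLM.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")
# Query a factual relation by masking the object slot (prompt is illustrative).
for pred in unmasker("The capital of France is [MASK].", top_k=3):
    print(f"{pred['token_str']:>10}  {pred['score']:.3f}")
```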
We propose the Diverse Embedding Neural Network (DENN), a novel architecture for language models (LMs). A DENN LM projects the input word-history vector onto multiple diverse low-dimensional sub-spaces instead of a single higher-dimensional sub-space, as in conventional LMs.
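A minimal sketch of the projection scheme this abstract describes: the same word-history vector is mapped through K separate low-dimensional projections rather than one wide hidden layer. The class name DENNHead, the tanh-plus-concatenation combination, the absence of an explicit diversity term, and all sizes are assumptions for illustration, not the paper's exact formulation.

```python
# Minimal sketch: project one history vector onto K low-dimensional sub-spaces.
import torch
import torch.nn as nn

class DENNHead(nn.Module):
    def __init__(self, d_hist: int, vocab_size: int,
                 n_subspaces: int = 4, d_low: int = 32):
        super().__init__()
        # K separate low-dimensional projections of the same history vector.
        self.projs = nn.ModuleList(
            [nn.Linear(d_hist, d_low) for _ in range(n_subspaces)]
        )
        self.out = nn.Linear(n_subspaces * d_low, vocab_size)

    def forward(self, hist: torch.Tensor) -> torch.Tensor:
        # hist: (batch, d_hist) word-history representation
        subs = [torch.tanh(p(hist)) for p in self.projs]
        return self.out(torch.cat(subs, dim=-1))  # (batch, vocab_size)

head = DENNHead(d_hist=256, vocab_size=10_000)
print(head(torch.randn(2, 256)).shape)  # torch.Size([2, 10000])
```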
Saliency methods are widely used to interpret neural network predictions, but different variants of saliency methods often disagree even on the interpretations of the same prediction made by the same model. In these cases, how do we identify when these interpretations are trustworthy?
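The kind of disagreement at issue shows up even in a toy setting: for the same linear prediction, plain gradient saliency and gradient-times-input saliency can rank the two input features in opposite orders. The model and inputs below are illustrative only.

```python
# Toy example: two saliency variants disagree on the same prediction.
import torch

w = torch.tensor([3.0, -1.0])            # fixed linear "model": y = w @ x
x = torch.tensor([0.1, 2.0], requires_grad=True)
y = w @ x
y.backward()

grad = x.grad                            # gradient saliency: |dy/dx| = |w|
grad_x_input = x.grad * x                # gradient-times-input saliency
print("gradient ranks feature", int(grad.abs().argmax()), "highest")            # 0
print("grad*input ranks feature", int(grad_x_input.abs().argmax()), "highest")  # 1
```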