New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Watching a Language Model Learning Chess

مشاهدة نموذج اللغة تعلم الشطرنج

277 0 0 0.0 ( 0 )

Download Cite

Added by Association for Computation Linguistics مقالة

Publication date 2021

fields Artificial Intelligence

and research's language is English

Created by Shamra Editor

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

We analyse how a transformer-based language model learns the rules of chess from text data of recorded games. We show how it is possible to investigate how the model capacity and the available number of training data influence the learning success of a language model with the help of chess-specific metrics. With these metrics, we show that more games used for training in the studied range offers significantly better results for the same training time. However, model size does not show such a clear influence. It is also interesting to observe that the usual evaluation metrics for language models, predictive accuracy and perplexity, give no indication of this here. Further examination of trained models reveals how they store information about board state in the activations of neuron groups, and how the overall sequence of previous moves influences the newly-generated moves.

References used

https://aclanthology.org/

rate research

Empathetic BERT2BERT Conversational Model: Learning Arabic Language Generation with Little Data

789 - Association for Computation Linguistics 2021 مقالة

Enabling empathetic behavior in Arabic dialogue agents is an important aspect of building human-like conversational models. While Arabic Natural Language Processing has seen significant advances in Natural Language Understanding (NLU) with language m odels such as AraBERT, Natural Language Generation (NLG) remains a challenge. The shortcomings of NLG encoder-decoder models are primarily due to the lack of Arabic datasets suitable to train NLG models such as conversational agents. To overcome this issue, we propose a transformer-based encoder-decoder initialized with AraBERT parameters. By initializing the weights of the encoder and decoder with AraBERT pre-trained weights, our model was able to leverage knowledge transfer and boost performance in response generation. To enable empathy in our conversational model, we train it using the ArabicEmpatheticDialogues dataset and achieve high performance in empathetic response generation. Specifically, our model achieved a low perplexity value of 17.0 and an increase in 5 BLEU points compared to the previous state-of-the-art model. Also, our proposed model was rated highly by 85 human evaluators, validating its high capability in exhibiting empathy while generating relevant and fluent responses in open-domain settings.

learning arabic language arabic natural language تعلم اللغة العربية اللغة العربية الطبيعية صناعة حمض الفوسفور

Towards a Language Model for Temporal Commonsense Reasoning

508 - Association for Computation Linguistics 2021 مقالة

Temporal commonsense reasoning is a challenging task as it requires temporal knowledge usually not explicit in text. In this work, we propose an ensemble model for temporal commonsense reasoning. Our model relies on pre-trained contextual representat ions from transformer-based language models (i.e., BERT), and on a variety of training methods for enhancing model generalization: 1) multi-step fine-tuning using carefully selected auxiliary tasks and datasets, and 2) a specifically designed temporal masked language model task aimed to capture temporal commonsense knowledge. Our model greatly outperforms the standard fine-tuning approach and strong baselines on the MC-TACO dataset.

temporal commonsense reasoning commonsense reasoning temporal commonsense المنطق الزمني المنطقي المنطق المنطقي العمولة الزمنية صناعة حمض الفوسفور المزيد..

Language Resource Efficient Learning for Captioning

278 - Association for Computation Linguistics 2021 مقالة

Due to complex cognitive and inferential efforts involved in the manual generation of one caption per image/video input, the human annotation resources are very limited for captioning tasks. We define language resource efficient as reaching the same performance with fewer annotated captions per input. We first study the performance degradation of caption models in different language resource settings. Our analysis of caption models with SC loss shows that the performance degradation is caused by the increasingly noisy estimation of reward and baseline with fewer language resources. To mitigate this issue, we propose to reduce the variance of noise in the baseline by generalizing the single pairwise comparison in SC loss and using multiple generalized pairwise comparisons. The generalized pairwise comparison (GPC) measures the difference between the evaluation scores of two captions with respect to an input. Empirically, we show that the model trained with the proposed GPC loss is efficient on language resource and achieves similar performance with the state-of-the-art models on MSCOCO by using only half of the language resources. Furthermore, our model significantly outperforms the state-of-the-art models on a video caption dataset that has only one labeled caption per input in the training set.

resource efficient learning learning for captioning language resource efficient التعلم كفاءة الموارد تعلم التسمية التوضيحية موارد اللغة كفاءة صناعة حمض الفوسفور المزيد..

Language Model Pretraining and Transfer Learning for Very Low Resource Languages

332 - Association for Computation Linguistics 2021 مقالة

This paper describes our submission for the shared task on Unsupervised MT and Very Low Resource Supervised MT at WMT 2021. We submitted systems for two language pairs: German ↔ Upper Sorbian (de ↔ hsb) and German-Lower Sorbian (de ↔ dsb). For de ↔ h sb, we pretrain our system using MASS (Masked Sequence to Sequence) objective and then finetune using iterative back-translation. Final finetunng is performed using the parallel data provided for translation objective. For de ↔ dsb, no parallel data is provided in the task, we use final de ↔ hsb model as initialization of the de ↔ dsb model and train it further using iterative back-translation, using the same vocabulary as used in the de ↔ hsb model.

السامية العليا low resource languages pretraining and transfer لغات الموارد المنخفضة محاكاة ونقل صناعة حمض الفوسفور

What Can a Generative Language Model Answer About a Passage?

585 - Association for Computation Linguistics 2021 مقالة

Generative language models trained on large, diverse corpora can answer questions about a passage by generating the most likely continuation of the passage followed by a question/answer pair. However, accuracy rates vary depending on the type of ques tion asked. In this paper we keep the passage fixed, and test with a wide variety of question types, exploring the strengths and weaknesses of the GPT-3 language model. We provide the passage and test questions as a challenge set for other language models.

generative language model language model answer generative language نموذج اللغة التوليدية إجابة نموذج اللغة لغة توليدية صناعة حمض الفوسفور المزيد..

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Watching a Language Model Learning Chess

مشاهدة نموذج اللغة تعلم الشطرنج

Ask ChatGPT about the research

Read More

suggested questions