
Do Long-Range Language Models Actually Use Long-Range Context?



Added by Association for Computational Linguistics - Article

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor



Language models are generally trained on short, truncated input sequences, which limits their ability to use discourse-level information present in long-range context to improve their predictions. Recent efforts to improve the efficiency of self-attention have led to a proliferation of long-range Transformer language models, which can process much longer sequences than models of the past. However, the ways in which such models take advantage of the long-range context remain unclear. In this paper, we perform a fine-grained analysis of two long-range Transformer language models (including the Routing Transformer, which achieves state-of-the-art perplexity on the PG-19 long-sequence LM benchmark dataset) that accept input sequences of up to 8K tokens. Our results reveal that providing long-range context (i.e., beyond the previous 2K tokens) to these models only improves their predictions on a small set of tokens (e.g., those that can be copied from the distant context) and does not help at all for sentence-level prediction tasks. Finally, we discover that PG-19 contains a variety of different document types and domains, and that long-range context helps most for literary novels (as opposed to textbooks or magazines).
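The kind of comparison the abstract describes can be sketched roughly as follows: score the same target tokens once with a short context and once with a long context, then check how much perplexity changes and which tokens account for the change. The log-probabilities below are made-up illustrative numbers, not results from the paper, and the function names and the `min_gain` threshold are assumptions introduced only for this sketch.

```python
import math


def perplexity(log_probs):
    """Perplexity from a list of natural-log token probabilities."""
    return math.exp(-sum(log_probs) / len(log_probs))


def tokens_helped_by_long_context(short_lp, long_lp, min_gain=0.5):
    """Indices of target tokens whose log-probability rises by at least
    `min_gain` nats when the longer context is supplied.
    `min_gain` is an arbitrary threshold chosen for illustration."""
    return [
        i
        for i, (s, l) in enumerate(zip(short_lp, long_lp))
        if l - s >= min_gain
    ]


# Hypothetical per-token log-probabilities for the same four target tokens,
# scored with a short context and with a long context.
short_lp = [math.log(0.25), math.log(0.5), math.log(0.01), math.log(0.5)]
long_lp = [math.log(0.25), math.log(0.5), math.log(0.64), math.log(0.5)]

ppl_short = perplexity(short_lp)
ppl_long = perplexity(long_lp)
helped = tokens_helped_by_long_context(short_lp, long_lp)

print(f"short-context perplexity: {ppl_short:.3f}")
print(f"long-context perplexity:  {ppl_long:.3f}")
print(f"tokens helped by long context: {helped}")
```

In this toy input the entire perplexity gain comes from a single token (for example, a name that could be copied from distant context), which is the pattern the abstract reports for long-range context.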

References used

https://aclanthology.org/


Long-Range Modeling of Source Code Files with eWASH: Extended Window Access by Syntax Hierarchy

Association for Computational Linguistics - 2021 - Article

Statistical language modeling and translation with transformers have found many successful applications in program understanding and generation tasks, setting high benchmarks for tools in modern software development environments. The finite context window of these neural models means, however, that they will be unable to leverage the entire relevant context of large files and packages for any given task. While there are many efforts to extend the context window, we introduce an architecture-independent approach for leveraging the syntactic hierarchies of source code for incorporating entire file-level context into a fixed-length window. Using concrete syntax trees of each source file, we extract syntactic hierarchies and integrate them into the context window by selectively removing from view more specific, less relevant scopes for a given task. We evaluate this approach on code generation tasks and joint translation of natural language and source code in the Python programming language, achieving a new state-of-the-art in code completion and summarization for Python in the CodeXGLUE benchmark. We also introduce new CodeXGLUE benchmarks for user-experience-motivated tasks: code completion with normalized literals, method body completion/code summarization conditioned on file-level context.
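The idea of removing more specific, less relevant scopes from view can be illustrated with a small sketch. This is not the eWASH implementation: it swaps in Python's standard `ast` module (an abstract syntax tree, not a concrete one), and the function name `collapse_other_scopes` and the "keep only the focus function's body" rule are assumptions made for illustration. It needs Python 3.9 or later for `ast.unparse`.

```python
import ast


def collapse_other_scopes(source: str, focus: str) -> str:
    """Return `source` with every function body except `focus` replaced by `...`.

    Signatures stay visible, so file-level structure survives while the
    detailed bodies of other scopes no longer consume context-window space.
    Limitation of this sketch: a focus function nested inside another
    function would be collapsed together with its parent.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        is_function = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_function and node.name != focus:
            # Replace the whole body with a single `...` expression.
            node.body = [ast.Expr(value=ast.Constant(value=Ellipsis))]
    return ast.unparse(tree)


# Hypothetical two-function file; `target` is the scope being completed.
example_source = '''
def helper(x):
    y = x * 2
    return y + 1

def target(a, b):
    return helper(a) + b
'''

compressed = collapse_other_scopes(example_source, focus="target")
print(compressed)
```

The printed result keeps the `helper` signature and the full `target` body, which is the general shape of fitting file-level context into a fixed-length window.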

extended window access, syntax hierarchy, extended window

Progressive Generation of Long Text with Pretrained Language Models

Association for Computational Linguistics - 2021 - Article

Large-scale language models (LMs) pretrained on massive corpora of text, such as GPT-2, are powerful open-domain text generators. However, as our systematic examination reveals, it is still challenging for such models to generate coherent long passages of text (e.g., 1000 tokens), especially when the models are fine-tuned to the target domain on a small corpus. Previous planning-then-generation methods also fall short of producing such long text in various domains. To overcome the limitations, we propose a simple but effective method of generating text in a progressive manner, inspired by generating images from low to high resolution. Our method first produces domain-specific content keywords and then progressively refines them into complete passages in multiple stages. The simple design allows our approach to take advantage of pretrained LMs at each stage and effectively adapt to any target domain given only a small set of examples. We conduct a comprehensive empirical study with a broad set of evaluation metrics, and show that our approach significantly improves upon the fine-tuned large LMs and various planning-then-generation methods in terms of quality and sample efficiency. Human evaluation also validates that our model generations are more coherent.

improving clarity, large-scale language models

A Study of QoS in Long Term Evolution Networks "LTE"

Al-Baath University - 2017 - Research paper

Long Term Evolution (LTE) is one of the most important recent communication technologies in the fourth generation (4G) of cellular communications. LTE supports high speeds and large bandwidth, which makes it a strong candidate for improving the Quality of Service (QoS) associated with specific types of data transfer. As a consequence, researchers have paid close attention to this type of network. Achieving a good level of QoS for all users remains a major challenge, however, because LTE provides audio and data transmission to users at the same time.

Throughput, Quality of Service, scheduling algorithms, LTE, cellular communications, TCP, Delay, Jitter, NS3, Mobile Communication

Memory and Knowledge Augmented Language Models for Inferring Salience in Long-Form Stories

Association for Computational Linguistics - 2021 - Article

Measuring event salience is essential in the understanding of stories. This paper takes a recent unsupervised method for salience detection derived from Barthes' Cardinal Functions and theories of surprise and applies it to longer narrative forms. We improve the standard transformer language model by incorporating an external knowledge base (derived from Retrieval Augmented Generation) and adding a memory mechanism to enhance performance on longer works. We use a novel approach to derive salience annotation using chapter-aligned summaries from the Shmoop corpus for classic literary works. Our evaluation against this data demonstrates that our salience detection model improves performance over a language model without the knowledge base and memory augmentation, both of which are crucial to this improvement.

knowledge augmented language, knowledge augmented, inferring salience

Improved Text Classification of Long-term Care Materials

Association for Computational Linguistics - 2021 - Article

Aging populations have posed a challenge to many countries, including Taiwan, and with them comes the issue of long-term care. Given this context, the aim of this study was to explore the most discussed subtopics in the field of long-term care and to identify its features through NLP. This study applied TF-IDF, the Logistic Regression model, and the Naive Bayes classifier to process the data. The results showed a best F1-score of 0.920 in identification and a best accuracy of 0.708 in classification. The results of this study could be used as a reference for future long-term care applications.

long-term care materials, improved text classification, improved text
