
Explore Better Relative Position Embeddings from Encoding Perspective for Transformer Models



Added by Association for Computational Linguistics (article)

Publication date: 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor


Abstract

Relative position embedding (RPE) is a successful method to explicitly and efficaciously encode position information into Transformer models. In this paper, we investigate the potential problems in Shaw-RPE and XL-RPE, which are the most representative and prevalent RPEs, and propose two novel RPEs called Low-level Fine-grained High-level Coarse-grained (LFHC) RPE and Gaussian Cumulative Distribution Function (GCDF) RPE. LFHC-RPE is an improvement of Shaw-RPE, which enhances the perception ability at medium and long relative positions. GCDF-RPE utilizes the excellent properties of the Gaussian function to amend the prior encoding mechanism in XL-RPE. Experimental results on nine authoritative datasets demonstrate the effectiveness of our methods empirically. Furthermore, GCDF-RPE achieves the best overall performance among five different RPEs.
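For intuition about the Shaw-style baseline mentioned in the abstract, the following is a minimal Python sketch of clipped relative position indexing as it is commonly described for Shaw-style RPE: the offset between a query position and a key position is clipped to a fixed window and then used to look up a learned embedding. The function name and the `max_distance` parameter are illustrative assumptions, not taken from the paper, and the sketch does not implement LFHC-RPE or GCDF-RPE, whose formulas are not given on this page.

```python
def clipped_relative_positions(seq_len, max_distance):
    """Build a seq_len x seq_len matrix of embedding-table indices.

    Entry [i][j] is the offset (j - i), clipped to
    [-max_distance, max_distance] and shifted to the range
    [0, 2 * max_distance] so it can index a table of learned embeddings.
    """
    matrix = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            offset = j - i
            # All offsets beyond the window collapse onto the boundary value.
            clipped = max(-max_distance, min(max_distance, offset))
            row.append(clipped + max_distance)
        matrix.append(row)
    return matrix


indices = clipped_relative_positions(seq_len=4, max_distance=2)
for row in indices:
    print(row)
```

In this sketch every offset beyond `max_distance` shares one index, so distant positions become indistinguishable. That loss of resolution is plausibly the kind of limitation at medium and long relative positions that the abstract says LFHC-RPE is meant to address.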

References used

https://aclanthology.org/


Modeling Graph Structure via Relative Position for Text Generation from Knowledge Graphs

Association for Computational Linguistics, 2021 (article)

We present Graformer, a novel Transformer-based encoder-decoder architecture for graph-to-text generation. With our novel graph self-attention, the encoding of a node relies on all nodes in the input graph - not only direct neighbors - facilitating the detection of global patterns. We represent the relation between two nodes as the length of the shortest path between them. Graformer learns to weight these node-node relations differently for different attention heads, thus virtually learning differently connected views of the input graph. We evaluate Graformer on two popular graph-to-text generation benchmarks, AGENDA and WebNLG, where it achieves strong performance while using many fewer parameters than other approaches.
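The shortest-path relation described in this abstract can be illustrated with a small Python sketch that computes all-pairs shortest path lengths by breadth-first search. This is only a sketch under stated assumptions: the graph is treated as unweighted and undirected, the adjacency-dictionary format and the function name are hypothetical, and how Graformer itself handles edge direction or unreachable nodes is not described on this page.

```python
from collections import deque


def shortest_path_lengths(adjacency):
    """Return {source: {target: hops}} for an unweighted graph.

    `adjacency` maps each node to a list of its neighbors.
    Unreachable targets are simply absent from the inner dictionary.
    """
    distances = {}
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        distances[source] = dist
    return distances


# Hypothetical chain graph: a - b - c - d
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
relations = shortest_path_lengths(graph)
print(relations["a"]["d"])  # 3 hops along the chain
```

A matrix of such path lengths could then play the role that token offsets play in sequence models, with each attention head free to weight the distances differently.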


Exploring Structural Encoding for Data-to-Text Generation

Association for Computational Linguistics, 2021 (article)

Due to efficient end-to-end training and fluency in generated texts, several encoder-decoder framework-based models have recently been proposed for data-to-text generation. Appropriate encoding of input data is a crucial part of such encoder-decoder models. However, only a few research works have concentrated on proper encoding methods. This paper presents a novel encoder-decoder based data-to-text generation model where the proposed encoder carefully encodes input data according to the underlying structure of the data. The effectiveness of the proposed encoder is evaluated both extrinsically and intrinsically by shuffling input data without changing the meaning of that data. For selecting appropriate content information in encoded data from the encoder, the proposed model incorporates attention gates in the decoder. With extensive experiments on the WikiBio and E2E datasets, we show that our model outperforms the state-of-the-art models and several standard baseline systems. Analysis of the model through component ablation tests and human evaluation endorses the proposed model as a well-grounded system.


Towards Incremental Transformers: An Empirical Analysis of Transformer Models for Incremental NLU

Association for Computational Linguistics, 2021 (article)

Incremental processing allows interactive systems to respond based on partial inputs, which is a desirable property e.g. in dialogue agents. The currently popular Transformer architecture inherently processes sequences as a whole, abstracting away the notion of time. Recent work attempts to apply Transformers incrementally via restart-incrementality by repeatedly feeding, to an unchanged model, increasingly longer input prefixes to produce partial outputs. However, this approach is computationally costly and does not scale efficiently for long sequences. In parallel, we witness efforts to make Transformers more efficient, e.g. the Linear Transformer (LT) with a recurrence mechanism. In this work, we examine the feasibility of LT for incremental NLU in English. Our results show that the recurrent LT model has better incremental performance and faster inference speed compared to the standard Transformer and LT with restart-incrementality, at the cost of part of the non-incremental (full sequence) quality. We show that the performance drop can be mitigated by training the model to wait for right context before committing to an output and that training with input prefixes is beneficial for delivering correct partial outputs.
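Restart-incrementality, as described in this abstract, can be sketched in a few lines of Python: the same unchanged model is called again on every longer prefix. The `toy_tagger` below is a hypothetical stand-in for a real sequence labeler, and the function names are assumptions made for illustration; the sketch also counts how many tokens are processed in total, which hints at why the abstract calls the approach costly for long sequences.

```python
def restart_incremental(tokens, model):
    """Run `model` from scratch on each prefix tokens[:1], tokens[:2], ...

    Returns the list of partial outputs and the total number of
    tokens the model had to process across all restarts.
    """
    partial_outputs = []
    tokens_processed = 0
    for end in range(1, len(tokens) + 1):
        prefix = tokens[:end]
        partial_outputs.append(model(prefix))
        tokens_processed += len(prefix)
    return partial_outputs, tokens_processed


def toy_tagger(prefix):
    """Hypothetical model: labels each token with its uppercase form."""
    return [token.upper() for token in prefix]


outputs, cost = restart_incremental(["book", "a", "flight"], toy_tagger)
print(outputs[-1])  # labels for the full sequence
print(cost)         # 1 + 2 + 3 = 6 tokens processed in total
```

For a sequence of n tokens the loop processes n(n + 1)/2 tokens in total, so the work grows quadratically in the number of model inputs even before the cost of attention inside each call is considered.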


Machine Translation Post-Editing (MTPE) from the Perspective of Translation Trainees: Implications for Translation Pedagogy

Association for Computational Linguistics, 2021 (article)

This paper introduces data on translation trainees' perceptions of the MTPE process and its implications for training in this field. The study analyses trainees' performance on three MTPE tasks in the English-Polish language pair, together with post-task interviews, to determine the need to promote machine translation post-editing skills in the education of translation students. Since very little information concerning MTPE training is available, this study may prove useful.


Transformer with Syntactic Position Encoding for Machine Translation

Association for Computational Linguistics, 2021 (article)

It has been widely recognized that syntax information can help end-to-end neural machine translation (NMT) systems to achieve better translation. In order to integrate dependency information into Transformer-based NMT, existing approaches either exploit words' local head-dependent relations, ignoring their non-local neighbors carrying important context; or approximate two words' syntactic relation by their relative distance on the dependency tree, sacrificing exactness. To address these issues, we propose global positional encoding for dependency trees, a new scheme that facilitates syntactic relation modeling between any two words while preserving exactness and without an immediate-neighbor constraint. Experimental results on NC11 German→English, English→German and WMT English→German datasets show that our approach is more effective than the above two strategies. In addition, our experiments quantitatively show that, compared with higher layers, lower layers of the model are more suitable places to incorporate syntax information in terms of each layer's preference for the syntactic pattern and the final performance.
