
Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers


Publication date: 2021
Language: English





Transformers have shown improved performance over previous architectures for sequence processing such as RNNs. Despite their sizeable performance gains, however, as has recently been noted, these models are computationally expensive to train and carry a high parameter budget. In light of this, we explore parameter-sharing methods in Transformers, with a specific focus on generative models. We analyze different parameter sharing/reduction methods and develop the Subformer. Our model combines sandwich-style parameter sharing, which overcomes the limitations of naive cross-layer parameter sharing in generative models, with self-attentive embedding factorization (SAFE). Experiments on machine translation, abstractive summarization, and language modeling show that the Subformer can outperform the Transformer even when using significantly fewer parameters.
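The two ideas named in the abstract lend themselves to a short illustration. Below is a minimal PyTorch sketch, assuming encoder layers and illustrative sizes for brevity (the paper targets generative, decoder-side models); the sandwich placement of shared weights follows the abstract, while the plain linear up-projection in the embedding factorization is a simplification of SAFE's self-attentive projection.

    import torch.nn as nn

    class SandwichSharedStack(nn.Module):
        """Sandwich-style parameter sharing: the first and last layers keep
        private weights, while every middle layer reuses one shared layer."""
        def __init__(self, d_model=512, nhead=8, num_layers=6):
            super().__init__()
            new_layer = lambda: nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            self.first = new_layer()
            self.shared = new_layer()   # one parameter set, applied num_layers-2 times
            self.last = new_layer()
            self.num_middle = num_layers - 2

        def forward(self, x):
            x = self.first(x)
            for _ in range(self.num_middle):
                x = self.shared(x)      # same weights reused at each middle step
            return self.last(x)

    class FactorizedEmbedding(nn.Module):
        """Factorized embeddings: tokens live in a small dimension and are
        projected up, shrinking the |V| x d_model table to |V| x d_small.
        SAFE performs this projection with a small self-attention block;
        a linear map is shown here as a simplification."""
        def __init__(self, vocab_size=32000, d_small=128, d_model=512):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_small)
            self.project_up = nn.Linear(d_small, d_model)

        def forward(self, token_ids):
            return self.project_up(self.embed(token_ids))

With these illustrative sizes, the embedding table drops from 32000 x 512 to 32000 x 128 parameters plus a 128 x 512 projection, roughly a 4x reduction on the embeddings alone, and the six-layer stack stores only three distinct layers' worth of weights.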

Related research

Template filling is generally tackled by a pipeline of two separate supervised systems -- one for role-filler extraction and another for template/event recognition. Since pipelines consider events in isolation, they can suffer from error propagation. We introduce a framework based on end-to-end generative transformers for this task (i.e., GTT). It naturally models the dependence between entities both within a single event and across the multiple events described in a document. Experiments demonstrate that this framework substantially outperforms pipeline-based approaches, and other neural end-to-end baselines that do not model between-event dependencies. We further show that our framework specifically improves performance on documents containing multiple events.
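As a rough illustration of the general recipe this abstract describes (document in, linearized filled template out), here is a hedged sketch using a generic pretrained encoder-decoder from Hugging Face; the example document and output format are made up for illustration, and GTT's actual output linearization and decoding scheme differ (see the paper).

    from transformers import BartForConditionalGeneration, BartTokenizer

    # Illustrative only: template filling cast as sequence generation.
    # GTT's real output linearization and decoding constraints differ.
    tok = BartTokenizer.from_pretrained("facebook/bart-base")
    model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

    doc = "A car bomb exploded outside the embassy on Tuesday, wounding two guards."
    inputs = tok(doc, return_tensors="pt", truncation=True)
    out = model.generate(**inputs, max_length=64)

    # After task-specific fine-tuning, the decoded string would be a linearized
    # template, e.g. "[ATTACK] perpetrator: <none> | target: embassy | victim: guards"
    print(tok.decode(out[0], skip_special_tokens=True))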
In production sharing agreements, when a commercial quantity of oil is discovered, the investing company has the right to recover the costs it incurred during the exploration, development, and production phases by drawing on the proceeds of a specified percentage of each period's production. Because views on the appropriate accounting treatment for capitalizing and recovering costs under production sharing agreements differ, the accounting policies for processing and recovering costs varied among the foreign petroleum companies investing in Syria. The aim of this research is to present the different accounting treatments applied to cost recovery in oil and gas companies under production sharing agreements and to determine their impact on the amount of capitalized costs and on income, through an applied study of the accounting treatments in SIPC and CNPC, two foreign petroleum companies investing in Syria; SIPC's actual accounting figures were restated to conform to CNPC's accounting treatments. The research concludes that the accounting treatment of processing and recovering costs has a significant and material impact on income and on the amount of capitalized costs, and it recommends recognizing the proceeds of cost recovery as oil revenues rather than as a recovery (amortization) of capitalized costs.
There are common semantics shared across text and images. Given a sentence in a source language, can depicting its visual scene help translation into a target language? Existing multimodal neural machine translation (MNMT) methods require triplets of bilingual sentence pairs and images for training, and tuples of source sentence and image for inference. In this paper, we propose ImagiT, a novel machine translation method via visual imagination. ImagiT first learns to generate a visual representation from the source sentence, and then utilizes both the source sentence and the imagined representation to produce a target translation. Unlike previous methods, it only needs the source sentence at inference time. Experiments demonstrate that ImagiT benefits from visual imagination and significantly outperforms text-only neural machine translation baselines. Further analysis reveals that the imagination process in ImagiT helps fill in missing information when performing the degradation strategy.
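A toy sketch of the data flow described here, assuming illustrative sizes, a single "imagined" vector, and fusion by concatenation along the sequence axis; ImagiT itself uses a GAN-style generator for the visual features, so everything below approximates the idea rather than the paper's architecture.

    import torch
    import torch.nn as nn

    class ImaginationMT(nn.Module):
        """Translate from text alone at inference, but first 'imagine' a
        visual representation from the source and feed both to the decoder."""
        def __init__(self, src_vocab=8000, tgt_vocab=8000, d=256):
            super().__init__()
            self.src_emb = nn.Embedding(src_vocab, d)
            self.tgt_emb = nn.Embedding(tgt_vocab, d)
            self.encoder = nn.TransformerEncoder(
                nn.TransformerEncoderLayer(d, 4, batch_first=True), 2)
            self.imagine = nn.Sequential(nn.Linear(d, d), nn.Tanh())  # text -> "visual"
            self.decoder = nn.TransformerDecoder(
                nn.TransformerDecoderLayer(d, 4, batch_first=True), 2)
            self.out = nn.Linear(d, tgt_vocab)

        def forward(self, src_ids, tgt_ids):
            memory = self.encoder(self.src_emb(src_ids))                   # (B, S, d)
            imagined = self.imagine(memory.mean(dim=1, keepdim=True))      # (B, 1, d)
            memory = torch.cat([memory, imagined], dim=1)                  # text + "image"
            return self.out(self.decoder(self.tgt_emb(tgt_ids), memory))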
Emotion detection from social media posts has attracted noticeable attention from the natural language processing (NLP) community in recent years. The ways of obtaining gold labels for training and testing automatic emotion detection systems differ significantly from one study to another, raising the question of the reliability of the gold labels and of the resulting classification results. This study systematically explores several ways of obtaining gold labels for Ekman's emotion model on Twitter data and the influence of the chosen strategy on manual classification results.
To date, most abstractive summarisation models have relied on variants of the negative log-likelihood (NLL) as their training objective. In some cases, reinforcement learning has been added to train the models with an objective that is closer to their evaluation measures (e.g. ROUGE). However, the reward function used within the reinforcement learning approach can play a key role in performance and is still partially unexplored. For this reason, in this paper we propose two reward functions for the task of abstractive summarisation: the first, referred to as RwB-Hinge, dynamically selects the samples for the gradient update; the second, nicknamed RISK, leverages a small pool of strong candidates to inform the reward. In the experiments, we probe the proposed approach by fine-tuning an NLL pre-trained model over nine summarisation datasets of diverse size and nature. The experimental results show a consistent improvement over the negative log-likelihood baselines.
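The abstract does not define the two reward functions, so the sketch below only shows the generic family the first one's name points to: REINFORCE-with-baseline fine-tuning where a hinge keeps gradient updates only for samples whose reward beats the baseline. The exact selection rule is defined in the paper, and the helper names in the usage comment are hypothetical.

    import torch

    def rl_loss(sample_logprobs, sample_rewards, baseline_rewards):
        """REINFORCE-with-baseline with a hinge-style filter: sampled
        summaries whose reward does not beat the greedy baseline
        contribute no gradient."""
        advantage = sample_rewards - baseline_rewards       # (B,)
        advantage = torch.clamp(advantage, min=0.0)         # hinge: keep winners only
        return -(advantage.detach() * sample_logprobs).mean()

    # Usage, with hypothetical helpers sample_summaries / greedy_summaries / rouge:
    # logp, sampled = sample_summaries(model, batch)        # sampled decoding
    # greedy = greedy_summaries(model, batch)               # baseline decoding
    # loss = rl_loss(logp, rouge(sampled, refs), rouge(greedy, refs))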
