Smelting Gold and Silver for Improved Multilingual AMR-to-Text Generation

Added by: Association for Computational Linguistics (article)

Publication date: 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Recent work on multilingual AMR-to-text generation has exclusively focused on data augmentation strategies that utilize silver AMR. However, this assumes a high quality of generated AMRs, potentially limiting the transferability to the target task. In this paper, we investigate different techniques for automatically generating AMR annotations, where we aim to study which source of information yields better multilingual results. Our models trained on gold AMR with silver (machine translated) sentences outperform approaches which leverage generated silver AMR. We find that combining both complementary sources of information further improves multilingual AMR-to-text generation. Our models surpass the previous state of the art for German, Italian, Spanish, and Chinese by a large margin.

References used

https://aclanthology.org/

SAPPHIRE: Approaches for Enhanced Concept-to-Text Generation

Association for Computational Linguistics, 2021 (article)

We motivate and propose a suite of simple but effective improvements for concept-to-text generation called SAPPHIRE: Set Augmentation and Post-hoc PHrase Infilling and REcombination. We demonstrate their effectiveness on generative commonsense reasoning, a.k.a. the CommonGen task, through experiments using both BART and T5 models. Through extensive automatic and human evaluation, we show that SAPPHIRE noticeably improves model performance. An in-depth qualitative analysis illustrates that SAPPHIRE effectively addresses many issues of the baseline model generations, including lack of commonsense, insufficient specificity, and poor fluency.

Referenceless Parsing-Based Evaluation of AMR-to-English Generation

Association for Computational Linguistics, 2021 (article)

Reference-based automatic evaluation metrics are notoriously limited for NLG due to their inability to fully capture the range of possible outputs. We examine a referenceless alternative: evaluating the adequacy of English sentences generated from Abstract Meaning Representation (AMR) graphs by parsing into AMR and comparing the parse directly to the input. We find that the errors introduced by automatic AMR parsing substantially limit the effectiveness of this approach, but a manual editing study indicates that as parsing improves, parsing-based evaluation has the potential to outperform most reference-based metrics.

Efficient Multilingual Text Classification for Indian Languages

Association for Computational Linguistics, 2021 (article)

India is one of the richest language hubs on earth and is very diverse and multilingual. But apart from a few Indian languages, most are still considered resource poor. Most NLP techniques require either linguistic knowledge that only experts and native speakers of a language can provide, or a large amount of labelled data, which is expensive to produce, so text classification becomes challenging for most Indian languages. The main objective of this paper is to see how one can benefit from the lexical similarity found in Indian languages in a multilingual scenario. Can a classification model trained on one Indian language be reused for other Indian languages? We performed zero-shot text classification by exploiting lexical similarity, and we observed that our model performs best in those cases where the vocabulary overlap between the language datasets is greatest. Our experiments also confirm that a single multilingual model trained by exploiting language relatedness outperforms the baselines by significant margins.

mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer

Association for Computational Linguistics, 2021 (article)

The recent "Text-to-Text Transfer Transformer" (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on a wide variety of English-language NLP tasks. In this paper, we introduce mT5, a multilingual variant of T5 that was pre-trained on a new Common Crawl-based dataset covering 101 languages. We detail the design and modified training of mT5 and demonstrate its state-of-the-art performance on many multilingual benchmarks. We also describe a simple technique to prevent "accidental translation" in the zero-shot setting, where a generative model chooses to (partially) translate its prediction into the wrong language. All of the code and model checkpoints used in this work are publicly available.

Stacked AMR Parsing with Silver Data

Association for Computational Linguistics, 2021 (article)

A lack of sufficient human-annotated data is one main challenge for abstract meaning representation (AMR) parsing. To alleviate this problem, previous works usually make use of silver data or pre-trained language models. In particular, one recent seq-to-seq work directly fine-tunes an encoder-decoder pre-trained language model on AMR graph sequences and achieves new state-of-the-art results, outperforming previous works by a large margin. However, it makes decoding relatively slow. In this work, we investigate alternative approaches to achieve competitive performance at faster speeds. We propose a simplified AMR parser and a pre-training technique for the effective usage of silver data. We conduct extensive experiments on the widely used AMR2.0 dataset and the results demonstrate that our Transformer-based AMR parser achieves the best performance among the seq2graph-based models. Furthermore, with silver data, our model achieves competitive results with the SOTA model, and the speed is an order of magnitude faster. Detailed analyses are conducted to gain more insights into our proposed model and the effectiveness of the pre-training technique.
