Classifying Divergences in Cross-lingual AMR Pairs

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Abstract

Translation divergences are varied and widespread, challenging approaches that rely on parallel text. To annotate translation divergences, we propose a schema grounded in the Abstract Meaning Representation (AMR), a sentence-level semantic framework instantiated for a number of languages. By comparing parallel AMR graphs, we can identify specific points of divergence. Each divergence is labeled with both a type and a cause. We release a small corpus of annotated English-Spanish data, and analyze the annotations in our corpus.
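
As a rough illustration of what comparing parallel AMR graphs might look like, the sketch below treats each graph as a set of (head, role, dependent) triples and reports the triples that appear on only one side. The function name `divergent_triples`, the hand-written concept alignment, and the toy English and Spanish graphs are all assumptions made for this example. They are not taken from the paper or its corpus, and the sketch does not assign the type and cause labels that the schema describes.

```python
from typing import Dict, Set, Tuple

# One AMR edge: (head concept, role, dependent concept).
Triple = Tuple[str, str, str]


def divergent_triples(
    en: Set[Triple], es: Set[Triple], alignment: Dict[str, str]
) -> Tuple[Set[Triple], Set[Triple]]:
    """Return (triples only in English, triples only in Spanish).

    Spanish concepts are first mapped to English ones through a
    hypothetical, hand-written alignment; unmapped concepts are kept as-is.
    """
    mapped = {(alignment.get(h, h), r, alignment.get(d, d)) for h, r, d in es}
    return en - mapped, mapped - en


# Toy graphs written by hand for illustration only.
english = {("give-01", ":ARG0", "she"), ("give-01", ":ARG1", "book")}
spanish = {
    ("dar-01", ":ARG0", "ella"),
    ("dar-01", ":ARG1", "libro"),
    ("libro", ":mod", "viejo"),  # extra modifier present only on the Spanish side
}
alignment = {"dar-01": "give-01", "ella": "she", "libro": "book", "viejo": "old"}

only_en, only_es = divergent_triples(english, spanish, alignment)
print("Only in English:", only_en)
print("Only in Spanish:", only_es)
```

Each triple left over on either side marks a candidate point of divergence that an annotator could then inspect and label.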

References used

https://aclanthology.org/

Translate, then Parse! A Strong Baseline for Cross-Lingual AMR Parsing

Association for Computational Linguistics, 2021 (article)

In cross-lingual Abstract Meaning Representation (AMR) parsing, researchers develop models that project sentences from various languages onto their AMRs to capture their essential semantic structures: given a sentence in any language, we aim to capture its core semantic content through concepts connected by manifold types of semantic relations. Methods typically leverage large silver training data to learn a single model that is able to project non-English sentences to AMRs. However, we find that a simple baseline tends to be overlooked: translating the sentences to English and projecting their AMR with a monolingual AMR parser (translate+parse, T+P). In this paper, we revisit this simple two-step baseline, and enhance it with a strong NMT system and a strong AMR parser. Our experiments show that T+P outperforms a recent state-of-the-art system across all tested languages: German, Italian, Spanish and Mandarin with +14.6, +12.6, +14.3 and +16.0 Smatch points.

Model Selection for Cross-lingual Transfer

Association for Computational Linguistics, 2021 (article)

Transformers that are pre-trained on multilingual corpora, such as mBERT and XLM-RoBERTa, have achieved impressive cross-lingual transfer capabilities. In the zero-shot transfer setting, only English training data is used, and the fine-tuned model is evaluated on another target language. While this works surprisingly well, substantial variance has been observed in target language performance between different fine-tuning runs, and in the zero-shot setup, no target-language development data is available to select among multiple fine-tuned models. Prior work has relied on English dev data to select among models that are fine-tuned with different learning rates, number of steps and other hyperparameters, often resulting in suboptimal choices. In this paper, we show that it is possible to select consistently better models when small amounts of annotated data are available in auxiliary pivot languages. We propose a machine learning approach to model selection that uses the fine-tuned model's own internal representations to predict its cross-lingual capabilities. In extensive experiments we find that this method consistently selects better models than English validation data across twenty-five languages (including eight low-resource languages), and often achieves results that are comparable to model selection using target language development data.

CL-MoNoise: Cross-lingual Lexical Normalization

Association for Computational Linguistics, 2021 (article)

Social media is notoriously difficult to process for existing natural language processing tools, because of spelling errors, non-standard words, shortenings, non-standard capitalization and punctuation. One method to circumvent these issues is to normalize input data before processing. Most previous work has focused on only one language, which is mostly English. In this paper, we are the first to propose a model for cross-lingual normalization, with which we participate in the WNUT 2021 shared task. To this end, we use MoNoise as a starting point, and make a simple adaptation for cross-lingual application. Our proposed model outperforms the leave-as-is baseline provided by the organizers which copies the input. Furthermore, we explore a completely different model which converts the task to a sequence labeling task. Performance of this second system is low, as it does not take capitalization into account in our implementation.

Understanding Cross-Lingual Syntactic Transfer in Multilingual Recurrent Neural Networks

Association for Computational Linguistics, 2021 (article)

It is now established that modern neural language models can be successfully trained on multiple languages simultaneously without changes to the underlying architecture, providing an easy way to adapt a variety of NLP models to low-resource languages. But what kind of knowledge is really shared among languages within these models? Does multilingual training mostly lead to an alignment of the lexical representation spaces or does it also enable the sharing of purely grammatical knowledge? In this paper we dissect different forms of cross-lingual transfer and look for its most determining factors, using a variety of models and probing tasks. We find that exposing our LMs to a related language does not always increase grammatical knowledge in the target language, and that optimal conditions for lexical-semantic transfer may not be optimal for syntactic transfer.

Multilingual AMR Parsing with Noisy Knowledge Distillation

Association for Computational Linguistics, 2021 (article)

We study multilingual AMR parsing from the perspective of knowledge distillation, where the aim is to learn and improve a multilingual AMR parser by using an existing English parser as its teacher. We constrain our exploration in a strict multilingual setting: there is but one model to parse all different languages including English. We identify that noisy input and precise output are the key to successful distillation. Together with extensive pre-training, we obtain an AMR parser whose performances surpass all previously published results on four different foreign languages, including German, Spanish, Italian, and Chinese, by large margins (up to 18.8 Smatch points on Chinese and on average 11.3 Smatch points). Our parser also achieves comparable performance on English to the latest state-of-the-art English-only parser.
