
Adapting Entities across Languages and Cultures



Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor


Abstract

How would you explain Bill Gates to a German? He is associated with founding a company in the United States, so perhaps the German founder Carl Benz could stand in for Gates in those contexts. This type of translation is called adaptation in the translation community. Until now, this task has not been done computationally. Automatic adaptation could be used in natural language processing for machine translation and, indirectly, for generating new question-answering datasets and for education. We propose two automatic methods and compare them to human results for this novel NLP task. First, a structured knowledge base adapts named entities using their shared properties. Second, vector-arithmetic and orthogonal embedding mapping methods identify better candidates, but at the expense of interpretable features. We evaluate our methods through a new dataset of human adaptations.
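The vector-arithmetic idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the three-dimensional vectors, the `EMBEDDINGS` table, and the `adapt` function are hypothetical, chosen by hand so that the arithmetic is easy to follow. The sketch assumes an entity can be adapted by subtracting the source-culture vector, adding the target-culture vector, and picking the nearest remaining entity by cosine similarity.

```python
import numpy as np

# Hypothetical toy embeddings (hand-made, not from the paper).
# Dimensions loosely read as: [US-ness, Germany-ness, founder-vs-politician].
EMBEDDINGS = {
    "United States": np.array([1.0, 0.0, 0.0]),
    "Germany": np.array([0.0, 1.0, 0.0]),
    "Bill Gates": np.array([1.0, 0.0, 1.0]),
    "Barack Obama": np.array([1.0, 0.0, -1.0]),
    "Carl Benz": np.array([0.0, 1.0, 1.0]),
    "Angela Merkel": np.array([0.0, 1.0, -1.0]),
}


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def adapt(entity, source, target, embeddings=EMBEDDINGS):
    """Return the entity closest to (entity - source + target).

    The three input names are excluded from the candidates so the
    query cannot simply return one of its own terms.
    """
    query = embeddings[entity] - embeddings[source] + embeddings[target]
    excluded = {entity, source, target}
    scores = {
        name: cosine(query, vec)
        for name, vec in embeddings.items()
        if name not in excluded
    }
    return max(scores, key=scores.get)


print(adapt("Bill Gates", "United States", "Germany"))  # prints "Carl Benz"
```

With real embeddings the nearest neighbour is rarely this clean, and, as the abstract notes, the result offers no interpretable explanation of why a candidate was chosen, unlike the knowledge-base approach that relies on shared properties.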

References used

https://aclanthology.org/


Wikipedia Entities as Rendezvous across Languages: Grounding Multilingual Language Models by Predicting Wikipedia Hyperlinks

Association for Computational Linguistics, 2021 (article)

Masked language models have quickly become the de facto standard when processing text. Recently, several approaches have been proposed to further enrich word representations with external knowledge sources such as knowledge graphs. However, these models are devised and evaluated in a monolingual setting only. In this work, we propose a language-independent entity prediction task as an intermediate training procedure to ground word representations on entity semantics and bridge the gap across different languages by means of a shared vocabulary of entities. We show that our approach effectively injects new lexical-semantic knowledge into neural models, improving their performance on different semantic tasks in the zero-shot crosslingual setting. As an additional advantage, our intermediate training does not require any supplementary input, allowing our models to be applied to new datasets right away. In our experiments, we use Wikipedia articles in up to 100 languages and already observe consistent gains compared to strong baselines when predicting entities using only the English Wikipedia. Adding further languages leads to improvements in most tasks up to a certain point, but overall we found it non-trivial to scale improvements in model transferability by training on an ever-increasing number of Wikipedia languages.


Subword Mapping and Anchoring across Languages

Association for Computational Linguistics, 2021 (article)

State-of-the-art multilingual systems rely on shared vocabularies that sufficiently cover all considered languages. To this end, a simple and frequently used approach makes use of subword vocabularies constructed jointly over several languages. We hypothesize that such vocabularies are suboptimal due to false positives (identical subwords with different meanings across languages) and false negatives (different subwords with similar meanings). To address these issues, we propose Subword Mapping and Anchoring across Languages (SMALA), a method to construct bilingual subword vocabularies. SMALA extracts subword alignments using an unsupervised state-of-the-art mapping technique and uses them to create cross-lingual anchors based on subword similarities. We demonstrate the benefits of SMALA for cross-lingual natural language inference (XNLI), where it improves zero-shot transfer to an unseen language without task-specific data, solely by sharing subword embeddings. Moreover, in neural machine translation, we show that joint subword vocabularies obtained with SMALA lead to higher BLEU scores on sentences that contain many false positives and false negatives.
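The notions of false positives, false negatives, and similarity-based anchors can be made concrete with a small sketch. It is not the SMALA procedure itself: it assumes the two subword embedding spaces have already been mapped into a common space, and the vocabularies `EN` and `DE`, the vectors, the `find_anchors` function, and the 0.9 threshold are all hypothetical. The example uses the English and German subword "gift" (German "Gift" means poison) as a false positive.

```python
import numpy as np

# Hypothetical subword embeddings, assumed to be already aligned
# into one shared space (hand-made values, not from the paper).
EN = {
    "gift": np.array([1.0, 0.0, 0.0]),
    "poison": np.array([0.0, 1.0, 0.0]),
    "hand": np.array([0.0, 0.0, 1.0]),
}
DE = {
    "geschenk": np.array([1.0, 0.0, 0.0]),
    "gift": np.array([0.0, 1.0, 0.0]),
    "hand": np.array([0.0, 0.0, 1.0]),
}


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def find_anchors(src, tgt, threshold=0.9):
    """Map each source subword to its most similar target subword,
    keeping the pair only if the similarity reaches the threshold."""
    anchors = {}
    for s_tok, s_vec in src.items():
        best_tok = max(tgt, key=lambda t_tok: cosine(s_vec, tgt[t_tok]))
        if cosine(s_vec, tgt[best_tok]) >= threshold:
            anchors[s_tok] = best_tok
    return anchors


anchors = find_anchors(EN, DE)

# False positives: identical strings that are NOT anchored to each other.
false_positives = sorted(t for t in set(EN) & set(DE) if anchors.get(t) != t)

# False negatives: different strings that ARE anchored to each other.
false_negatives = sorted((s, t) for s, t in anchors.items() if s != t)

print(anchors)
print(false_positives)
print(false_negatives)
```

In this toy setting a jointly built vocabulary would share a single embedding for "gift" across both languages, whereas the anchors tie English "gift" to German "geschenk" and keep "hand" shared.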


Cultural and Geographical Influences on Image Translatability of Words across Languages

Association for Computational Linguistics, 2021 (article)

Neural Machine Translation (NMT) models have been observed to produce poor translations when there are few or no parallel sentences to train the models. In the absence of parallel data, several approaches have turned to the use of images to learn translations. Since images of words (e.g., horse) may be unchanged across languages, translations can be identified via images associated with words in different languages that have a high degree of visual similarity. However, translating via images has been shown to improve upon text-only models only marginally. To better understand when images are useful for translation, we study image translatability of words, which we define as the translatability of words via images, by measuring intra- and inter-cluster similarities of image representations of words that are translations of each other. We find that images of words are not always invariant across languages, and that language pairs with shared culture, meaning a common language family, ethnicity, or religion, have improved image translatability (i.e., have more similar images for similar words) compared to pairs without shared culture, regardless of their geographic proximity. In addition, in line with previous works that show images help more in translating concrete words, we found that concrete words have improved image translatability compared to abstract ones.
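A rough sketch of what intra- and inter-cluster similarity could look like for image representations follows. The abstract does not give the exact formulas, so this is one plausible reading under stated assumptions: each word has a small set of image feature vectors, intra-cluster similarity is the mean pairwise cosine similarity within one set, and inter-cluster similarity is the mean cosine similarity across the sets of two words that translate each other. The function names and the toy feature vectors are hypothetical.

```python
import numpy as np


def _normalize(feats):
    """Scale each row (one image feature vector) to unit length."""
    return feats / np.linalg.norm(feats, axis=1, keepdims=True)


def intra_similarity(feats):
    """Mean cosine similarity over all ordered pairs of distinct images
    in one set. Requires at least two images."""
    z = _normalize(feats)
    sims = z @ z.T
    n = len(z)
    # Drop the diagonal, which holds each image's similarity to itself.
    return float((sims.sum() - np.trace(sims)) / (n * (n - 1)))


def inter_similarity(feats_a, feats_b):
    """Mean cosine similarity between every image in set A and set B."""
    return float((_normalize(feats_a) @ _normalize(feats_b).T).mean())


# Toy 2-D "image features" (hand-made, for illustration only).
horse_lang_a = np.array([[1.0, 0.0], [2.0, 0.0]])
horse_lang_b = np.array([[3.0, 0.0], [1.0, 0.0]])
wedding_lang_a = np.array([[1.0, 0.0], [1.0, 0.0]])
wedding_lang_b = np.array([[0.0, 1.0], [0.0, 2.0]])

print(intra_similarity(horse_lang_a))
print(inter_similarity(horse_lang_a, horse_lang_b))
print(inter_similarity(wedding_lang_a, wedding_lang_b))
```

Under this reading, a word whose images look alike in both languages (the toy "horse") has high inter-cluster similarity, while a culturally variable word (the toy "wedding") can have tight clusters within each language yet low similarity across them.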


Universal Joy: A Data Set and Results for Classifying Emotions Across Languages

Association for Computational Linguistics, 2021 (article)

While emotions are universal aspects of human psychology, they are expressed differently across different languages and cultures. We introduce a new data set of over 530k anonymized public Facebook posts across 18 languages, labeled with five different emotions. Using multilingual BERT embeddings, we show that emotions can be reliably inferred both within and across languages. Zero-shot learning produces promising results for low-resource languages. Following established theories of basic emotions, we provide a detailed analysis of the possibilities and limits of cross-lingual emotion classification. We find that structural and typological similarity between languages facilitates cross-lingual learning, as does linguistic diversity of the training data. Our results suggest that there are commonalities underlying the expression of emotion in different languages. We publicly release the anonymized data for future research.


Gender Bias in Natural Language Processing Across Human Languages

Association for Computational Linguistics, 2021 (article)

Natural Language Processing (NLP) systems are at the heart of many critical automated decision-making systems making crucial recommendations about our future world. Gender bias in NLP has been well studied in English, but has been less studied in other languages. In this paper, a team including speakers of 9 languages (Chinese, Spanish, English, Arabic, German, French, Farsi, Urdu, and Wolof) reports and analyzes measurements of gender bias in the Wikipedia corpora for these 9 languages. We develop extensions to profession-level and corpus-level gender bias metric calculations originally designed for English and apply them to 8 other languages, including languages with grammatically gendered nouns that have distinct feminine, masculine, and neuter profession words. We discuss future work that would benefit immensely from a computational linguistics perspective.
