
Aligning Estonian and Russian news industry keywords with the help of subtitle translations and an environmental thesaurus



Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Keywords: Estonian and Russian, environmental thesaurus, aligning Estonian


Abstract

This paper presents the application of the bilingual term alignment approach developed by Repar et al. (2019) to a dataset of unaligned Estonian and Russian keywords that journalists manually assigned to describe article topics. We started by separating the dataset into Estonian and Russian tags based on whether they are written in the Latin or Cyrillic script. Then we selected the available language-specific resources necessary for the alignment system to work. Although the domains of the language-specific resources (subtitles and environment) do not match the domain of the dataset (news articles), we achieved respectable results: manual evaluation indicates that almost three quarters of the aligned keyword pairs are at least partial matches.
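The script-based separation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `script_of` and `split_by_script`, the handling of mixed-script or letterless tags, and the sample keywords are all assumptions made for illustration. It relies only on Python's standard `unicodedata` module, classifying each letter by the prefix of its Unicode character name.

```python
import unicodedata


def script_of(tag: str) -> str:
    """Classify a keyword as 'latin', 'cyrillic', 'mixed', or 'other'.

    Hypothetical helper: only alphabetic characters are inspected, so
    digits, hyphens, and spaces do not influence the decision.
    """
    scripts = set()
    for ch in tag:
        if not ch.isalpha():
            continue
        name = unicodedata.name(ch, "")
        if name.startswith("CYRILLIC"):
            scripts.add("cyrillic")
        elif name.startswith("LATIN"):
            scripts.add("latin")
        else:
            scripts.add("other")
    if len(scripts) == 1:
        return scripts.pop()
    # No letters at all counts as 'other'; several scripts count as 'mixed'.
    return "mixed" if scripts else "other"


def split_by_script(tags):
    """Split tags into (estonian, russian, unresolved) lists by script.

    Assumption: Latin-script tags are treated as Estonian and
    Cyrillic-script tags as Russian, as the abstract describes.
    """
    estonian, russian, unresolved = [], [], []
    for tag in tags:
        script = script_of(tag)
        if script == "latin":
            estonian.append(tag)
        elif script == "cyrillic":
            russian.append(tag)
        else:
            unresolved.append(tag)
    return estonian, russian, unresolved


# Made-up sample keywords, not taken from the paper's dataset.
sample = ["keskkond", "экология", "põllumajandus", "covid-19", "2021"]
estonian, russian, unresolved = split_by_script(sample)
```

Keeping a separate `unresolved` bucket is a design choice of this sketch: tags with no letters or with mixed scripts are set aside for manual inspection rather than being forced into either language.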

References used

https://aclanthology.org/


Optimizing Word Alignments with Better Subword Tokenization

Association for Computational Linguistics, 2021 (article)

Word alignments identify translational correspondences between words in a parallel sentence pair and are used, for example, to train statistical machine translation, learn bilingual dictionaries, or perform quality estimation. Subword tokenization has become a standard preprocessing step for a large number of applications, notably for state-of-the-art open-vocabulary machine translation systems. In this paper, we thoroughly study how this preprocessing step interacts with the word alignment task and propose several tokenization strategies to obtain well-segmented parallel corpora. Using these new techniques, we were able to improve baseline word-based alignment models for six language pairs.

Keywords: optimizing word alignments, optimizing word, subword tokenization

Semi-Supervised and Unsupervised Sense Annotation via Translations

Association for Computational Linguistics, 2021 (article)

Acquisition of multilingual training data continues to be a challenge in word sense disambiguation (WSD). To address this problem, unsupervised approaches have been proposed to automatically generate sense annotations for training supervised WSD systems. We present three new methods for creating sense-annotated corpora which leverage translations, parallel bitexts, lexical resources, as well as contextual and synset embeddings. Our semi-supervised method applies machine translation to transfer existing sense annotations to other languages. Our two unsupervised methods refine sense annotations produced by a knowledge-based WSD system via lexical translations in a parallel corpus. We obtain state-of-the-art results on standard WSD benchmarks.

Keywords: sense annotations, sense, WSD

Knowledge and Keywords Augmented Abstractive Sentence Summarization

Association for Computational Linguistics, 2021 (article)

In this paper, we study abstractive sentence summarization. Two essential information features can influence the quality of news summarization: topic keywords and the knowledge structure of the news text. Besides, existing knowledge encoders perform poorly on sparse sentence knowledge structures. Considering these, we propose KAS, a novel Knowledge and Keywords Augmented Abstractive Sentence Summarization framework. Tri-encoders are utilized to integrate the contexts of the original text, the knowledge structure, and the topic keywords simultaneously, with a special linearized knowledge structure. Automatic and human evaluations demonstrate that KAS achieves the best performance.

Keywords: abstractive sentence summarization, augmented abstractive sentence, keywords augmented abstractive

Neural News Recommendation with Collaborative News Encoding and Structural User Encoding

Association for Computational Linguistics, 2021 (article)

Automatic news recommendation has gained much attention from the academic community and industry. Recent studies reveal that the key to this task lies in effective representation learning of both news and users. Existing works typically encode news title and content separately while neglecting their semantic interaction, which is inadequate for news text comprehension. Besides, previous models encode user browsing history without leveraging the structural correlation of user-browsed news to reflect user interests explicitly. In this work, we propose a news recommendation framework consisting of collaborative news encoding (CNE) and structural user encoding (SUE) to enhance news and user representation learning. CNE, equipped with bidirectional LSTMs, encodes news title and content collaboratively with cross-selection and cross-attention modules to learn semantic-interactive news representations. SUE utilizes graph convolutional networks to extract cluster-structural features of user history, followed by intra-cluster and inter-cluster attention modules to learn hierarchical user interest representations. Experimental results on the MIND dataset validate the effectiveness of our model in improving the performance of news recommendation.

Keywords: structural user encoding, user, user encoding

Aligning Faithful Interpretations with their Social Attribution

Association for Computational Linguistics, 2021 (article)

We find that the requirement of model interpretations to be faithful is vague and incomplete. With interpretation by textual highlights as a case study, we present several failure cases. Borrowing concepts from social science, we identify that the problem is a misalignment between the causal chain of decisions (causal attribution) and the attribution of human behavior to the interpretation (social attribution). We reformulate faithfulness as an accurate attribution of causality to the model, and introduce the concept of aligned faithfulness: faithful causal chains that are aligned with their expected social behavior. The two steps of causal attribution and social attribution together complete the process of explaining behavior. With this formalization, we characterize various failures of misaligned faithful highlight interpretations, and propose an alternative causal chain to remedy the issues. Finally, we implement highlight explanations of the proposed causal format using contrastive explanations.

Keywords: aligning faithful interpretations, attribution, social attribution
