
Enriching plWordNet with morphology



Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

enriching plWordNet, morphological information, Polish morphology


Abstract

In this paper, we present the process of adding morphological information to the Polish WordNet (plWordNet). We describe the reasons for this connection and the intuitions behind it. We also draw attention to the specifics of Polish morphology. We show the tasks in which morphological information is important and how existing methods can be extended to include combined morphological information based on WordNet.

References used

https://aclanthology.org/


Edge: Enriching Knowledge Graph Embeddings with External Text

Association for Computational Linguistics, 2021 (article)

Knowledge graphs suffer from sparsity, which degrades the quality of representations generated by various methods. While there is an abundance of textual information throughout the web and many existing knowledge bases, aligning information across these diverse data sources remains a challenge in the literature. Previous work has partially addressed this issue by enriching knowledge graph entities based on "hard" co-occurrence of words present in the entities of the knowledge graphs and external text, while we achieve "soft" augmentation by proposing a knowledge graph enrichment and embedding framework named Edge. Given an original knowledge graph, we first generate a rich but noisy augmented graph using external texts at the semantic and structural levels. To distill the relevant knowledge and suppress the introduced noise, we design a graph alignment term in a shared embedding space between the original graph and the augmented graph. To enhance the embedding learning on the augmented graph, we further regularize the locality relationship of the target entity based on negative sampling. Experimental results on four benchmark datasets demonstrate the robustness and effectiveness of Edge in link prediction and node classification.

enriching knowledge graph

Enriching the Transformer with Linguistic Factors for Low-Resource Machine Translation

Association for Computational Linguistics, 2021 (article)

Introducing factors, that is to say, word features such as linguistic information referring to the source tokens, is known to improve the results of neural machine translation systems in certain settings, typically in recurrent architectures. This study proposes enhancing the current state-of-the-art neural machine translation architecture, the Transformer, so that it allows external knowledge to be introduced. In particular, our proposed modification, the Factored Transformer, uses linguistic factors that insert additional knowledge into the machine translation system. Apart from using different kinds of features, we study the effect of different architectural configurations. Specifically, we analyze the performance of combining words and features at the embedding level or at the encoder level, and we experiment with two different combination strategies. With the best-found configuration, we show improvements of 0.8 BLEU over the baseline Transformer in the IWSLT German-to-English task. Moreover, we experiment with the more challenging FLoRes English-to-Nepali benchmark, which includes both extremely low-resourced and very distant languages, and obtain an improvement of 1.2 BLEU.

low-resource machine translation

Enriching the E2E dataset

Association for Computational Linguistics, 2021 (article)

This study introduces an enriched version of the E2E dataset, one of the most popular language resources for data-to-text NLG. We extract intermediate representations for popular pipeline tasks such as discourse ordering, text structuring, lexicalization and referring expression generation, enabling researchers to rapidly develop and evaluate their data-to-text pipeline systems. The intermediate representations are extracted by aligning non-linguistic and text representations through a process called delexicalization, which consists of replacing input referring expressions to entities/attributes with placeholders. The enriched dataset is publicly available.

dataset, enriching, NLG

A (Non)-Perfect Match: Mapping plWordNet onto PrincetonWordNet

Association for Computational Linguistics, 2021 (article)

The paper reports on the methodology and final results of a large-scale synset mapping between plWordNet and Princeton WordNet. Dedicated manual and semi-automatic mapping procedures as well as interlingual relation types for nouns, verbs, adjectives and adverbs are described. The statistics of all types of interlingual relations are also provided.

perfect match

CoDeRooMor: A new dataset for non-inflectional morphology studies of Swedish

Association for Computational Linguistics, 2021 (article)

The paper introduces a new resource, CoDeRooMor, for studying the morphology of modern Swedish word formation. The approximately 16,000 lexical items in the resource have been manually segmented into word-formation morphemes and labeled for their categories, such as prefixes, suffixes, roots, etc. Word-formation mechanisms, such as derivation and compounding, have been associated with each item on the list. The article describes the selection of items for manual annotation and the principles of annotation, reports on the reliability of the manual annotation, and presents tools, resources and some first statistics. Given the "gold" nature of the resource, it is possible to use it for empirical studies as well as to develop linguistically-aware algorithms for morpheme segmentation and labeling (cf. the statistical subword approach). The resource will be made freely available.

non-inflectional morphology studies, Swedish word formation, modern Swedish word
