
The Icelandic Word Web: A language technology-focused redesign of a lexicosemantic database



Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

icelandic word web word web lexicosemantic word web


Abstract

The new Icelandic Word Web (IW) is a language technology-focused redesign of a lexicosemantic database of semantically related entries. The IW's entities, relations, metadata and categorization scheme have all been implemented from scratch in two systems, OntoLex and SKOS. After certain adjustments were made to OntoLex and SKOS interoperability, it was also possible to implement specific IW features that, while potentially nonstandard, form an integral part of the Word Web's lexicosemantic functionality. Also new in this implementation are access to a larger amount of linguistic data, a greater variety of search options, the possibility of automated processing, and the ability to conduct research through SPARQL without possessing a mastery of Icelandic.
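To give a rough idea of what SPARQL-based research over an OntoLex and SKOS dataset could look like, the sketch below builds a query that lists written forms related to a given lemma. It uses only standard OntoLex-Lemon and SKOS vocabulary terms (`ontolex:LexicalEntry`, `ontolex:canonicalForm`, `ontolex:writtenRep`, `ontolex:evokes`, `skos:related`). How the IW actually models its relations is not stated in the abstract, so the graph pattern, the function names, and the example lemma are all assumptions for illustration, not the IW's schema.

```python
# Hypothetical sketch: the graph pattern below is an assumption and may not
# match how the Icelandic Word Web actually models its entries and relations.

ONTOLEX = "http://www.w3.org/ns/lemon/ontolex#"
SKOS = "http://www.w3.org/2004/02/skos/core#"


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def related_entries_query(lemma: str, lang: str = "is", limit: int = 20) -> str:
    """Build a SPARQL query listing written forms related to `lemma`.

    Assumed pattern: entry -> evoked concept -> skos:related concept -> entry.
    """
    literal = f'"{escape_literal(lemma)}"@{lang}'
    return "\n".join([
        f"PREFIX ontolex: <{ONTOLEX}>",
        f"PREFIX skos: <{SKOS}>",
        "SELECT DISTINCT ?relatedForm WHERE {",
        "  ?entry a ontolex:LexicalEntry ;",
        f"         ontolex:canonicalForm/ontolex:writtenRep {literal} ;",
        "         ontolex:evokes ?concept .",
        "  ?concept skos:related ?relatedConcept .",
        "  ?relatedEntry ontolex:evokes ?relatedConcept ;",
        "                ontolex:canonicalForm/ontolex:writtenRep ?relatedForm .",
        "}",
        f"LIMIT {int(limit)}",
    ])


if __name__ == "__main__":
    # "hestur" (horse) is only an example lemma.
    print(related_entries_query("hestur"))
```

The resulting string could be sent to any SPARQL endpoint that serves the data. A query of this shape needs only the lemma as input, which fits the abstract's point that research is possible without a mastery of Icelandic.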

References used

https://aclanthology.org/


Field Embedding: A Unified Grain-Based Framework for Word Representation

Association for Computational Linguistics, 2021 (article)

Word representations empowered with additional linguistic information have been widely studied and proved to outperform traditional embeddings. Current methods mainly focus on learning embeddings for words while embeddings of linguistic information (referred to as grain embeddings) are discarded after the learning. This work proposes a framework field embedding to jointly learn both word and grain embeddings by incorporating morphological, phonetic, and syntactical linguistic fields. The framework leverages an innovative fine-grained pipeline that integrates multiple linguistic fields and produces high-quality grain sequences for learning supreme word representations. A novel algorithm is also designed to learn embeddings for words and grains by capturing information that is contained within each field and that is shared across them. Experimental results of lexical tasks and downstream natural language processing tasks illustrate that our framework can learn better word embeddings and grain embeddings. Qualitative evaluations show grain embeddings effectively capture the semantic information.

unified grain-based framework, unified grain-based, word representations

Applied Language Technology: NLP for the Humanities

Association for Computational Linguistics, 2021 (article)

This contribution describes a two-course module that seeks to provide humanities majors with a basic understanding of language technology and its applications using Python. The learning materials consist of interactive Jupyter Notebooks and accompanying YouTube videos, which are openly available with a Creative Commons licence.

applied language technology, language technology, applied language

Denoising Word Embeddings by Averaging in a Shared Space

Association for Computational Linguistics, 2021 (article)

We introduce a new approach for smoothing and improving the quality of word embeddings. We consider a method of fusing word embeddings that were trained on the same corpus but with different initializations. We project all the models to a shared vector space using an efficient implementation of the Generalized Procrustes Analysis (GPA) procedure, previously used in multilingual word translation. Our word representation demonstrates consistent improvements over the raw models as well as their simplistic average, on a range of tasks. As the new representations are more stable and reliable, there is a noticeable improvement in rare word evaluations.

denoising word embeddings, generalized Procrustes analysis

Query2Prod2Vec: Grounded Word Embeddings for eCommerce

Association for Computational Linguistics, 2021 (article)

We present Query2Prod2Vec, a model that grounds lexical representations for product search in product embeddings: in our model, meaning is a mapping between words and a latent space of products in a digital shop. We leverage shopping sessions to learn the underlying space and use merchandising annotations to build lexical analogies for evaluation: our experiments show that our model is more accurate than known techniques from the NLP and IR literature. Finally, we stress the importance of data efficiency for product search outside of retail giants, and highlight how Query2Prod2Vec fits with practical constraints faced by most practitioners.

grounded word embeddings, grounded word

Malta National Language Technology Platform: A vision for enhancing Malta's official languages using Machine Translation

Association for Computational Linguistics, 2021 (article)

In this paper we introduce a vision towards establishing the Malta National Language Technology Platform; an ongoing effort that aims to provide a basis for enhancing Malta's official languages, namely Maltese and English, using Machine Translation. This will contribute towards the current niche of Language Technology support for the Maltese low-resource language, across multiple computational linguistics fields, such as speech processing, machine translation, text analysis, and multi-modal resources. The end goals are to remove language barriers, increase accessibility, foster cross-border services, and most importantly to facilitate the preservation of the Maltese language.

language technology platform, malta national language, national language technology
