Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Optimizing a Supervised Classifier for a Difficult Language Identification Problem

تحسين مصنف إشراف لمشكلة تحديد اللغة الصعبة

855 0 0 0.0 ( 0 )

Download Cite

Added by Association for Computation Linguistics مقالة

Publication date 2021

fields Artificial Intelligence

and research's language is English

Created by Shamra Editor

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

This paper describes the system developed by the Laboratoire d'analyse statistique des textes for the Dravidian Language Identification (DLI) shared task of VarDial 2021. This task is particularly difficult because the materials consists of short YouTube comments, written in Roman script, from three closely related Dravidian languages, and a fourth category consisting of several other languages in varying proportions, all mixed with English. The proposed system is made up of a logistic regression model which uses as only features n-grams of characters with a maximum length of 5. After its optimization both in terms of the feature weighting and the classifier parameters, it ranked first in the challenge. The additional analyses carried out underline the importance of optimization, especially when the measure of effectiveness is the Macro-F1.

References used

https://aclanthology.org/

rate research

Supervised Identification of Participant Slots in Contracts

866 - Association for Computation Linguistics 2021 مقالة

This paper presents a technique for the identification of participant slots in English language contracts. Taking inspiration from unsupervised slot extraction techniques, the system presented here uses a supervised approach to identify terms used to refer to a genre-specific slot in novel contracts. We evaluate the system in multiple feature configurations to demonstrate that the best performing system in both genres of contracts omits the exact mention form from consideration---even though such mention forms are often the name of the slot under consideration---and is instead based solely on the dependency label and parent; in other words, a more reliable quantification of a party's role in a contract is found in what they do rather than what they are named.

identification of participant participant slots english language contracts تحديد المشارك فتحات المشارك عقود اللغة الإنجليزية صناعة حمض الفوسفور المزيد..

A Language Model-based Generative Classifier for Sentence-level Discourse Parsing

754 - Association for Computation Linguistics 2021 مقالة

Discourse segmentation and sentence-level discourse parsing play important roles for various NLP tasks to consider textual coherence. Despite recent achievements in both tasks, there is still room for improvement due to the scarcity of labeled data. To solve the problem, we propose a language model-based generative classifier (LMGC) for using more information from labels by treating the labels as an input while enhancing label representations by embedding descriptions for each label. Moreover, since this enables LMGC to make ready the representations for labels, unseen in the pre-training step, we can effectively use a pre-trained language model in LMGC. Experimental results on the RST-DT dataset show that our LMGC achieved the state-of-the-art F1 score of 96.72 in discourse segmentation. It further achieved the state-of-the-art relation F1 scores of 84.69 with gold EDU boundaries and 81.18 with automatically segmented boundaries, respectively, in sentence-level discourse parsing.

sentence-level discourse parsing model-based generative classifier language model-based generative تحليل خطاب مستوى الجملة مصنف المولد النموذجي لغة اللغة القائمة على نموذج صناعة حمض الفوسفور المزيد..

Reconsidering the Past: Optimizing Hidden States in Language Models

809 - Association for Computation Linguistics 2021 مقالة

We present Hidden-State Optimization (HSO), a gradient-based method for improving the performance of transformer language models at inference time. Similar to dynamic evaluation (Krause et al., 2018), HSO computes the gradient of the log-probability the language model assigns to an evaluation text, but uses it to update the cached hidden states rather than the model parameters. We test HSO with pretrained Transformer-XL and GPT-2 language models, finding improvement on the WikiText-103 and PG-19 datasets in terms of perplexity, especially when evaluating a model outside of its training distribution. We also demonstrate downstream applicability by showing gains in the recently developed prompt-based few-shot evaluation setting, again with no extra parameters or training data.

optimizing hidden states reconsidering the past optimizing hidden تحسين الدول المخفية إعادة النظر في الماضي تحسين مخفي صناعة حمض الفوسفور المزيد..

A ResNet-50-Based Convolutional Neural Network Model for Language ID Identification from Speech Recordings

692 - Association for Computation Linguistics 2021 مقالة

This paper describes the model built for the SIGTYP 2021 Shared Task aimed at identifying 18 typologically different languages from speech recordings. Mel-frequency cepstral coefficients derived from audio files are transformed into spectrograms, whi ch are then fed into a ResNet-50-based CNN architecture. The final model achieved validation and test accuracies of 0.73 and 0.53, respectively.

خطاب آلي متعدد اللغات neural network model convolutional neural التنافيل الشبكة العصبية نموذج الشبكة العصبية التنافيل العصبي صناعة حمض الفوسفور المزيد..

Text Style Transfer: Leveraging a Style Classifier on Entangled Latent Representations

935 - Association for Computation Linguistics 2021 مقالة

Learning a good latent representation is essential for text style transfer, which generates a new sentence by changing the attributes of a given sentence while preserving its content. Most previous works adopt disentangled latent representation learn ing to realize style transfer. We propose a novel text style transfer algorithm with entangled latent representation, and introduce a style classifier that can regulate the latent structure and transfer style. Moreover, our algorithm for style transfer applies to both single-attribute and multi-attribute transfer. Extensive experimental results show that our method generally outperforms state-of-the-art approaches.

text style transfer style transfer latent representation نقل نمط النص نقل النمط التمثيل الكامن صناعة حمض الفوسفور المزيد..

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Optimizing a Supervised Classifier for a Difficult Language Identification Problem

تحسين مصنف إشراف لمشكلة تحديد اللغة الصعبة

Ask ChatGPT about the research

Read More

suggested questions