FrenLyS: A Tool for the Automatic Simplification of French General Language Texts

Added by Association for Computational Linguistics (article)

Publication date: 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Abstract

Lexical simplification (LS) aims to replace words considered complex in a sentence with simpler equivalents. In this paper, we present the first automatic LS service for French, FrenLyS, which offers different techniques to generate, select and rank substitutes. The paper describes the different methods proposed by our tool, which include both classical approaches (e.g. generation of candidates from lexical resources and frequency filtering) and more innovative approaches such as the exploitation of CamemBERT, a model for French based on the RoBERTa architecture. To evaluate the different methods, a new evaluation dataset for French is introduced.
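The generate, select and rank steps named in the abstract can be illustrated with a minimal sketch. It is not the FrenLyS implementation: the synonym table, the frequency values and the function names below are hypothetical toy data chosen for illustration, and the sketch covers only the classical route (candidates from a lexical resource, then a frequency filter). A masked language model such as CamemBERT could stand in for the generation step, which is not shown here.

```python
from typing import Dict, List

# Hypothetical toy lexical resource (NOT FrenLyS data): word -> possible substitutes.
SYNONYMS: Dict[str, List[str]] = {
    "onéreux": ["cher", "coûteux", "dispendieux"],
}

# Hypothetical corpus frequencies (occurrences per million words), invented for the example.
FREQ: Dict[str, float] = {
    "onéreux": 3.0,
    "cher": 120.0,
    "coûteux": 15.0,
    "dispendieux": 0.5,
}


def generate(word: str) -> List[str]:
    """Generation: look up substitution candidates in the lexical resource."""
    return list(SYNONYMS.get(word, []))


def select(word: str, candidates: List[str]) -> List[str]:
    """Selection: keep only candidates that are more frequent than the original word."""
    baseline = FREQ.get(word, 0.0)
    return [c for c in candidates if FREQ.get(c, 0.0) > baseline]


def rank(candidates: List[str]) -> List[str]:
    """Ranking: order candidates from most to least frequent."""
    return sorted(candidates, key=lambda c: FREQ.get(c, 0.0), reverse=True)


def simplify(sentence: str) -> str:
    """Replace each whitespace-separated token by its best-ranked substitute, if any."""
    output = []
    for token in sentence.split():
        ranked = rank(select(token, generate(token)))
        output.append(ranked[0] if ranked else token)
    return " ".join(output)


if __name__ == "__main__":
    print(simplify("ce repas est onéreux"))  # prints "ce repas est cher"
```

Using frequency as a proxy for simplicity is only one possible design. It ignores context, so a real system would also need to check that the substitute preserves the meaning of the sentence.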

References used

https://aclanthology.org/

BERTweetFR : Domain Adaptation of Pre-Trained Language Models for French Tweets

Association for Computational Linguistics, 2021 (article)

We introduce BERTweetFR, the first large-scale pre-trained language model for French tweets. Our model is initialised using a general-domain French language model CamemBERT which follows the base architecture of BERT. Experiments show that BERTweetFR outperforms all previous general-domain French language models on two downstream Twitter NLP tasks of offensiveness identification and named entity recognition. The dataset used in the offensiveness detection task is first created and annotated by our team, filling in the gap of such analytic datasets in French. We make our model publicly available in the transformers library with the aim of promoting future research in analytic tasks for French tweets.

domain adaptation, French tweets, general-domain French language

TREMoLo-Tweets: A Multi-Label Corpus of French Tweets for Language Register Characterization

Association for Computational Linguistics, 2021 (article)

The casual, neutral, and formal language registers are highly perceptible in discourse productions. However, they are still poorly studied in Natural Language Processing (NLP), especially outside English, and for new textual types like tweets. To stimulate research, this paper introduces a large corpus of 228,505 French tweets (6M words) annotated with language registers. Labels are provided by a multi-label CamemBERT classifier trained and checked on a manually annotated subset of the corpus, while the tweets are selected to avoid undesired biases. Based on the corpus, an initial analysis of linguistic traits from either human annotators or automatic extractions is provided to describe the corpus and pave the way for various NLP tasks. The corpus, annotation guide and classifier are available on http://tremolo.irisa.fr.

language register characterization, register characterization

Automatic Detection and Classification of Mental Illnesses from General Social Media Texts

Association for Computational Linguistics, 2021 (article)

Mental health has been receiving more and more attention recently: depression is a very common illness nowadays, as are other disorders such as anxiety, obsessive-compulsive disorders, feeding disorders, autism, or attention-deficit/hyperactivity disorders. The huge amount of data from social media and the recent advances of deep learning models provide valuable means for automatically detecting mental disorders from plain text. In this article, we experiment with state-of-the-art methods on the SMHD mental health conditions dataset from Reddit (Cohan et al., 2018). Our contribution is threefold: using a dataset consisting of more illnesses than most studies, focusing on general text rather than mental health support groups, and classifying posts rather than individuals or groups. For the automatic classification of the diseases, we employ three deep learning models: BERT, RoBERTa and XLNet. We double the baseline established by Cohan et al. (2018), on just a sample of their dataset. We improve the results obtained by Jiang et al. (2020) on post-level classification. The accuracy obtained by the eating disorder classifier is the highest due to the prominent presence of discussions related to calories, diets, recipes etc., whereas depression had the lowest F1 score, probably because depression is more difficult to identify in linguistic acts.

general social media

Automatic Sentence Simplification in Low Resource Settings for Urdu

Association for Computational Linguistics, 2021 (article)

To build automated simplification systems, corpora of complex sentences and their simplified versions are the first step toward understanding sentence complexity and enabling the development of automatic text simplification systems. We present a lexically and syntactically simplified Urdu simplification corpus with a detailed analysis of the various simplification operations and human evaluation of corpus quality. We further analyze our corpora using text readability measures and present a comparison of the original, lexically simplified and syntactically simplified corpora. In addition, we compare our corpus with other existing simplification corpora by building simplification systems and evaluating these systems using BLEU and SARI scores. Our system achieves the highest BLEU score and comparable SARI score in comparison to other systems. We release our simplification corpora for the benefit of the research community.

low resource settings, resource settings

Instructions in French as a Foreign Language class: What does the teacher request?

Tishreen University, 2015 (research paper)

In class, teachers constantly give instructions, and learners must continually carry them out. Yet some of these instructions are not followed. The same holds for exercises and examination questions. An instruction poses several challenges, both in its formulation and in its comprehension. Formulating an instruction requires great effort, great care and special skills, because the quality of the work performed depends in large part on the quality of the instruction.

teacher, instruction, learner, formulation of instructions, good instruction, bad instruction
