
Detection of Puffery on the English Wikipedia


Added by: Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English


Keywords: English Wikipedia, Wikipedia


Abstract

On Wikipedia, an online crowdsourced encyclopedia, volunteers enforce the encyclopedia's editorial policies. Wikipedia's policy on maintaining a neutral point of view has inspired recent research on bias detection, including "weasel words" and "hedges". Yet to date, little work has been done on identifying "puffery": phrases that are overly positive without a verifiable source. We demonstrate that collecting training data for this task requires some care, and construct a dataset by combining Wikipedia editorial annotations and information retrieval techniques. We compare several approaches to predicting puffery, and achieve an F1 score of 0.963 by incorporating citation features into a RoBERTa model. Finally, we demonstrate how to integrate our model with Wikipedia's public infrastructure to give back to the Wikipedia editor community.
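The abstract mentions citation features without describing them. As a rough illustration only, and not the authors' implementation, the sketch below derives two simple citation signals from a sentence of wikitext. It assumes citations appear as `<ref>` tags; the function name `citation_features` and the feature names are hypothetical.

```python
import re

# Assumption: a citation is marked by an opening <ref> tag in the wikitext,
# e.g. <ref>...</ref>, <ref name="x">...</ref>, or <ref name="x" />.
# Closing tags (</ref>) are not matched because "<" must be followed by "ref".
REF_OPEN = re.compile(r"<ref[\s>/]", re.IGNORECASE)


def citation_features(wikitext_sentence: str) -> dict:
    """Return hypothetical citation features for one sentence of wikitext."""
    num_refs = len(REF_OPEN.findall(wikitext_sentence))
    return {"has_citation": num_refs > 0, "num_citations": num_refs}


unsourced = citation_features("The hotel is a world-renowned landmark.")
sourced = citation_features("It won a design award.<ref>Example source</ref>")
```

Features like these could be concatenated with a text encoder's sentence representation before classification; how the paper actually combines them with RoBERTa is not stated in the abstract.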

References used

https://aclanthology.org/


English-Arabic Cross-language Plagiarism Detection

Association for Computational Linguistics, 2021 (article)

The advancement of the web and information technology has contributed to the rapid growth of digital libraries and automatic machine translation tools, which easily translate texts from one language into another. These have increased the content accessible in different languages, making it easy to commit translated plagiarism, referred to as "cross-language plagiarism". Recognition of plagiarism among texts in different languages is more challenging than identifying plagiarism within a corpus written in the same language. This paper proposes a new technique for enhancing English-Arabic cross-language plagiarism detection at the sentence level. This technique is based on semantic and syntactic feature extraction using word order, word embedding and word alignment with multilingual encoders. Those features, and their combination with different machine learning (ML) algorithms, are then used to classify sentences as either plagiarized or non-plagiarized. The proposed approach has been deployed and assessed using datasets presented at SemEval-2017. Analysis of experimental data demonstrates that utilizing extracted features and their combinations with various ML classifiers achieves promising results.

Keywords: cross-language plagiarism detection, English-Arabic cross-language plagiarism, cross-language plagiarism

HateBERT: Retraining BERT for Abusive Language Detection in English

Association for Computational Linguistics, 2021 (article)

We introduce HateBERT, a re-trained BERT model for abusive language detection in English. The model was trained on RAL-E, a large-scale dataset of Reddit comments in English from communities banned for being offensive, abusive, or hateful that we have curated and made available to the public. We present the results of a detailed comparison between a general pre-trained language model and the retrained version on three English datasets for offensive, abusive language and hate speech detection tasks. In all datasets, HateBERT outperforms the corresponding general BERT model. We also discuss a battery of experiments comparing the portability of the fine-tuned models across the datasets, suggesting that portability is affected by compatibility of the annotated phenomena.

Keywords: abusive language detection, retraining BERT, abusive language

WEC: Deriving a Large-scale Cross-document Event Coreference dataset from Wikipedia

Association for Computational Linguistics, 2021 (article)

Cross-document event coreference resolution is a foundational task for NLP applications involving multi-text processing. However, existing corpora for this task are scarce and relatively small, while annotating only modest-size clusters of documents belonging to the same topic. To complement these resources and enhance future research, we present Wikipedia Event Coreference (WEC), an efficient methodology for gathering a large-scale dataset for cross-document event coreference from Wikipedia, where coreference links are not restricted within predefined topics. We apply this methodology to the English Wikipedia and extract our large-scale WEC-Eng dataset. Notably, our dataset creation method is generic and can be applied with relatively little effort to other Wikipedia languages. To set baseline results, we develop an algorithm that adapts components of state-of-the-art models for within-document coreference resolution to the cross-document setting. Our model is suitably efficient and outperforms previously published state-of-the-art results for the task.

Keywords: event coreference, cross-document event

Exploring the Integration of E2E ASR and Pronunciation Modeling for English Mispronunciation Detection

Association for Computational Linguistics, 2021 (article)

There has been increasing demand to develop effective computer-assisted language training (CAPT) systems, which can provide feedback on mispronunciations and help second-language (L2) learners improve their speaking proficiency through repeated practice. Due to the shortage of non-native speech for training the automatic speech recognition (ASR) module of a CAPT system, the corresponding mispronunciation detection performance is often affected by imperfect ASR. Recognizing this importance, we put forward in this paper a two-stage mispronunciation detection method. In the first stage, the speech uttered by an L2 learner is processed by an end-to-end ASR module to produce N-best phone sequence hypotheses. In the second stage, these hypotheses are fed into a pronunciation model which seeks to faithfully predict the phone sequence hypothesis that is most likely pronounced by the learner, so as to improve the performance of mispronunciation detection. Empirical experiments conducted on an English benchmark dataset seem to confirm the utility of our method.

Keywords: exploring the integration, mispronunciation detection, pronunciation modeling
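The two-stage method in the abstract above (N-best phone hypotheses from ASR, then selection by a pronunciation model) can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the paper's system: the scoring function `pron_score`, the interpolation `weight`, and both function names are hypothetical, and mispronunciations are detected by a simple position-wise comparison that only handles substitutions in equal-length sequences.

```python
from typing import Callable, List, Sequence, Tuple

Phones = Tuple[str, ...]


def rescore_n_best(
    hypotheses: Sequence[Tuple[Phones, float]],
    pron_score: Callable[[Phones], float],
    weight: float = 0.5,
) -> Phones:
    """Stage 2: pick the hypothesis with the best interpolated score.

    Each hypothesis is (phone sequence, ASR log-score) from stage 1.
    pron_score is a stand-in for a pronunciation model (an assumption).
    """
    if not hypotheses:
        raise ValueError("at least one hypothesis is required")
    best = max(
        hypotheses,
        key=lambda h: (1.0 - weight) * h[1] + weight * pron_score(h[0]),
    )
    return best[0]


def detect_mispronunciations(canonical: Phones, recognized: Phones) -> List[int]:
    """Return positions where the recognized phone differs from the canonical one."""
    return [i for i, (c, r) in enumerate(zip(canonical, recognized)) if c != r]


# Toy usage: two hypotheses for the word "cat" with made-up scores.
n_best = [(("k", "ae", "t"), -1.0), (("k", "ah", "t"), -1.2)]
toy_model = lambda p: 0.0 if p == ("k", "ah", "t") else -2.0
chosen = rescore_n_best(n_best, toy_model)
errors = detect_mispronunciations(("k", "ae", "t"), chosen)
```

Here the pronunciation model overrides the ASR ranking, so the second hypothesis is chosen and the vowel position is flagged. A real system would also need alignment to handle insertions and deletions.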

The Attitudes of Fourth-Year Students at the Department of English towards Learning and Using English

Al-Baath University, 2017 (master's thesis)

Nowadays, social-psychological variables such as attitudes and motivation, gender, and aptitude have been established as influential factors in the process of learning a foreign language. Therefore, this research aims at measuring the attitudes of fourth-year students at the Department of English towards learning English.

Keywords: Department of English, using English, learning English
