Authorship attribution is the task of assigning an unknown document to an author from a set of candidates. In the past, studies in this field have used a variety of evaluation datasets to demonstrate the effectiveness of preprocessing steps, features, and models. However, only a small fraction of works use more than one dataset to support their claims. In this paper, we present a collection of highly diverse authorship attribution datasets, which allows evaluation results from authorship attribution research to generalize better. Furthermore, we implement a wide variety of previously used machine learning models and show that many approaches perform very differently when applied to different datasets. We include pre-trained language models, testing them systematically in this field for the first time. Finally, we propose a set of aggregated scores to evaluate different aspects of the dataset collection.
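As an illustration of the kind of model such a collection would evaluate alongside pre-trained language models, the following sketch shows a classic character n-gram baseline for authorship attribution built with scikit-learn. The documents and author labels are toy placeholders, not the paper's data or its actual experimental setup.

```python
# Hypothetical sketch: a character n-gram + linear SVM baseline for
# authorship attribution, one of the traditional model families such a
# benchmark collection would compare against pre-trained language models.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Placeholder data: documents with known authors (the candidate set) and
# one unseen document to attribute.
train_docs = ["... text written by author A ...", "... text written by author B ..."]
train_authors = ["author_A", "author_B"]
unknown_doc = "... document of disputed authorship ..."

# Character n-grams are a standard stylometric feature for this task.
model = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(2, 4), sublinear_tf=True),
    LinearSVC(),
)
model.fit(train_docs, train_authors)
print(model.predict([unknown_doc]))  # -> predicted candidate author
```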
As the world continues to fight the COVID-19 pandemic, it is simultaneously fighting an "infodemic": a flood of disinformation and spread of conspiracy theories leading to health threats and the division of society. To combat this infodemic, there is an urgent need for benchmark datasets that can help researchers develop and evaluate models geared towards automatic detection of disinformation. While there are increasing efforts to create adequate, open-source benchmark datasets for English, comparable resources are virtually unavailable for German, leaving research for the German language lagging significantly behind. In this paper, we introduce the new benchmark dataset FANG-COVID, consisting of 28,056 real and 13,186 fake German news articles related to the COVID-19 pandemic, as well as data on their propagation on Twitter. Furthermore, we propose an explainable textual and social-context-based model for fake news detection, compare its performance to "black-box" models, and perform feature ablation to assess the relative importance of human-interpretable features in distinguishing fake news from authentic news.
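The feature-ablation idea mentioned above can be sketched in a few lines: train an interpretable classifier on groups of human-interpretable features and measure how much performance drops when each group is removed. The feature groups and data below are random placeholders, not FANG-COVID's actual textual or social-context features.

```python
# Hypothetical sketch of feature ablation for an interpretable fake-news
# classifier: drop one feature group at a time and compare accuracy.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 200
# Placeholder feature groups standing in for textual and social-context
# signals; the real features in the paper will differ.
feature_groups = {
    "textual": rng.normal(size=(n, 5)),
    "social_context": rng.normal(size=(n, 3)),
}
y = rng.integers(0, 2, size=n)  # 1 = fake, 0 = real (placeholder labels)

X_full = np.hstack(list(feature_groups.values()))
full_score = cross_val_score(LogisticRegression(max_iter=1000), X_full, y, cv=5).mean()

for name in feature_groups:
    X_ablated = np.hstack([v for k, v in feature_groups.items() if k != name])
    score = cross_val_score(LogisticRegression(max_iter=1000), X_ablated, y, cv=5).mean()
    print(f"without {name}: accuracy drop = {full_score - score:.3f}")
```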
We present Mr. TyDi, a multi-lingual benchmark dataset for mono-lingual retrieval in eleven typologically diverse languages, designed to evaluate ranking with learned dense representations. The goal of this resource is to spur research in dense retrieval techniques in non-English languages, motivated by recent observations that existing techniques for representation learning perform poorly when applied to out-of-distribution data. As a starting point, we provide zero-shot baselines for this new dataset based on a multi-lingual adaptation of DPR that we call "mDPR". Experiments show that although the effectiveness of mDPR is much lower than that of BM25, dense representations nevertheless appear to provide valuable relevance signals, improving BM25 results in sparse-dense hybrids. In addition to analyses of our results, we also discuss future challenges and present a research agenda in multi-lingual dense retrieval. Mr. TyDi can be downloaded at https://github.com/castorini/mr.tydi.
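The sparse-dense hybrid mentioned in the abstract can be illustrated as a simple interpolation of BM25 and dense-retriever scores after normalizing them to comparable ranges. The weight, the normalization, and the score values below are illustrative assumptions, not necessarily the paper's exact fusion method.

```python
# Illustrative sketch of a sparse-dense hybrid: normalize BM25 and dense
# (mDPR-style) scores, then linearly interpolate them per document.
def minmax(scores):
    """Min-max normalize a {doc_id: score} dict to [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    return {d: (s - lo) / (hi - lo) if hi > lo else 0.0 for d, s in scores.items()}

def hybrid_rank(bm25_scores, dense_scores, alpha=0.5):
    """Fuse two {doc_id: score} dicts and return docs ranked by fused score."""
    bm25_n, dense_n = minmax(bm25_scores), minmax(dense_scores)
    doc_ids = set(bm25_n) | set(dense_n)
    fused = {
        d: alpha * dense_n.get(d, 0.0) + (1 - alpha) * bm25_n.get(d, 0.0)
        for d in doc_ids
    }
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

bm25 = {"doc1": 12.3, "doc2": 9.8, "doc3": 4.1}     # sparse (BM25) scores
dense = {"doc2": 0.83, "doc3": 0.79, "doc4": 0.75}  # dense retriever scores
print(hybrid_rank(bm25, dense, alpha=0.5))
```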
This work describes an analysis of the nature and causes of MT errors observed by different evaluators under the guidance of different quality criteria: adequacy, comprehension, and an unspecified generic mixture of adequacy and fluency. We report results for three language pairs, two domains, and eleven MT systems. Our findings indicate that, although some of the identified phenomena depend on domain and/or language, the following set of phenomena can be considered generally challenging for modern MT systems: rephrasing groups of words, translation of ambiguous source words, translation of noun phrases, and mistranslations. Furthermore, we show that the quality criterion also has an impact on error perception. Our findings indicate that comprehension and adequacy can be assessed simultaneously by different evaluators, so that comprehension, as an important quality criterion, can be included more often in human evaluations.
This paper offers a comparative evaluation of four commercial ASR systems, which are evaluated according to the post-editing effort required to reach "publishable" quality and according to the number of errors they produce. For the error annotation task, an original typology for transcription errors is proposed. This study also seeks to examine whether there is a difference in the performance of these systems between native and non-native English speakers. The experimental results suggest that, among the four systems, Trint obtains the best scores. It is also observed that most systems perform noticeably better with native speakers and that all systems are most prone to fluency errors.
Post-hoc explanation methods are an important class of approaches that help understand the rationale underlying a trained model's decisions. But how useful are they to an end user trying to accomplish a given task? In this vision paper, we argue for a benchmark to facilitate evaluations of the utility of post-hoc explanation methods. As a first step to this end, we enumerate desirable properties that such a benchmark should possess for the task of debugging text classifiers. Additionally, we highlight that such a benchmark facilitates assessing not only the effectiveness of explanations but also their efficiency.
Arabic is the official language of 22 countries and is spoken by more than 400 million speakers. Each of these countries uses at least one dialect for daily-life conversation, so Arabic has at least 22 dialects, and each dialect can be written in either Arabic or Arabizi script. Most recent research focuses on constructing a language model and a training corpus for each dialect in each script. Following this approach means constructing 46 different resources (including Modern Standard Arabic, MSA) to handle a single language. In this paper, we extract ONE corpus, and we propose ONE algorithm to automatically construct ONE training corpus using ONE classification model architecture for sentiment analysis of MSA and the different dialects. After manually reviewing the training corpus, the obtained results outperform all results reported in the literature for the targeted test corpora.
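The "one model for all dialects and scripts" idea can be sketched with a single character-level classifier trained on mixed Arabic-script and Arabizi-script examples, rather than one resource per dialect and script. The toy sentences, labels, and architecture below are illustrative assumptions, not the paper's corpus or model.

```python
# Hypothetical sketch: a single character n-gram sentiment classifier
# covering MSA, dialectal Arabic, and Arabizi in one model, instead of
# 46 script/dialect-specific resources. Toy data only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "الخدمة ممتازة جدا",         # Arabic script, positive
    "الفيلم كان سيئ للغاية",      # Arabic script, negative
    "el service ktir mni7",       # Arabizi, positive
    "had lfilm khayb bzzaf",      # Arabizi, negative
]
train_labels = ["pos", "neg", "pos", "neg"]

# Character n-grams work across both scripts without dialect-specific tools.
clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(1, 4)),
    LogisticRegression(max_iter=1000),
)
clf.fit(train_texts, train_labels)
print(clf.predict(["الموقع رائع", "l2akl kan zaki"]))
```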
Progress in cross-lingual modeling depends on challenging, realistic, and diverse evaluation sets. We introduce Multilingual Knowledge Questions and Answers (MKQA), an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically diverse languages (260k question-answer pairs in total). Answers are based on a heavily curated, language-independent data representation, making results comparable across languages and independent of language-specific passages. With 26 languages, this dataset supplies the widest range of languages to date for evaluating question answering. We benchmark a variety of state-of-the-art methods and baselines for generative and extractive question answering, trained on Natural Questions, in zero-shot and translation settings. Results indicate this dataset is challenging even in English, but especially so in low-resource languages.
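A zero-shot extractive baseline of the kind benchmarked here can be sketched with a multilingual reader fine-tuned on an English QA set and applied directly to another language. The checkpoint name and the example question below are assumptions for illustration, not the models or data used in the paper.

```python
# Hypothetical sketch: zero-shot extractive QA with a multilingual reader.
# The checkpoint name is an assumed publicly available model; MKQA's own
# baselines are trained on Natural Questions.
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="deepset/xlm-roberta-base-squad2",  # assumed multilingual QA checkpoint
)

# Toy example: a non-English question with a supporting passage.
result = qa(
    question="¿En qué año terminó la Segunda Guerra Mundial?",
    context="La Segunda Guerra Mundial terminó en 1945 con la rendición de Japón.",
)
print(result["answer"], result["score"])
```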