Recent research has documented that results reported in frequently cited authorship attribution papers are difficult to reproduce. Inaccessible code and data are often proposed as factors that block successful reproductions. Even when the original materials are available, problems remain that prevent researchers from comparing the effectiveness of different methods. To solve these remaining problems (the lack of fixed test sets and the use of inappropriately homogeneous corpora), our paper contributes materials for five closed-set authorship identification experiments. The five experiments feature texts from 106 distinct authors and involve a range of contemporary non-fiction American English prose. These experiments provide the foundation for comparable and reproducible authorship attribution research involving contemporary writing.
References used
https://aclanthology.org/