Crowdsourcing has been ubiquitously used for annotating enormous collections of data. However, the major obstacles to using crowd-sourced labels are the noise and errors introduced by non-expert annotators. In this work, two approaches for dealing with noise and errors in crowd-sourced labels are proposed. The first approach uses Sharpness-Aware Minimization (SAM), an optimization technique robust to noisy labels. The other approach leverages a neural network layer called softmax-Crowdlayer, designed specifically for learning from crowd-sourced annotations. According to the results, the proposed approaches can improve the performance of a Wide Residual Network model and a multilayer perceptron model applied to crowd-sourced datasets in the image processing domain. They also achieve results comparable to the majority-voting technique when applied to the sequential data domain, where Bidirectional Encoder Representations from Transformers (BERT) is used as the base model in both instances.
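The first approach can be illustrated with a minimal sketch of the SAM two-step update on a toy quadratic loss. This is not the paper's implementation; the loss, `rho`, and learning rate are illustrative assumptions. SAM first perturbs the weights toward the worst-case point within an L2 ball of radius `rho`, then descends using the gradient computed at that perturbed point, which biases optimization toward flat minima and makes it more robust to label noise.

```python
import numpy as np

def grad(w):
    # Gradient of the toy loss f(w) = 0.5 * ||w||^2 (illustrative only)
    return w

def sam_step(w, rho=0.05, lr=0.1):
    g = grad(w)
    norm = np.linalg.norm(g) + 1e-12
    w_adv = w + rho * g / norm   # step 1: ascend to the worst case in the rho-ball
    g_adv = grad(w_adv)          # step 2: gradient at the perturbed point
    return w - lr * g_adv        # descend with the sharpness-aware gradient

w = np.array([1.0, -2.0])
for _ in range(100):
    w = sam_step(w)
print(np.linalg.norm(w))
```

In practice the same two-step rule wraps a base optimizer (e.g. SGD) over mini-batch gradients of the model's training loss; the toy loss here only makes the update's mechanics visible.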