This paper demonstrates that aggregating crowdsourced forecasts benefits from modeling the written justifications provided by forecasters. Our experiments show that the majority vote and weighted vote baselines are competitive, and that the written justifications are beneficial for calling a question throughout its life except in the last quarter. We also conduct an error analysis that sheds light on the characteristics that make a justification unreliable.
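For readers unfamiliar with the two baselines named above, the following is a minimal sketch of majority and weighted voting over a set of forecasts for a single question. It is an illustration only, not the authors' implementation: the function names, the answer labels, and the per-forecaster weights are all hypothetical, and the abstract does not say how the weights in the paper are derived.

```python
from collections import defaultdict


def majority_vote(forecasts):
    """Return the answer submitted by the most forecasters.

    forecasts: list of answer labels (hypothetical input format).
    Ties are broken in favor of the answer seen first.
    """
    if not forecasts:
        raise ValueError("at least one forecast is required")
    counts = defaultdict(float)
    for answer in forecasts:
        counts[answer] += 1.0
    return max(counts, key=counts.get)


def weighted_vote(forecasts, weights):
    """Return the answer with the largest total weight.

    weights: one non-negative number per forecast. How to choose them
    is an assumption left to the caller; the source does not specify it.
    """
    if not forecasts:
        raise ValueError("at least one forecast is required")
    if len(forecasts) != len(weights):
        raise ValueError("forecasts and weights must have the same length")
    totals = defaultdict(float)
    for answer, weight in zip(forecasts, weights):
        totals[answer] += weight
    return max(totals, key=totals.get)


# Hypothetical example: three forecasters answer a yes/no question.
answers = ["yes", "no", "yes"]
print(majority_vote(answers))                   # majority answer
print(weighted_vote(answers, [0.2, 0.9, 0.3]))  # weight can overturn the majority
```

In this sketch the weighted vote can disagree with the majority when a single heavily weighted forecaster outweighs the rest, which is the behavior that distinguishes the two baselines.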
References used
https://aclanthology.org/