Exchanging arguments is an important part of communication, but we are often flooded with arguments for different positions or caught in filter bubbles. Tools that present strong arguments relevant to the individual could help reduce these problems. To evaluate algorithms that predict how convincing an argument is, we have collected a dataset with more than 900 arguments and the personal attitudes of 600 individuals, which we present in this paper. Based on this data, we suggest three recommender tasks, for which we provide two baseline results: one from a simple majority classifier and one from a more complex nearest-neighbor algorithm. Our results suggest that better algorithms can still be developed, and we invite the community to improve on our results.
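To make the two kinds of baseline concrete, here is a minimal sketch of a majority classifier and a user-based nearest-neighbor predictor. It is an illustration only, not the paper's implementation: the abstract does not specify the tasks' inputs, the similarity measure, or the value of k, so the toy data, the `agreement` similarity, and every function name below are assumptions.

```python
from collections import Counter

# Hypothetical toy data: each user is a dict mapping an argument id to
# 1 (found convincing) or 0 (not convincing). Not taken from the dataset.
others = [
    {"a1": 1, "a2": 0, "a3": 1},
    {"a1": 1, "a2": 0, "a3": 1},
    {"a1": 0, "a2": 1, "a3": 0},
    {"a1": 0, "a2": 1, "a3": 0},
    {"a1": 0, "a2": 1, "a3": 0},
]
target = {"a1": 1, "a2": 0}  # user whose view on "a3" we want to predict


def majority_baseline(labels):
    """Return the most frequent label, ignoring who the target user is."""
    return Counter(labels).most_common(1)[0][0]


def agreement(a, b):
    """Fraction of commonly rated arguments on which two users agree."""
    shared = [arg for arg in a if arg in b]
    if not shared:
        return 0.0
    return sum(a[arg] == b[arg] for arg in shared) / len(shared)


def nearest_neighbor_predict(target, others, argument, k=3):
    """Vote among the k users most similar to `target` who rated `argument`."""
    candidates = [u for u in others if argument in u]
    candidates.sort(key=lambda u: agreement(target, u), reverse=True)
    votes = [u[argument] for u in candidates[:k]]
    if not votes:
        return None  # nobody rated this argument
    return Counter(votes).most_common(1)[0][0]


majority = majority_baseline([u["a3"] for u in others])
neighbor = nearest_neighbor_predict(target, others, "a3", k=2)
print(majority, neighbor)
```

On this toy data the majority baseline follows the crowd, while the nearest-neighbor predictor follows the two users whose earlier ratings match the target's, which is the kind of personalization a recommender task rewards.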
References used
https://aclanthology.org/
Counterfactual statements describe events that did not or cannot take place. We consider the problem of counterfactual detection (CFD) in product reviews. For this purpose, we annotate a multilingual CFD dataset from Amazon product reviews covering c
Precisely defining the terminology is the first step in scientific communication. Developing neural text generation models for definition generation can circumvent labor-intensive curation, further accelerating scientific discovery. Unfortunately
In translating text where sentiment is the main message, human translators give particular attention to sentiment-carrying words. The reason is that an incorrect translation of such words would miss the fundamental aspect of the source text, i.e. the
We present GerDaLIR, a German Dataset for Legal Information Retrieval based on case documents from the open legal information platform Open Legal Data. The dataset consists of 123K queries, each labelled with at least one relevant document in a colle
This paper describes the annotation process of an offensive language data set for Romanian on social media. To facilitate comparable multi-lingual research on offensive language, the annotation guidelines follow some of the recent annotation efforts