A Corpus for Modeling User and Language Effects in Argumentation on Online Debating


Abstract in English

Existing argumentation datasets have succeeded in allowing researchers to develop computational methods for analyzing the content, structure and linguistic features of argumentative text. They have been much less successful in fostering studies of the effect of user traits -- characteristics and beliefs of the participants -- on the debate/argument outcome, since this type of user information is generally not available. This paper presents a dataset of 78,376 debates generated over a 10-year period along with surprisingly comprehensive participant profiles. We also complete an example study using the dataset to analyze the effect of selected user traits on the debate outcome in comparison to the linguistic features typically employed in studies of this kind.
