Crowdsourcing can identify high-quality solutions to problems; however, individual decisions are constrained by cognitive biases. We investigate some of these biases in an experimental model of a question-answering system. In both natural and controlled experiments, we observe a strong position bias in favor of answers appearing earlier in a list of choices. This effect is enhanced by three cognitive factors: the attention an answer receives, its perceived popularity, and cognitive load, measured by the number of choices a user has to process. While separately weak, these effects synergistically amplify position bias and decouple user choices of best answers from their intrinsic quality. We conclude by discussing novel ways to apply these findings to substantially improve how high-quality answers are found in question-answering systems.
Wearable Cognitive Assistants (WCA) are anticipated to become a widely-used application class, in conjunction with emerging network infrastructures like 5G that incorporate edge computing capabilities. While prototypical studies of such applications exist today, the relationship between infrastructure service provisioning and its implications for WCA usability is largely unexplored, despite the relevance of these applications for future networks. This paper presents an experimental study assessing how WCA users react to varying end-to-end delays induced by the application pipeline or infrastructure. Participants interacted directly with an instrumented task-guidance WCA as delays were introduced into the system in a controllable fashion. System and task state were tracked in real time, and biometric data from wearable sensors on the participants were recorded. Our results show that periods of extended system delay cause users to correspondingly (and substantially) slow down in their guided task execution, an effect that persists for a time after the system returns to a more responsive state. Furthermore, the slow-down in task execution is correlated with neuroticism, a personality trait associated with intolerance for time delays. We show that our results implicate impaired cognitive planning, as contrasted with resource depletion or emotional arousal, as the reason for slowed user task execution under system delay. The findings have several implications for the design and operation of WCA applications as well as computational and communication infrastructure, and additionally for the development of performance analysis tools for WCA.
The frequency with which people interact with technology means that users may develop interface habits, i.e., fast, automatic responses to stable interface cues. Design guidelines often assume that interface habits are beneficial. However, we lack quantitative evidence of how the development of habits actually affects user performance, and an understanding of how changes in interface design may affect habit development. Our work quantifies the effect of habit formation and disruption on user performance in interaction. Through a forced-choice lab study (n=19) and an in-the-wild deployment (n=18) of a notification-dialog experiment on smartphones, we show that people become more accurate and faster at option selection as they develop an interface habit. Crucially, this performance gain is entirely eliminated once the habit is disrupted. We discuss reasons for this performance shift and analyse some disadvantages of interface habits, outlining general design patterns on how to both support and disrupt them.
Keywords: interface habits, user behaviour, breaking habits, interaction science, quantitative research.
This paper studies joint models for selecting correct answer sentences among the top $k$ provided by answer sentence selection (AS2) modules, which are core components of retrieval-based Question Answering (QA) systems. Our work shows that a critical step in effectively exploiting an answer set is modeling the interrelated information between pairs of answers. For this purpose, we build a three-way multi-classifier, which decides whether an answer supports, refutes, or is neutral with respect to another one. More specifically, our neural architecture integrates a state-of-the-art AS2 model with the multi-classifier, and a joint layer connecting all components. We tested our models on WikiQA, TREC-QA, and a real-world dataset. The results show that our models obtain the new state of the art in AS2.
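To make the described architecture concrete, the following is a minimal sketch of such a joint model in PyTorch, assuming per-candidate encodings (e.g., transformer [CLS] vectors) are precomputed. The class name `JointAS2Reranker`, the dimensions, and the mean-pooling of pair evidence are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class JointAS2Reranker(nn.Module):
    """Illustrative joint reranker: a pointwise AS2 score per candidate,
    a three-way head over answer pairs (support / refute / neutral),
    and a joint layer fusing both signals into a final score."""

    def __init__(self, hidden: int = 768, num_pair_labels: int = 3):
        super().__init__()
        # Stand-in for the AS2 model's scoring head over per-candidate
        # encodings (assumed precomputed by a pretrained encoder).
        self.as2_scorer = nn.Linear(hidden, 1)
        # Three-way classifier over concatenated answer-pair encodings.
        self.pair_head = nn.Linear(2 * hidden, num_pair_labels)
        # Joint layer connecting all components.
        self.joint = nn.Linear(hidden + 1 + num_pair_labels, 1)

    def forward(self, answer_vecs: torch.Tensor) -> torch.Tensor:
        # answer_vecs: (k, hidden), one encoding per top-k candidate.
        k, _ = answer_vecs.shape
        as2_scores = self.as2_scorer(answer_vecs)            # (k, 1)
        # All ordered pairs of candidates (self-pairs kept for brevity).
        i, j = torch.meshgrid(torch.arange(k), torch.arange(k),
                              indexing="ij")
        pairs = torch.cat([answer_vecs[i.reshape(-1)],
                           answer_vecs[j.reshape(-1)]], dim=-1)
        pair_logits = self.pair_head(pairs).view(k, k, -1)
        # Average the support/refute/neutral evidence each answer
        # receives against the other candidates.
        evidence = pair_logits.mean(dim=1)                   # (k, 3)
        fused = torch.cat([answer_vecs, as2_scores, evidence], dim=-1)
        return self.joint(fused).squeeze(-1)                 # (k,)

# Usage: rank k=5 candidates from random stand-in encodings.
model = JointAS2Reranker()
print(model(torch.randn(5, 768)).argsort(descending=True))
```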
Distributional compositional (DisCo) models are functors that compute the meaning of a sentence from the meaning of its words. We show that DisCo models in the category of sets and relations correspond precisely to relational databases. As a consequence, we get complexity-theoretic reductions from semantics and entailment of a fragment of natural language to evaluation and containment of conjunctive queries, respectively. Finally, we define question answering as an NP-complete problem.
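As a toy illustration of the stated correspondence, word meanings can be represented as relations over a small universe, so that evaluating a transitive sentence "subject verb object" amounts to evaluating the conjunctive query exists x, y with Subj(x), Verb(x, y), Obj(y) against a tiny database. The universe, predicate names, and data below are invented for illustration and are not from the paper.

```python
# Toy Rel-valued reading: nouns are unary relations (one-column tables),
# the verb is a binary relation (two-column table), and sentence truth
# is conjunctive-query evaluation over this "database".

universe = {"alice", "bob", "carol"}

# Unary relations: sets of 1-tuples.
noun = {
    "alice": {("alice",)},
    "bob": {("bob",)},
    "someone": {(e,) for e in universe},
}

# Binary relation for the verb: a set of 2-tuples.
likes = {("alice", "bob"), ("carol", "alice")}

def sentence_true(subj: str, verb: set, obj: str) -> bool:
    """Evaluate the conjunctive query:
    exists x, y such that Subj(x), Verb(x, y), Obj(y)."""
    return any((x,) in noun[subj] and (y,) in noun[obj]
               for (x, y) in verb)

print(sentence_true("alice", likes, "bob"))      # True
print(sentence_true("bob", likes, "alice"))      # False
print(sentence_true("someone", likes, "alice"))  # True: carol likes alice
```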
CQA services are collaborative platforms where users ask and answer questions. We investigate the influence of national culture on people's online questioning and answering behavior. To this end, we analyzed a sample of 200,000 Yahoo Answers users from 67 countries. We empirically measure a set of cultural metrics defined in Geert Hofstede's cultural dimensions and Robert Levine's Pace of Life, and show that behavioral cultural differences exist in community question answering platforms. We find that national cultures differ in Yahoo Answers along a number of dimensions, such as temporal predictability of activities, contribution-related behavioral patterns, privacy concerns, and power inequality.