
The Wisdom of the Few? Supertaggers in Collaborative Tagging Systems

 Added by Jared Lorince
 Publication date 2015
Language of the research: English





A folksonomy is ostensibly an information structure built up by the wisdom of the crowd, but is the crowd really doing the work? Tagging is in fact a sharply skewed process in which a small minority of supertagger users generate an overwhelming majority of the annotations. Using data from three large-scale social tagging platforms, we explore (a) how best to quantify the imbalance in tagging behavior and formally define a supertagger, (b) how supertaggers differ from other users in their tagging patterns, and (c) whether effects of motivation and expertise inform our understanding of what makes a supertagger. Our results indicate that such prolific users not only tag more than their counterparts, but also tag in quantifiably different ways. Specifically, we find that supertaggers are more likely to label content in the long tail of less popular items, that they differ in the patterns of content they tag and the terms they use, and that they are measurably different with respect to tagging expertise and motivation. These findings suggest we should question the extent to which folksonomies achieve crowdsourced classification via the wisdom of the crowd, especially for broad folksonomies like Last.fm as opposed to narrow folksonomies like Flickr.
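As a rough illustration of what quantifying the imbalance in tagging could look like, the sketch below computes two generic inequality measures over per-user annotation counts: the share of annotations contributed by the most active fraction of users, and the Gini coefficient. These are stand-in measures chosen for illustration, not necessarily the ones used in the paper; the function names and the toy counts are assumptions.

```python
def top_share(annotations_per_user, fraction):
    """Share of all annotations made by the most active `fraction` of users."""
    counts = sorted(annotations_per_user, reverse=True)
    k = max(1, round(len(counts) * fraction))  # always keep at least one user
    return sum(counts[:k]) / sum(counts)


def gini(values):
    """Gini coefficient: 0 means perfectly equal; values near 1 mean highly skewed."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    # Rank-weighted sum over the ascending-sorted values.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


# Hypothetical annotation counts for ten users (illustrative only, not real data).
counts = [1, 1, 2, 2, 3, 3, 5, 8, 25, 950]
print(top_share(counts, 0.10))
print(gini(counts))
```

In this toy sample a single user accounts for nearly all annotations, which is the kind of skew the abstract describes; where to draw the line for calling a user a supertagger is a separate choice that this sketch does not make.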




Read More

Social bookmarking systems allow users to organise collections of resources on the Web in a collaborative fashion. The increasing popularity of these systems as well as first insights into their emergent semantics have made them relevant to disciplines like knowledge extraction and ontology learning. The problem of devising methods to measure the semantic relatedness between tags and characterizing it semantically is still largely open. Here we analyze three measures of tag relatedness: tag co-occurrence, cosine similarity of co-occurrence distributions, and FolkRank, an adaptation of the PageRank algorithm to folksonomies. Each measure is computed on tags from a large-scale dataset crawled from the social bookmarking system del.icio.us. To provide a semantic grounding of our findings, a connection to WordNet (a semantic lexicon for the English language) is established by mapping tags into WordNet synonym sets and applying well-known metrics of semantic similarity there. Our results clearly expose different characteristics of the selected measures of relatedness, making them applicable to different subtasks of knowledge extraction such as synonym detection or discovery of concept hierarchies.
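The difference between the first two measures can be sketched in a few lines. The example below assumes each post is simply a list of tags, counts how often two tags appear in the same post, and then compares tags by the cosine of their co-occurrence vectors. The data layout, function names, and toy posts are assumptions for illustration; the paper's exact construction of the co-occurrence distributions may differ, and FolkRank is not covered here.

```python
import math
from collections import defaultdict
from itertools import permutations


def cooccurrence(posts):
    """Map each tag to a vector {other_tag: number of posts shared with it}."""
    vectors = defaultdict(lambda: defaultdict(int))
    for tags in posts:
        for a, b in permutations(set(tags), 2):
            vectors[a][b] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical posts (illustrative only, not from the del.icio.us dataset).
posts = [
    ["python", "code", "tutorial"],
    ["programming", "code", "tutorial"],
    ["recipe", "food"],
]
vectors = cooccurrence(posts)
print(vectors["python"].get("programming", 0))          # direct co-occurrence count
print(cosine(vectors["python"], vectors["programming"]))  # distributional similarity
```

In the toy data, "python" and "programming" never appear in the same post, so plain co-occurrence scores them as unrelated, while their co-occurrence vectors are nearly identical. This is one way two measures of relatedness can expose different characteristics.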
We analyze a large-scale snapshot of del.icio.us and investigate how the number of different tags in the system grows as a function of a suitably defined notion of time. We study the temporal evolution of the global vocabulary size, i.e., the number of distinct tags in the entire system, as well as the evolution of local vocabularies, that is, the growth of the number of distinct tags used in the context of a given resource or user. In both cases, we find power-law behaviors with exponents smaller than one. Surprisingly, the observed growth behaviors are remarkably regular throughout the entire history of the system and across very different resources being bookmarked. Similar sub-linear laws of growth have been observed in written text, and this qualitative universality calls for an explanation and points in the direction of non-trivial cognitive processes in the complex interaction patterns characterizing collaborative tagging.
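A minimal sketch of how such a growth exponent might be estimated follows. It assumes "time" is measured as the number of tag assignments seen so far (one possible reading of a suitably defined notion of time) and fits the exponent by ordinary least squares on log-log axes, which is a simple choice and not necessarily the authors' fitting procedure. All names are hypothetical.

```python
import math


def vocabulary_growth(tag_stream):
    """Number of distinct tags seen after each tag assignment."""
    seen = set()
    sizes = []
    for tag in tag_stream:
        seen.add(tag)
        sizes.append(len(seen))
    return sizes


def fit_exponent(sizes):
    """Least-squares slope of log(size) against log(t), with t = 1, 2, ..."""
    xs = [math.log(t) for t in range(1, len(sizes) + 1)]
    ys = [math.log(n) for n in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Synthetic sub-linear growth N(t) = t**0.7, used only to check the fit.
synthetic = [t ** 0.7 for t in range(1, 1001)]
print(fit_exponent(synthetic))
print(vocabulary_growth(["web", "design", "web", "css"]))
```

A fitted exponent below one corresponds to the sub-linear growth described in the abstract: new tags keep appearing, but at a decreasing rate.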
Human groups can perform extraordinarily accurate estimations compared to individuals by simply using the mean, median or geometric mean of the individual estimations [Galton 1907, Surowiecki 2005, Page 2008]. However, this is true only for some tasks, and in general these collective estimations show strong biases. The method also fails when social interactions are allowed, which makes the collective estimation worse as individuals tend to converge to the biased result [Lorenz et al. 2011]. Here we show that there is a bright side to this apparently negative impact of social interactions on collective intelligence. We found that some individuals resist the social influence and, when using the median of this subgroup, we can eliminate the bias of the wisdom of the full crowd. To find this subgroup of individuals more confident in their private estimations than in the social influence, we model individuals as estimators that combine private and social information with different relative weights [Perez-Escudero & de Polavieja 2011, Arganda et al. 2012]. We then computed the geometric mean for increasingly smaller groups by eliminating those whose estimations use higher values of the social influence weight. The trend obtained in this procedure gives unbiased results, in contrast to the simpler method of computing the median of the complete group. Our results show that, while a simple operation like the mean, median or geometric mean of a group may not allow groups to make good estimations, a more complex operation taking into account individuality in the social dynamics can lead to a better collective intelligence.
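The aggregation step can be illustrated with Python's standard `statistics` module. The sketch below assumes that each individual's social influence weight is already known; in the work described above those weights come from a fitted model of how individuals combine private and social information, which is not reproduced here. The threshold, function names, and numbers are hypothetical.

```python
from statistics import geometric_mean, mean, median


def aggregate(estimates):
    """Three simple crowd aggregates of individual estimates."""
    return {
        "mean": mean(estimates),
        "median": median(estimates),
        "geometric_mean": geometric_mean(estimates),
    }


def median_of_independent(estimates, social_weights, max_weight):
    """Median over individuals whose social influence weight is at most max_weight."""
    kept = [e for e, w in zip(estimates, social_weights) if w <= max_weight]
    return median(kept)


# Hypothetical estimates and weights (illustrative only, not experimental data).
estimates = [80, 100, 120, 300, 320]
social_weights = [0.1, 0.2, 0.1, 0.9, 0.8]
print(aggregate(estimates))
print(median_of_independent(estimates, social_weights, max_weight=0.5))
```

Lowering `max_weight` step by step and recording the aggregate at each step gives a trend over increasingly smaller groups, in the spirit of the procedure the abstract describes.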
In this work, we study the utility of graph embeddings to generate latent user representations for trust-based collaborative filtering. In a cold-start setting, on three publicly available datasets, we evaluate approaches from four method families: (i) factorization-based, (ii) random walk-based, (iii) deep learning-based, and (iv) the Large-scale Information Network Embedding (LINE) approach. We find that across the four families, random-walk-based approaches consistently achieve the best accuracy. Besides, they result in highly novel and diverse recommendations. Furthermore, our results show that the use of graph embeddings in trust-based collaborative filtering significantly improves user coverage.
Folksonomies provide a rich source of data to study social patterns taking place on the World Wide Web. Here we study the temporal patterns of users' tagging activity. We show that the statistical properties of inter-arrival times between subsequent tagging events cannot be explained without taking into account correlation in users' behaviors. This shows that social interaction in collaborative tagging communities shapes the evolution of folksonomies. A consensus formation process involving the usage of a small number of tags for a given resource is observed through a numerical and analytical analysis of some well-known folksonomy datasets.