Wikipedia is a free Internet encyclopedia with an enormous amount of content. The encyclopedia is written collectively by volunteers with various backgrounds; anyone can access and edit most of the articles. This open-editing nature may invite the prejudice that Wikipedia is an unstable and unreliable source, yet many studies suggest that Wikipedia is even more accurate and self-consistent than traditional encyclopedias. Scholars have attempted to understand this extraordinary credibility, but have usually used the number of edits as the unit of time, without considering real time. In this work, we probe the formation of such collective intelligence through a systematic analysis of the entire history of 34,534,110 English Wikipedia articles between 2001 and 2014. From this massive data set, we observe universality in both the timewise and lengthwise editing scales, which suggests that it is essential to consider the real-time dynamics. By considering real time, we find distinct growth patterns that remain unobserved when the number of edits is used as the unit of time. To account for these results, we present a mechanistic model of article editing dynamics based on both editor-editor and editor-article interactions. The model successfully reproduces key properties of real Wikipedia articles, such as distinct article types whose editing patterns are characterized by the interrelation between the number of edits, the number of editors, and the article size. In addition, the model indicates that infrequently referenced articles tend to grow faster than frequently referenced ones, and that articles attracting high motivation to edit counterintuitively end up with fewer participants. We suggest that this decay in participation eventually brings inequality among the editors, which will become more severe with time.
A number of human activities exhibit a bursty pattern, namely periods of very high activity followed by periods of rest. Records of such processes generate time series of events whose inter-event times follow a probability distribution with a fat tail. The origins of this phenomenon are not yet clearly understood. In the present work we use the freely available Wikipedia editing records to tackle this question by measuring the level of burstiness, as well as the memory effect, of the editing tasks performed by different editors on different pages. Our main finding is that, even though the editing activity is conditioned by the 24-hour circadian cycle, the conditional probability of an activity of a given duration at a given time of day is independent of the latter. This suggests that the bursty pattern of human activity is related to the high cost of starting an action, as opposed to the much lower cost of continuing it.
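The abstract does not specify the estimators; the standard choices, which plausibly apply here, are the Goh-Barabasi burstiness coefficient B = (sigma - mu)/(sigma + mu) and the memory coefficient M, the Pearson correlation between consecutive inter-event times. A minimal Python sketch under that assumption:

```python
import numpy as np

def burstiness(tau):
    """Goh-Barabasi burstiness B = (sigma - mu) / (sigma + mu):
    B -> 1 for highly bursty signals, ~0 for Poissonian,
    -1 for perfectly regular ones."""
    tau = np.asarray(tau, dtype=float)
    mu, sigma = tau.mean(), tau.std()
    return (sigma - mu) / (sigma + mu)

def memory(tau):
    """Memory coefficient M: Pearson correlation between
    consecutive inter-event times."""
    tau = np.asarray(tau, dtype=float)
    return np.corrcoef(tau[:-1], tau[1:])[0, 1]

# Example with synthetic Poissonian edit timestamps (seconds);
# real input would be one editor's edit times on one page.
rng = np.random.default_rng(42)
timestamps = rng.exponential(3600.0, size=500).cumsum()
tau = np.diff(timestamps)
print(f"B = {burstiness(tau):.3f}, M = {memory(tau):.3f}")
```

For the Poissonian example both coefficients come out near zero; bursty editing records would push B toward 1.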
It is generally accepted that financial networks are degree-disassortative, i.e., the degrees of neighboring nodes are negatively correlated. This feature would play an important damping role in the market during downturns (periods of distress), since this connectivity pattern between firms lowers the chances of distress propagating and self-amplifying. In this paper we explore a trade network of industrial firms where the nodes are suppliers or buyers, and the links are the invoices that suppliers send to their buyers and then present to their bank for discounting. The network was collected by a large Italian bank in 2007 through its intermediation of the sales on credit made by its clients. The network shows the same disassortative behavior seen in other studies of financial networks. However, when looking at the credit rating of the firms, an important attribute internal to each node, we find that firms that trade with one another are overwhelmingly similar. We know that much data is missing from our data set. However, we can quantify the amount of missing data using information exposure, a variable that connects social structure and behavior: the ratio of the sales invoices that a supplier presents to the bank to the supplier's total sales. Results reveal a non-trivial and robust relationship between the information exposure and the credit rating of a firm, indicating the influence of the neighbors on a firm's rating. This methodology provides new insight into how to reconstruct a network suffering from incomplete information.
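Both quantities discussed above are straightforward to compute; a minimal sketch using networkx on a hypothetical toy trade network (the graph, ratings, and sales figures below are illustrative assumptions, not the paper's data):

```python
import networkx as nx

# Hypothetical toy trade network: nodes are firms (suppliers/buyers),
# edges are discounted invoices; 'rating' is a firm's credit-rating class.
G = nx.Graph()
G.add_nodes_from([
    (1, {"rating": "A"}), (2, {"rating": "A"}),
    (3, {"rating": "B"}), (4, {"rating": "B"}), (5, {"rating": "A"}),
])
G.add_edges_from([(1, 2), (1, 5), (2, 5), (3, 4), (1, 3)])

# Degree disassortativity: negative values mean high-degree firms
# attach preferentially to low-degree ones.
print(nx.degree_assortativity_coefficient(G))

# Rating similarity between trading partners: positive values mean
# firms tend to trade with similarly rated firms.
print(nx.attribute_assortativity_coefficient(G, "rating"))

# Information exposure of one supplier: invoices presented to the bank
# divided by total sales (both figures hypothetical).
invoices_presented, total_sales = 1.2e6, 4.0e6
print(invoices_presented / total_sales)  # 0.3
```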
We perform an in-depth analysis of the inequality in 863 Wikimedia projects. We take the complete editing history of 267,304,095 Wikimedia items until 2016, which not only covers every language edition of Wikipedia, but also embraces the complete set of Wikimedia sister projects.
In their recent work "Scale-free networks are rare," Broido and Clauset address the problem of analyzing degree distributions in networks in order to classify them as scale-free at different strengths of scale-freeness. Over the last two decades, a multitude of papers in network science have reported that the degree distributions of many real-world networks follow power laws. Such networks were then referred to as scale-free. However, due to the lack of a precise definition, the term has evolved to mean a range of different things, leading to confusion and contradictory claims regarding the scale-freeness of a given network. Recognizing this problem, the authors of "Scale-free networks are rare" try to fix it. They attempt to develop a versatile and statistically principled approach to remove the scale-free ambiguity accumulated in the network science literature. Although their paper presents a fair attempt to address this fundamental problem, we must bring attention to some important issues in it.
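The statistical machinery at issue is the likelihood-based fitting and model comparison of Clauset, Shalizi and Newman (2009). A minimal sketch of that style of analysis using the Python powerlaw package (the package and the synthetic degree sequence are illustrative assumptions, not part of the abstract):

```python
import numpy as np
import powerlaw  # Alstott, Bullmore & Plenz (2014)

# Hypothetical degree sequence; in practice this would come from a network.
rng = np.random.default_rng(1)
degrees = rng.zipf(2.5, size=10000)

# Maximum-likelihood fit of a discrete power law, with x_min selected
# by minimizing the Kolmogorov-Smirnov distance (Clauset et al., 2009).
fit = powerlaw.Fit(degrees, discrete=True)
print(fit.power_law.alpha, fit.power_law.xmin)

# Likelihood-ratio test against a lognormal alternative:
# R > 0 favors the power law; p is the significance of the sign of R.
R, p = fit.distribution_compare("power_law", "lognormal")
print(R, p)
```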
A model for the probability function governing Wikipedia editing is presented and compared with simulations and real data. It is argued that the probability to edit is proportional to the editor's number of previous edits (preferential attachment), to the editor's fitness, and to an ageing factor. Using these simple ingredients, it is possible to reproduce the results obtained for Wikipedia editing dynamics for a collection of single pages as well as the averaged results. Using a stochastic-process framework, a recursive equation was obtained for the average number of edits per editor that seems to describe the editing behaviour in Wikipedia.
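The abstract states only the proportionality; the functional forms below (an exponential ageing factor and exponentially distributed fitness) are assumptions chosen for illustration. A minimal simulation sketch of the editing rule p_i proportional to k_i * eta_i * a(t - t_i):

```python
import numpy as np

rng = np.random.default_rng(0)

T, N = 5000, 200                    # time steps and number of editors
edits = np.ones(N)                  # k_i: previous edits per editor (seeded at 1)
fitness = rng.exponential(1.0, N)   # eta_i: editor fitness (assumed distribution)
birth = rng.integers(0, T, N)       # t_i: time each editor becomes active
tau = 1000.0                        # assumed ageing time scale

for t in range(T):
    active = birth <= t
    # Editing propensity: k_i * eta_i * exp(-(t - t_i) / tau) for active editors
    w = np.where(active, edits * fitness * np.exp(-(t - birth) / tau), 0.0)
    total = w.sum()
    if total == 0.0:
        continue  # no editor active yet
    chosen = rng.choice(N, p=w / total)  # one edit per time step
    edits[chosen] += 1

print("mean edits per editor:", edits.mean())
print("most active editor's edits:", edits.max())
```

With these ingredients a small fraction of early, high-fitness editors accumulates most of the edits, the kind of heavy-tailed editor activity the model is meant to reproduce.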