Intervals between discrete events representing human activities, as well as other types of events, often obey heavy-tailed distributions, and their impacts on collective dynamics on networks such as contagion processes have been intensively studied. The literature supports that such heavy-tailed distributions are present for inter-event times associated with both individual nodes and individual edges in networks. However, the simultaneous presence of heavy-tailed distributions of inter-event times for nodes and edges is a non-trivial phenomenon, and its origin has been elusive. In the present study, we propose a generative model and its variants to explain this phenomenon. We assume that each node independently transits between a high-activity and low-activity state according to a continuous-time two-state Markov process and that, for the main model, events on an edge occur at a high rate if and only if both end nodes of the edge are in the high-activity state. In other words, two nodes interact frequently only when both nodes prefer to interact with others. The model produces distributions of inter-event times for both individual nodes and edges that resemble heavy-tailed distributions across some scales. It also produces positive correlation in consecutive inter-event times, which is another stylized observation for empirical data of human activity. We expect that our modeling framework provides a useful benchmark for investigating dynamics on temporal networks driven by non-Poissonian event sequences.
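The main model described above can be illustrated for a single edge with a minimal Gillespie-style simulation. This is a sketch under stated assumptions, not the authors' implementation: the function names, the rate values (`to_high`, `to_low`, `rate_high`, `rate_low`), and the choice to start both nodes in the low-activity state are all hypothetical and chosen only for illustration.

```python
import random


def simulate_edge_events(t_max, to_high=0.1, to_low=0.1,
                         rate_high=10.0, rate_low=0.1, seed=0):
    """Simulate event times on one edge whose two end nodes each follow an
    independent continuous-time two-state Markov process.

    All parameter values are illustrative assumptions, not taken from the source.
    """
    rng = random.Random(seed)
    state = [0, 0]  # 0 = low-activity, 1 = high-activity (assumed initial state)
    t = 0.0
    events = []
    while True:
        # Switching rate of each node depends on its current state.
        switch = [to_low if s else to_high for s in state]
        # Events occur at the high rate if and only if both nodes are high.
        event_rate = rate_high if state == [1, 1] else rate_low
        total = switch[0] + switch[1] + event_rate
        t += rng.expovariate(total)  # waiting time to the next transition
        if t >= t_max:
            break
        r = rng.random() * total  # pick which transition fires
        if r < switch[0]:
            state[0] ^= 1
        elif r < switch[0] + switch[1]:
            state[1] ^= 1
        else:
            events.append(t)
    return events


def inter_event_times(events):
    """Return the gaps between consecutive event times."""
    return [b - a for a, b in zip(events, events[1:])]
```

With well-separated rates, the inter-event times mix short gaps (both nodes high) with long gaps (at least one node low), which is the mechanism the abstract credits for the broad distributions on edges.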
A number of human activities exhibit a bursty pattern, namely periods of very high activity followed by rest periods. Records of these processes generate time series of events whose inter-event times follow a fat-tailed probability distribution. The origins of this phenomenon are not yet clearly understood. In the present work we use the freely available Wikipedia editing records to unravel some features of this phenomenon. We show that even though the probability of starting to edit is conditioned by the circadian 24-hour cycle, the conditional probability for the time interval between successive edits at a given time of day is independent of that time. We confirm our findings with posting activity on the social network Twitter. Our results suggest an intrinsic human scheduling pattern: after overcoming the encumbrance to start an activity, there is a robust distribution of new related actions, which does not depend on the time of day.
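The conditioning on time of day described above can be sketched as a simple grouping of inter-event intervals by the hour in which each interval starts. This is only an illustrative sketch: the function name `intervals_by_hour` is hypothetical, and it assumes timestamps are given in seconds on a clock where hour boundaries fall on multiples of 3600 (for example, Unix time read as UTC), which may differ from the time-zone handling used in the study.

```python
from collections import defaultdict


def intervals_by_hour(timestamps):
    """Group inter-event intervals by the hour of day (0-23) at which
    each interval starts.

    Assumes timestamps are in seconds and that hour boundaries fall on
    multiples of 3600 (an assumption for illustration).
    """
    ts = sorted(timestamps)
    groups = defaultdict(list)
    for start, end in zip(ts, ts[1:]):
        hour = int(start // 3600) % 24
        groups[hour].append(end - start)
    return dict(groups)
```

Comparing the empirical interval distributions across the 24 groups is one way to examine whether the interval distribution depends on the time of day.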
Real-world complex systems often comprise many distinct types of elements as well as many more types of networked interactions between elements. When the relative abundances of types can be measured well, we further observe heavy-tailed categorical distributions for type frequencies. For the comparison of type frequency distributions of two systems, or of a system with itself at different points in time (a facet of allotaxonometry), a wide range of probability divergences is available. Here, we introduce and explore 'probability-turbulence divergence', a tunable, straightforward, and interpretable instrument for comparing normalizable categorical frequency distributions. We model probability-turbulence divergence (PTD) after rank-turbulence divergence (RTD). While probability-turbulence divergence is more limited in application than rank-turbulence divergence, it is more sensitive to changes in type frequency. We build allotaxonographs to display probability turbulence, incorporating a way to visually accommodate zero probabilities for 'exclusive types', which are types that appear in only one system. We explore comparisons of example distributions taken from literature, social media, and ecology. We show how probability-turbulence divergence either explicitly or functionally generalizes many existing kinds of distances and measures, including, as special cases, $L^{(p)}$ norms, the Sørensen-Dice coefficient (the $F_1$ statistic), and the Hellinger distance. We discuss similarities with the generalized entropies of Rényi and Tsallis, and the diversity indices (or Hill numbers) from ecology. We close with thoughts on open problems concerning the optimization of the tuning of rank- and probability-turbulence divergence.
We propose and analyze a new estimator of the covariance matrix that admits strong theoretical guarantees under weak assumptions on the underlying distribution, such as existence of moments of only low order. While estimation of covariance matrices corresponding to sub-Gaussian distributions is well understood, much less is known in the case of heavy-tailed data. As K. Balasubramanian and M. Yuan write, "data from real-world experiments oftentimes tend to be corrupted with outliers and/or exhibit heavy tails. In such cases, it is not clear that those covariance matrix estimators ... remain optimal" and "... what are the other possible strategies to deal with heavy tailed distributions warrant further studies." We make a step towards answering this question and prove tight deviation inequalities for the proposed estimator that depend only on the parameters controlling the intrinsic dimension associated with the covariance matrix (as opposed to the dimension of the ambient space); in particular, our results are applicable in the case of high-dimensional observations.
We discuss a model of the motion of a substance through the nodes of a channel of a network. The channel can be modeled by a chain of urns in which each urn can exchange substance with the neighboring urns. In addition, the urns can exchange substance with the network nodes, and the new point is that we include in the model the possibility of exchange of substance between the urns (nodes) and the environment of the network. We consider the stationary regime of motion of substance through a finite channel (the stationary regime of exchange of substance along the chain of urns) and obtain a class of statistical distributions of substance in the nodes of the channel. Our attention is focused on this class of distributions, and we show that for the case of a finite channel the obtained class of distributions contains as particular cases truncat
Folksonomies provide a rich source of data to study social patterns taking place on the World Wide Web. Here we study the temporal patterns of users' tagging activity. We show that the statistical properties of inter-arrival times between subsequent tagging events cannot be explained without taking into account correlations in users' behaviors. This shows that social interaction in collaborative tagging communities shapes the evolution of folksonomies. A consensus formation process involving the usage of a small number of tags for a given resource is observed through numerical and analytical analysis of some well-known folksonomy datasets.