Research papers and master's and doctoral theses published by Manuel Cebrian

Social media fingerprints of unemployment

124 - Alejandro Llorente , Manuel Garcia-Herranz , Manuel Cebrian 2014

Recent wide-spread adoption of electronic and pervasive technologies has enabled the study of human behavior at an unprecedented level, uncovering universal patterns underlying human activity, mobility, and inter-personal communication. In the present work, we investigate whether deviations from these universal patterns may reveal information about the socio-economical status of geographical regions. We quantify the extent to which deviations in diurnal rhythm, mobility patterns, and communication styles across regions relate to their unemployment incidence. For this we examine a country-scale publicly articulated social media dataset, where we quantify individual behavioral features from over 145 million geo-located messages distributed among more than 340 different Spanish economic regions, inferred by computing communities of cohesive mobility fluxes. We find that regions exhibiting more diverse mobility fluxes, earlier diurnal rhythms, and more correct grammatical styles display lower unemployment rates. As a result, we provide a simple model able to produce accurate, easily interpretable reconstruction of regional unemployment incidence from their social-media digital fingerprints alone. Our results show that cost-effective economical indicators can be built based on publicly-available social media datasets.

Physics and Society Social and Information Networks Data Analysis Statistics and Probability

Using Friends as Sensors to Detect Global-Scale Contagious Outbreaks

107 - Manuel Garcia-Herranz , Esteban Moro Egido , Manuel Cebrian 2012

Recent research has focused on the monitoring of global-scale online data for improved detection of epidemics, mood patterns, movements in the stock market, political revolutions, box-office revenues, consumer behaviour and many other important phenomena. However, privacy considerations and the sheer scale of data available online are quickly making global monitoring infeasible, and existing methods do not take full advantage of local network structure to identify key nodes for monitoring. Here, we develop a model of the contagious spread of information in a global-scale, publicly-articulated social network and show that a simple method can yield not just early detection, but advance warning of contagious outbreaks. In this method, we randomly choose a small fraction of nodes in the network and then we randomly choose a friend of each node to include in a group for local monitoring. Using six months of data from most of the full Twittersphere, we show that this friend group is more central in the network and it helps us to detect viral outbreaks of the use of novel hashtags about 7 days earlier than we could with an equal-sized randomly chosen group. Moreover, the method actually works better than expected due to network structure alone because highly central actors are both more active and exhibit increased diversity in the information they transmit to others. These results suggest that local monitoring is not just more efficient, it is more effective, and it is possible that other contagious processes in global-scale networks may be similarly monitored.
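The sampling step described in this abstract (pick a random set of nodes, then one random friend of each) can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names, the adjacency-dictionary representation, and the toy star graph are assumptions introduced here.

```python
import random


def sample_friend_sensors(adjacency, fraction, rng):
    """Pick a random fraction of nodes (control group), then one random
    friend of each picked node (sensor group)."""
    # Isolated nodes have no friend to nominate, so they are skipped.
    nodes = [n for n in adjacency if adjacency[n]]
    k = max(1, int(fraction * len(nodes)))
    control = rng.sample(nodes, k)
    sensors = [rng.choice(adjacency[n]) for n in control]
    return control, sensors


def mean_degree(adjacency, group):
    """Average number of neighbours over the members of a group."""
    return sum(len(adjacency[n]) for n in group) / len(group)


def build_star(n_leaves):
    """Hypothetical toy network: node 0 is a hub linked to every leaf."""
    adjacency = {0: list(range(1, n_leaves + 1))}
    for leaf in range(1, n_leaves + 1):
        adjacency[leaf] = [0]
    return adjacency


if __name__ == "__main__":
    adj = build_star(50)
    control, sensors = sample_friend_sensors(adj, 0.2, random.Random(0))
    print("control mean degree:", mean_degree(adj, control))
    print("sensor mean degree: ", mean_degree(adj, sensors))
```

On the toy star graph the friend group is dominated by the hub, so its mean degree is at least that of the random group. Real networks are far less extreme, but the same sampling bias toward well-connected nodes is what the friend group relies on.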

Social and Information Networks Physics and Society

Experimental study of the impact of historical information in human coordination

174 - Manuel Cebrian , Ramamohan Paturi , Daniel Ricketts 2012

We perform laboratory experiments to elucidate the role of historical information in games involving human coordination. Our approach follows prior work studying human network coordination using the task of graph coloring. We first motivate this research by showing empirical evidence that the resolution of coloring conflicts is dependent upon the recent local history of that conflict. We also conduct two tailored experiments to manipulate the game history that can be used by humans in order to determine (i) whether humans use historical information, and (ii) whether they use it effectively. In the first variant, during the course of each coloring task, the network positions of the subjects were periodically swapped while maintaining the global coloring state of the network. In the second variant, participants completed a series of 2-coloring tasks, some of which were restarts from checkpoints of previous tasks. Thus, the participants restarted the coloring task from a point in the middle of a previous task without knowledge of the history that led to that point. We report on the game dynamics and average completion times for the diverse graph topologies used in the swap and restart experiments.

Social and Information Networks Physics and Society

The Weakness of Weak Ties in the Classroom

69 - Luis M. Vaquero , Manuel Cebrian 2012

Granovetter's strength of weak ties hypothesizes that isolated social ties offer limited access to external prospects, while heterogeneous social ties diversify one's opportunities. We analyze the most complete record of college student interactions to date (approximately 80,000 interactions by 290 students -- 16 times more interactions with almost 3 times more students than previous studies on educational networks) and compare the social interaction data with the academic scores of the students. Our first finding is that social diversity is negatively correlated with performance. This is explained by our second finding: highly performing students interact in groups of similarly performing peers. This effect is stronger the higher the student performance is. Indeed, low performance students tend to initiate many transient interactions independently of the performance of their target. In other words, low performing students act disassortatively with respect to their social network, whereas high scoring students act assortatively. Our data also reveal that highly performing students establish persistent interactions before mid and low performing ones and that they use more structured and longer cascades of information from which low performing students are excluded.

Social and Information Networks Physics and Society

Overcoming Problems in the Measurement of Biological Complexity

49 - Manuel Cebrian , Manuel Alfonseca , 2010

In a genetic algorithm, fluctuations of the entropy of a genome over time are interpreted as fluctuations of the information that the genome's organism is storing about its environment, which is reflected in more complex organisms. The computation of this entropy presents technical problems due to the small population sizes used in practice. In this work we propose and test an alternative way of measuring the entropy variation in a population by means of algorithmic information theory, where the entropy variation between two generational steps is the Kolmogorov complexity of the first step conditioned on the second one. As an example application of this technique, we report experimental differences in entropy evolution between systems in which sexual reproduction is present or absent.
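Kolmogorov complexity is not computable, so any practical use of the conditional complexity mentioned in this abstract needs an estimate. A common stand-in is a general-purpose compressor; the sketch below assumes that approach with zlib. The function names and the byte-string encoding of a generation are illustrative choices made here, and the abstract does not state which estimator the authors used.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (a crude upper bound
    on its Kolmogorov complexity)."""
    return len(zlib.compress(data, 9))


def conditional_complexity(first: bytes, second: bytes) -> int:
    """Estimate K(first | second) as C(second + first) - C(second).

    Assumption: the compressor can reuse material from `second` while
    encoding `first`, which for zlib only holds within its 32 KiB window.
    """
    return compressed_size(second + first) - compressed_size(second)


if __name__ == "__main__":
    # Hypothetical generations encoded as concatenated genome strings.
    gen_a = b"0110100110010110" * 64
    gen_b = b"0110100110010111" * 64
    print(conditional_complexity(gen_a, gen_b))
```

A small value suggests that the first generation adds little information once the second is known. A large value suggests the two steps share little structure.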

Computational Engineering Neural and Evolutionary Computing Adaptation and Self-Organizing Systems

Grammatical Evolution with Restarts for Fast Fractal Generation

147 - Manuel Cebrian , Manuel Alfonseca , Alfonso Ortega 2010

In a previous work, the authors proposed a Grammatical Evolution algorithm to automatically generate Lindenmayer Systems which represent fractal curves with a pre-determined fractal dimension. This paper gives strong statistical evidence that the probability distribution of the execution time of that algorithm exhibits a heavy tail with a hyperbolic probability decay for long executions, which explains the erratic performance of different executions of the algorithm. Three different restart strategies have been incorporated in the algorithm to mitigate the problems associated with heavy-tailed distributions: the first assumes full knowledge of the execution time probability distribution, the second and third assume no knowledge. These strategies exploit the fact that the probability of finding a solution in short executions is non-negligible and yield a severe reduction, both in the expected execution time (up to one order of magnitude) and in its variance, which is reduced from an infinite to a finite value.
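The abstract does not name its no-knowledge restart strategies. One well-known schedule of that kind is the Luby sequence (1, 1, 2, 1, 1, 2, 4, ...), used here purely as an illustration. In this sketch `attempt` is a hypothetical callable that runs the randomized algorithm under a time budget and returns `None` if it did not finish.

```python
def luby(i):
    """Return the i-th term (1-indexed) of the Luby restart sequence."""
    k = 1
    while (1 << k) - 1 < i:
        k += 1
    if (1 << k) - 1 == i:
        # i is of the form 2^k - 1: the term is 2^(k-1).
        return 1 << (k - 1)
    # Otherwise the sequence repeats its own prefix.
    return luby(i - (1 << (k - 1)) + 1)


def run_with_restarts(attempt, unit, max_restarts):
    """Call attempt(budget) with budgets unit * luby(1), unit * luby(2), ...

    Returns (result, number_of_runs); result is None if every run
    exhausted its budget.
    """
    for i in range(1, max_restarts + 1):
        result = attempt(unit * luby(i))
        if result is not None:
            return result, i
    return None, max_restarts


if __name__ == "__main__":
    print([luby(i) for i in range(1, 16)])
```

Because most budgets in the schedule are short, a run that happens to be fast is found quickly, while occasional longer budgets still allow slow-but-necessary runs to complete.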

Neural and Evolutionary Computing Symbolic Computation

Modeling Dynamical Influence in Human Interaction Patterns

105 - Wei Pan , Manuel Cebrian , Wen Dong 2010

How can we model influence between individuals in a social system, even when the network of interactions is unknown? In this article, we review the literature on the influence model, which utilizes independent time series to estimate how much the state of one actor affects the state of another actor in the system. We extend this model to incorporate dynamical parameters that allow us to infer how influence changes over time, and we provide three examples of how this model can be applied to simulated and real data. The results show that the model can recover known estimates of influence, it generates results that are consistent with other measures of social networks, and it allows us to uncover important shifts in the way states may be transmitted between actors at different points in time.

Social and Information Networks Physics and Society

Modeling Corporate Epidemiology

83 - Benjamin Waber , Ellen Pollock , Manuel Cebrian 2010

Corporate response to illness is currently an ad-hoc, subjective process that has little basis in data on how disease actually spreads at the workplace. Additionally, many studies have shown that productivity is not an individual factor but a social one: in any study on epidemic responses this social factor has to be taken into account. The barrier to addressing this problem has been the lack of data on the interaction and mobility patterns of people in the workplace. We have created a wearable Sociometric Badge that senses interactions between individuals using an infra-red (IR) transceiver and proximity using a radio transmitter. Using the data from the Sociometric Badges, we are able to simulate diseases spreading through face-to-face interactions with realistic epidemiological parameters. In this paper we construct a curve trading off productivity with epidemic potential. We are able to take into account impacts on productivity that arise from social factors, such as interaction diversity and density, which studies that take an individual approach ignore. We also propose new organizational responses to diseases that take into account behavioral patterns that are associated with a more virulent disease spread. This is advantageous because it will allow companies to decide appropriate responses based on the organizational context of a disease outbreak.

Computers and Society Social and Information Networks

Evaluating the Impact of Information Distortion on Normalized Compression Distance

187 - Ana Granados , Manuel Cebrian , David Camacho 2008

In this paper we apply different techniques of information distortion on a set of classical books written in English. We study the impact that these distortions have upon the Kolmogorov complexity and the clustering by compression technique (the latter based on Normalized Compression Distance, NCD). We show how to decrease the complexity of the considered books by introducing several modifications in them. We measure how the information contained in each book is maintained using a clustering error measure. We find experimentally that the best way to keep the clustering error is by means of modifications in the most frequent words. We explain the details of these information distortions and we compare with other kinds of modifications like random word distortions and infrequent word distortions. Finally, some phenomenological explanations from the different empirical results that have been carried out are presented.
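For reference, the Normalized Compression Distance is usually defined as NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length. The sketch below computes it with zlib. The choice of compressor and the function names are assumptions made for illustration, since the abstract does not say which compressor was used.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance between two byte strings.

    Values near 0 indicate that the inputs share most of their
    information; values near 1 indicate that they share almost none.
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    a = b"It was the best of times, it was the worst of times. " * 20
    b = b"Call me Ishmael. Some years ago, never mind how long. " * 20
    print("ncd(a, a) =", ncd(a, a))
    print("ncd(a, b) =", ncd(a, b))
```

Clustering by compression then builds a distance matrix from pairwise `ncd` values and feeds it to a standard clustering method. Note that zlib's 32 KiB window limits how well this works on whole books, so a compressor with a larger context would be a more realistic choice for texts of that size.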

Information Theory

Exploiting Heavy Tails in Training Times of Multilayer Perceptrons: A Case Study with the UCI Thyroid Disease Database

35 - Manuel Cebrian , Ivan Cantador 2007

The random initialization of weights of a multilayer perceptron makes it possible to model its training process as a Las Vegas algorithm, i.e. a randomized algorithm which stops when some required training error is obtained, and whose execution time is a random variable. This modeling is used to perform a case study on a well-known pattern recognition benchmark: the UCI Thyroid Disease Database. Empirical evidence is presented that the training time probability distribution exhibits heavy-tailed behavior, meaning a large probability mass of long executions. This fact is exploited to reduce the training time cost by applying two simple restart strategies. The first assumes full knowledge of the distribution, yielding a 40% reduction in expected time with respect to training without restarts. The second assumes no knowledge, yielding a reduction ranging from 9% to 23%.
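When the run-time distribution is known, a fixed cutoff c gives an expected total time of E[min(T, c)] / P(T <= c), because each failed run costs c and a run succeeds with probability P(T <= c). The sketch below estimates this from a list of observed training times and picks the best cutoff among the observed values. The function names and the sample data are hypothetical and are not taken from the paper.

```python
def expected_time_with_cutoff(samples, cutoff):
    """Estimate the expected total time when every run is abandoned and
    restarted after `cutoff` time units.

    Uses E[min(T, cutoff)] / P(T <= cutoff), computed from the samples.
    """
    successes = sum(1 for t in samples if t <= cutoff)
    if successes == 0:
        return float("inf")  # no observed run finishes within the cutoff
    capped_total = sum(min(t, cutoff) for t in samples)
    return capped_total / successes


def best_cutoff(samples):
    """Return (cutoff, expected_time) minimizing the estimate, trying each
    distinct observed run time as a candidate cutoff."""
    candidates = sorted(set(samples))
    best = min(candidates, key=lambda c: expected_time_with_cutoff(samples, c))
    return best, expected_time_with_cutoff(samples, best)


if __name__ == "__main__":
    # Made-up run times with one very long execution (a heavy tail).
    times = [1, 1, 1, 1, 100]
    print("mean without restarts:", sum(times) / len(times))
    print("best cutoff, expected time:", best_cutoff(times))
```

With a heavy tail, a short cutoff discards the rare very long runs at little cost, which is why the expected time can drop well below the plain mean.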

Neural and Evolutionary Computing
