This doctoral work focuses on three main problems related to social networks: (1) Orchestrating Network Formation: We consider the problem of orchestrating the formation of a social network with a given topology that may be desirable for the intended use cases. Assuming the social network nodes to be strategic in forming relationships, we derive conditions under which a given topology can be uniquely obtained. We also study the efficiency and robustness of the derived conditions. (2) Multi-phase Influence Maximization: We propose that information diffusion be carried out in multiple phases rather than in a single instalment. With the objective of achieving better diffusion, we discover optimal ways of splitting the available budget among the phases, determining the time delay between consecutive phases, and finding the individuals to be targeted for initiating the diffusion process. (3) Scalable Preference Aggregation: It is extremely useful to determine a small number of representatives of a social network such that the individual preferences of these nodes, when aggregated, reflect the aggregate preference of the entire network. Using real-world data collected from Facebook with human subjects, we discover a model that faithfully captures the spread of preferences in a social network. We hence propose fast and reliable ways of computing a truly representative aggregate preference of the entire network. In particular, we develop models and methods for solving the above problems, which primarily deal with the formation and analysis of social networks.
We present a deterministic model for on-line social networks (OSNs) based on transitivity and local knowledge in social interactions. In the Iterated Local Transitivity (ILT) model, at each time-step and for every existing node $x$, a new node appears which joins to the closed neighbour set of $x$. The ILT model provably satisfies a number of local and global properties that were observed in OSNs and other real-world complex networks, such as a densification power law, decreasing average distance, and higher clustering than in random graphs with the same average degree. Experimental studies of social networks demonstrate poor expansion properties as a consequence of the existence of communities with a low number of inter-community edges. Bounds on the spectral gap for both the adjacency and normalized Laplacian matrices are proved for graphs arising from the ILT model, indicating such bad expansion properties. The cop number and domination number are shown to remain the same as those of the initial graph $G_0$, and the automorphism group of $G_0$ is a subgroup of the automorphism group of the graphs generated at all later time-steps. A randomized version of the ILT model is presented, which exhibits a tuneable densification power law exponent and maintains several properties of the deterministic model.
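The ILT time-step described above can be illustrated with a minimal sketch. The representation (a dictionary mapping integer node labels to neighbour sets) and the names `ilt_step` and `edge_count` are assumptions made for illustration, not taken from the paper. Each existing node $x$ receives one new node joined to $x$ and to all neighbours of $x$, so a graph with $n_t$ nodes and $e_t$ edges grows to $2n_t$ nodes and $3e_t + n_t$ edges.

```python
def ilt_step(adj):
    """Apply one ILT time-step to an undirected graph.

    adj maps node labels 0..n-1 to sets of neighbours (an assumed
    representation). For every existing node x, a new node n + x is
    added and joined to the closed neighbour set of x, that is, to x
    itself and to every neighbour of x in the current graph.
    """
    n = len(adj)
    # Copy the current graph so the input is left unchanged.
    new_adj = {v: set(nbrs) for v, nbrs in adj.items()}
    for x in range(n):
        clone = n + x
        closed = adj[x] | {x}          # closed neighbour set N[x]
        new_adj[clone] = set(closed)
        for y in closed:
            new_adj[y].add(clone)
    return new_adj


def edge_count(adj):
    """Number of undirected edges (each edge is stored twice)."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2


if __name__ == "__main__":
    # Hypothetical initial graph G_0: a cycle on four nodes.
    graph = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
    for t in range(3):
        n, e = len(graph), edge_count(graph)
        print(f"t={t}: nodes={n}, edges={e}, average degree={2 * e / n}")
        graph = ilt_step(graph)
```

Starting from the four-cycle, the average degree rises from 2 to 4 to 7 over two steps, a small-scale view of the densification the abstract refers to.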
Although social neuroscience is concerned with understanding how the brain interacts with its social environment, prevailing research in the field has primarily considered the human brain in isolation, deprived of its rich social context. Emerging work in social neuroscience that leverages tools from network analysis has begun to pursue this issue, advancing knowledge of how the human brain influences and is influenced by the structures of its social environment. In this paper, we provide an overview of key theory and methods in network analysis (especially for social systems) as an introduction for social neuroscientists who are interested in relating individual cognition to the structures of an individual's social environments. We also highlight some exciting new work as examples of how to productively use these tools to investigate questions of relevance to social neuroscientists. We include tutorials to help with practical implementation of the concepts that we discuss. We conclude by highlighting a broad range of exciting research opportunities for social neuroscientists who are interested in using network analysis to study social systems.
Here, we review the research we have done on social contagion. We describe the methods we have employed (and the assumptions they have entailed) in order to examine several datasets with complementary strengths and weaknesses, including the Framingham Heart Study, the National Longitudinal Study of Adolescent Health, and other observational and experimental datasets that we and others have collected. We describe the regularities that led us to propose that human social networks may exhibit a "three degrees of influence" property, and we review statistical approaches we have used to characterize inter-personal influence with respect to phenomena as diverse as obesity, smoking, cooperation, and happiness. We do not claim that this work is the final word, but we do believe that it provides some novel, informative, and stimulating evidence regarding social contagion in longitudinally followed networks. Along with other scholars, we are working to develop new methods for identifying causal effects using social network data, and we believe that this area is ripe for statistical development, as current methods have known and often unavoidable limitations.
The ability to share social network data at the level of individual connections is beneficial to science: not only for reproducing results, but also for researchers who may wish to use it for purposes not foreseen by the data releaser. Sharing such data, however, can lead to serious privacy issues, because individuals could be re-identified, not only based on possible node attributes, but also from the structure of the network around them. The risk associated with re-identification can be measured, and it is more serious in some networks than in others. Various optimization algorithms have been proposed to anonymize the network while keeping the number of changes minimal. However, existing algorithms do not provide guarantees on where the changes will be made, making it difficult to quantify their effect on various measures. Using network models and real data, we show that the average degree of networks is a crucial parameter for the severity of re-identification risk from nodes' neighborhoods. Dense networks are more at risk, and, apart from a small band of average degree values, either almost all nodes are re-identifiable or they are all safe. Our results allow researchers to assess the privacy risk based on a small number of network statistics which are available even before the data is collected. As a rule of thumb, the privacy risks are high if the average degree is above 10. Guided by these results, we propose a simple method based on edge sampling to mitigate the re-identification risk of nodes. Our method can be implemented already at the data collection phase. Its effect on various network measures can be estimated and corrected using sampling theory. These properties are in contrast with previous methods, which bias the data arbitrarily. In this sense, our work could help in sharing network data in a statistically tractable way.
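To make the edge-sampling idea concrete, the following is a minimal sketch under a stated assumption: each edge is retained independently with the same probability `p`. The abstract does not specify the sampling scheme, so this choice, along with the names `sample_edges` and `estimate_average_degree`, is hypothetical. Under that assumption the expected number of retained edges is `p` times the true edge count, so dividing by `p` gives an unbiased estimate of the original average degree.

```python
import random


def sample_edges(edges, p, seed=None):
    """Keep each edge independently with probability p.

    Independent (Bernoulli) retention is an assumption of this sketch;
    the source only states that the method is based on edge sampling.
    """
    rng = random.Random(seed)
    return [edge for edge in edges if rng.random() < p]


def estimate_average_degree(sampled_edges, num_nodes, p):
    """Estimate the original average degree from the sampled edge list.

    The sampled edge count divided by p is an unbiased estimate of the
    original edge count, and the average degree is 2 * edges / nodes.
    """
    return 2 * len(sampled_edges) / (num_nodes * p)


if __name__ == "__main__":
    # Hypothetical example: a ring of 1000 nodes, each linked to its
    # 6 nearest neighbours on either side (true average degree 12).
    n, half_k = 1000, 6
    edges = [(i, (i + j) % n) for i in range(n) for j in range(1, half_k + 1)]
    kept = sample_edges(edges, p=0.5, seed=42)
    print("sampled average degree:", 2 * len(kept) / n)
    print("corrected estimate:", estimate_average_degree(kept, n, p=0.5))
```

In this sketch the released network has roughly half the original average degree, while the correction recovers an estimate of the original value from the known sampling probability.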
We propose a stochastic model for the diffusion of topics entering a social network modeled by a Watts-Strogatz graph. Our model sets in play an implicit competition between these topics as they vie for the attention of users in the network. The dynamics of our model are based on notions taken from real-world OSNs like Twitter, where users either adopt an exogenous topic or copy topics from their neighbors, leading to endogenous propagation. When instantiated correctly, the model achieves a viral regime where a few topics garner an unusually good response from the network, closely mimicking the behavior of real-world OSNs. Our main contribution is our description of how clusters of proximate users that have spoken on the topic merge to form a giant component, making a topic go viral. This demonstrates that it is not weak ties but actually strong ties that play a major part in virality. We further validate our model and our hypotheses about its behavior by comparing our simulation results with the results of a measurement study conducted on real data taken from Twitter.