We present a deterministic model for on-line social networks (OSNs) based on transitivity and local knowledge in social interactions. In the Iterated Local Transitivity (ILT) model, at each time-step and for every existing node $x$, a new node appears and is joined to every node in the closed neighbour set of $x$. The ILT model provably satisfies a number of both local and global properties that were observed in OSNs and other real-world complex networks, such as a densification power law, decreasing average distance, and higher clustering than in random graphs with the same average degree. Experimental studies of social networks demonstrate poor expansion properties as a consequence of the existence of communities with a low number of inter-community edges. Bounds on the spectral gap for both the adjacency and normalized Laplacian matrices are proved for graphs arising from the ILT model, indicating such bad expansion properties. The cop and domination numbers are shown to remain the same as those of the initial graph $G_0$, and the automorphism group of $G_0$ is a subgroup of the automorphism group of graphs generated at all later time-steps. A randomized version of the ILT model is presented, which exhibits a tuneable densification power law exponent, and maintains several properties of the deterministic model.
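The ILT growth rule described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the graph is assumed to be stored as a dictionary of neighbour sets with nodes labelled $0, \dots, n-1$, and the function names `ilt_step`, `ilt`, and `edge_count` are hypothetical. The clone of node $x$ is given the label $x + n$, which is also just a labelling convention chosen here.

```python
def ilt_step(adj):
    """Apply one ILT time-step to a graph given as {node: set_of_neighbours}.

    Assumption (not from the source): nodes are labelled 0..n-1, and the
    new node created for x receives the label x + n.
    """
    n = len(adj)
    # Copy the current graph so the input is left unmodified.
    new_adj = {v: set(nbrs) for v, nbrs in adj.items()}
    for x in range(n):
        clone = x + n
        # Closed neighbour set of x in the graph at the current time-step.
        closed = adj[x] | {x}
        new_adj[clone] = set(closed)
        for y in closed:
            new_adj[y].add(clone)
    return new_adj


def ilt(adj0, steps):
    """Iterate the ILT rule `steps` times starting from the initial graph."""
    graph = adj0
    for _ in range(steps):
        graph = ilt_step(graph)
    return graph


def edge_count(adj):
    """Number of undirected edges (each edge is stored at both endpoints)."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2
```

Under this rule each step doubles the number of nodes, and each new node contributes one edge per member of the closed neighbour set it joins, so the edge count grows from $e_t$ to $3e_t + n_t$. Starting from a triangle, for instance, one step gives 6 nodes and 12 edges. Edges therefore grow faster than nodes, which is consistent with the densification behaviour the abstract mentions.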
This doctoral work focuses on three main problems related to social networks: (1) Orchestrating Network Formation: We consider the problem of orchestrating the formation of a social network with a given topology that may be desirable for the intended use cases. Assuming the social network nodes to be strategic in forming relationships, we derive conditions under which a given topology can be uniquely obtained. We also study the efficiency and robustness of the derived conditions. (2) Multi-phase Influence Maximization: We propose that information diffusion be carried out in multiple phases rather than in a single instalment. With the objective of achieving better diffusion, we discover optimal ways of splitting the available budget among the phases, determining the time delay between consecutive phases, and finding the individuals to be targeted for initiating the diffusion process. (3) Scalable Preference Aggregation: It is extremely useful to determine a small number of representatives of a social network such that the individual preferences of these nodes, when aggregated, reflect the aggregate preference of the entire network. Using real-world data collected from Facebook with human subjects, we discover a model that faithfully captures the spread of preferences in a social network. We hence propose fast and reliable ways of computing a truly representative aggregate preference of the entire network. In particular, we develop models and methods for solving the above problems, which primarily deal with the formation and analysis of social networks.
Human decision making underlies the data-generating process in multiple application areas, and models explaining and predicting choices made by individuals are in high demand. Discrete choice models are widely studied in economics and computational social sciences. As digital social networking facilitates information flow and the spread of influence between individuals, new advances in modeling are needed to incorporate social information into these models in addition to the characteristic features affecting individual choices. In this paper, we propose two novel models with scalable training algorithms: local logistics graph regularization (LLGR) and latent class graph regularization (LCGR) models. We add social regularization to represent similarity between friends, and we introduce latent classes to account for possible preference discrepancies between different social groups. Training of the LLGR model is performed using the alternating direction method of multipliers (ADMM), and training of the LCGR model is performed using a specialized Monte Carlo expectation maximization (MCEM) algorithm. Scalability to large graphs is achieved by parallelizing computation in both the expectation and the maximization steps. The LCGR model is the first latent class classification model that incorporates social relationships among individuals represented by a given graph. To evaluate our two models, we consider three classes of data to illustrate a typical large-scale use case in internet and social media applications. We experiment on synthetic datasets to empirically explain when the proposed models are better than vanilla classification models that do not exploit graph structure. We also experiment on real-world data, including both small-scale and large-scale datasets, to demonstrate on which types of datasets our models can be expected to outperform state-of-the-art models.
In-depth studies of sociotechnical systems are largely limited to single instances. Network surveys are expensive, and platforms vary in important ways, from interface design, to social norms, to historical contingencies. With single examples, we cannot in general know how much of the observed network structure is explained by historical accidents, random noise, or meaningful social processes, nor can we claim that network structure predicts outcomes, such as organization success or ecosystem health. Here, I show how we can adopt a comparative approach for settings where we have, or can cleverly construct, multiple instances of a network to estimate the natural variability in social systems. The comparative approach makes previously untested theories testable. Drawing on examples from the social networks literature, I discuss emerging directions in the study of populations of sociotechnical systems using insights from organization theory and ecology.
We introduce a new threshold model of social networks, in which the nodes influenced by their neighbours can adopt one out of several alternatives. We characterize social networks for which adoption of a product by the whole network is possible (respectively necessary) and the ones for which a unique outcome is guaranteed. These characterizations directly yield polynomial time algorithms that allow us to determine whether a given social network satisfies one of the above properties. We also study algorithmic questions for networks without unique outcomes. We show that the problem of determining whether a final network exists in which all nodes adopted some product is NP-complete. In turn, the problems of determining whether a given node adopts some (respectively, a given) product in some (respectively, all) network(s) are either co-NP-complete or can be solved in polynomial time. Further, we show that the problem of computing the minimum possible spread of a product is NP-hard to approximate with an approximation ratio better than $\Omega(n)$, in contrast to the maximum spread, which is efficiently computable. Finally, we clarify that some of the above problems can be solved in polynomial time when there are only two products.
We investigate the impact of noise and topology on opinion diversity in social networks. We do so by extending well-established models of opinion dynamics to a stochastic setting where agents are subject both to assimilative forces from their local social interactions and to idiosyncratic factors preventing their population from reaching consensus. We model the latter to account for both scenarios where noise is entirely exogenous to peer influence and cases where it is instead endogenous, arising from the agents' desire to maintain some uniqueness in their opinions. We derive a general analytical expression for opinion diversity, which holds for any network and depends on the network's topology through its spectral properties alone. Using this expression, we find that opinion diversity decreases as communities and clusters are broken down. We test our predictions against data describing empirical influence networks between major news outlets and find that incorporating our measure in linear models for the sentiment expressed by such sources on a variety of topics yields a notable improvement in terms of explanatory power.