No Arabic abstract
Dynamic networks, also called network streams, are an important data representation that applies to many real-world domains. Many sets of network data such as e-mail networks, social networks, or internet traffic networks are best represented by a dynamic network due to the temporal component of the data. One important application in the domain of dynamic network analysis is anomaly detection. Here the task is to identify points in time where the network exhibits behavior radically different from a typical time, either due to some event (like the failure of machines in a computer network) or a shift in the network properties. This problem is made more difficult by the fluid nature of what is considered normal network behavior. The volume of traffic on a network, for example, can change over the course of a month or even vary based on the time of the day without being considered unusual. Anomaly detection tests using traditional network statistics have difficulty in these scenarios due to their Density Dependence: as the volume of edges changes the value of the statistics changes as well making it difficult to determine if the change in signal is due to the traffic volume or due to some fundamental shift in the behavior of the network. To more accurately detect anomalies in dynamic networks, we introduce the concept of Density-Consistent network statistics. On synthetically generated graphs anomaly detectors using these statistics show a a 20-400% improvement in the recall when distinguishing graphs drawn from different distributions. When applied to several real datasets Density-Consistent statistics recover multiple network events which standard statistics failed to find.
Many social and economic systems can be represented as attributed networks encoding the relations between entities who are themselves described by different node attributes. Finding anomalies in these systems is crucial for detecting abuses such as credit card frauds, web spams or network intrusions. Intuitively, anomalous nodes are defined as nodes whose attributes differ starkly from the attributes of a certain set of nodes of reference, called the context of the anomaly. While some methods have proposed to spot anomalies locally, globally or within a community context, the problem remain challenging due to the multi-scale composition of real networks and the heterogeneity of node metadata. Here, we propose a principled way to uncover outlier nodes simultaneously with the context with respect to which they are anomalous, at all relevant scales of the network. We characterize anomalous nodes in terms of the concentration retained for each node after smoothing specific signals localized on the vertices of the graph. Besides, we introduce a graph signal processing formulation of the Markov stability framework used in community detection, in order to find the context of anomalies. The performance of our method is assessed on synthetic and real-world attributed networks and shows superior results concerning state of the art algorithms. Finally, we show the scalability of our approach in large networks employing Chebychev polynomial approximations.
An important task in network analysis is the detection of anomalous events in a network time series. These events could merely be times of interest in the network timeline or they could be examples of malicious activity or network malfunction. Hypothesis testing using network statistics to summarize the behavior of the network provides a robust framework for the anomaly detection decision process. Unfortunately, choosing network statistics that are dependent on confounding factors like the total number of nodes or edges can lead to incorrect conclusions (e.g., false positives and false negatives). In this dissertation we describe the challenges that face anomaly detection in dynamic network streams regarding confounding factors. We also provide two solutions to avoiding error due to confounding factors: the first is a randomization testing method that controls for confounding factors, and the second is a set of size-consistent network statistics which avoid confounding due to the most common factors, edge count and node count.
Competition networks are formed via adversarial interactions between actors. The Dynamic Competition Hypothesis predicts that influential actors in competition networks should have a large number of common out-neighbors with many other nodes. We empirically study this idea as a centrality score and find the measure predictive of importance in several real-world networks including food webs, conflict networks, and voting data from Survivor.
There is an ever-increasing interest in investigating dynamics in time-varying graphs (TVGs). Nevertheless, so far, the notion of centrality in TVG scenarios usually refers to metrics that assess the relative importance of nodes along the temporal evolution of the dynamic complex network. For some TVG scenarios, however, more important than identifying the central nodes under a given node centrality definition is identifying the key time instants for taking certain actions. In this paper, we thus introduce and investigate the notion of time centrality in TVGs. Analogously to node centrality, time centrality evaluates the relative importance of time instants in dynamic complex networks. In this context, we present two time centrality metrics related to diffusion processes. We evaluate the two defined metrics using both a real-world dataset representing an in-person contact dynamic network and a synthetically generated randomized TVG. We validate the concept of time centrality showing that diffusion starting at the best classified time instants (i.e. the most central ones), according to our metrics, can perform a faster and more efficient diffusion process.
In this paper, we analyze dynamic switching networks, wherein the networks switch arbitrarily among a set of topologies. For this class of dynamic networks, we derive an epidemic threshold, considering the SIS epidemic model. First, an epidemic probabilistic model is developed assuming independence between states of nodes. We identify the conditions under which the epidemic dies out by linearizing the underlying dynamical system and analyzing its asymptotic stability around the origin. The concept of joint spectral radius is then used to derive the epidemic threshold, which is later validated using several networks (Watts-Strogatz, Barabasi-Albert, MIT reality mining graphs, Regular, and Gilbert). A simplified version of the epidemic threshold is proposed for undirected networks. Moreover, in the case of static networks, the derived epidemic threshold is shown to match conventional analytical results. Then, analytical results for the epidemic threshold of dynamic networksare proved to be applicable to periodic networks. For dynamic regular networks, we demonstrate that the epidemic threshold is identical to the epidemic threshold for static regular networks. An upper bound for the epidemic spread probability in dynamic Gilbert networks is also derived and verified using simulation.