No Arabic abstract
Smoothed analysis is a framework suggested for mediating gaps between worst-case and average-case complexities. In a recent work, Dinitz et al.~[Distributed Computing, 2018] suggested to use smoothed analysis in order to study dynamic networks. Their aim was to explain the gaps between real-world networks that function well despite being dynamic, and the strong theoretical lower bounds for arbitrary networks. To this end, they introduced a basic model of smoothing in dynamic networks, where an adversary picks a sequence of graphs, representing the topology of the network over time, and then each of these graphs is slightly perturbed in a random manner. The model suggested above is based on a per-round noise, and our aim in this work is to extend it to models of noise more suited for multiple rounds. This is motivated by long-lived networks, where the amount and location of noise may vary over time. To this end, we present several different models of noise. First, we extend the previous model to cases where the amount of noise is very small. Then, we move to more refined models, where the amount of noise can change between different rounds, e.g., as a function of the number of changes the network undergoes. We also study a model where the noise is not arbitrarily spread among the network, but focuses in each round in the areas where changes have occurred. Finally, we study the power of an adaptive adversary, who can choose its actions in accordance with the changes that have occurred so far. We use the flooding problem as a running case-study, presenting very different behaviors under the different models of noise, and analyze the flooding time in different models.
We investigate a special case of hereditary property that we refer to as {em robustness}. A property is {em robust} in a given graph if it is inherited by all connected spanning subgraphs of this graph. We motivate this definition in different contexts, showing that it plays a central role in highly dynamic networks, although the problem is defined in terms of classical (static) graph theory. In this paper, we focus on the robustness of {em maximal independent sets} (MIS). Following the above definition, a MIS is said to be {em robust} (RMIS) if it remains a valid MIS in all connected spanning subgraphs of the original graph. We characterize the class of graphs in which {em all} possible MISs are robust. We show that, in these particular graphs, the problem of finding a robust MIS is {em local}; that is, we present an RMIS algorithm using only a sublogarithmic number of rounds (in the number of nodes $n$) in the ${cal LOCAL}$ model. On the negative side, we show that, in general graphs, the problem is not local. Precisely, we prove a $Omega(n)$ lower bound on the number of rounds required for the nodes to decide consistently in some graphs. This result implies a separation between the RMIS problem and the MIS problem in general graphs. It also implies that any strategy in this case is asymptotically (in order) as bad as collecting all the network information at one node and solving the problem in a centralized manner. Motivated by this observation, we present a centralized algorithm that computes a robust MIS in a given graph, if one exists, and rejects otherwise. Significantly, this algorithm requires only a polynomial amount of local computation time, despite the fact that exponentially many MISs and exponentially many connected spanning subgraphs may exist.
We study the problem of clock synchronization in highly dynamic networks, where communication links can appear or disappear at any time. The nodes in the network are equipped with hardware clocks, but the rate of the hardware clocks can vary arbitrarily within specific bounds, and the estimates that nodes can obtain about the clock values of other nodes are inherently inaccurate. Our goal in this setting is to output a logical clock at each node such that the logical clocks of any two nodes are not too far apart, and nodes that remain close to each other in the network for a long time are better synchronized than distant nodes. This property is called gradient clock synchronization. Gradient clock synchronization has been widely studied in the static setting, where the network topology does not change. We show that the asymptotically optimal bounds obtained for the static case also apply to our highly dynamic setting: if two nodes remain at distance $d$ from each other for sufficiently long, it is possible to upper bound the difference between their clock values by $O(d log (D / d))$, where $D$ is the diameter of the network. This is known to be optimal even for static networks. Furthermore, we show that our algorithm has optimal stabilization time: when a path of length $d$ appears between two nodes, the time required until the clock skew between the two nodes is reduced to $O(d log (D / d))$ is $O(D)$, which we prove to be optimal. Finally, the techniques employed for the more intricate analysis of the algorithm for dynamic graphs provide additional insights that are also of interest for the static setting. In particular, we establish self-stabilization of the gradient property within $O(D)$ time.
We propose a group membership service for dynamic ad hoc networks. It maintains as long as possible the existing groups and ensures that each group diameter is always smaller than a constant, fixed according to the application using the groups. The proposed protocol is self-stabilizing and works in dynamic distributed systems. Moreover, it ensures a kind of continuity in the service offer to the application while the system is converging, except if too strong topology changes happen. Such a best effort behavior allows applications to rely on the groups while the stabilization has not been reached, which is very useful in dynamic ad hoc networks.
In the load balancing problem, each node in a network is assigned a load, and the goal is to equally distribute the loads among the nodes, by preforming local load exchanges. While load balancing was extensively studied in static networks, only recently a load balancing algorithm for dynamic networks with a bounded convergence time was presented. In this paper, we further study the time complexity of load balancing in the context of dynamic networks. First, we show that randomness is not necessary, and present a deterministic algorithm which slightly improves the running time of the previous algorithm, at the price of not being matching-based. Then, we consider integral loads, i.e., loads that cannot be split indefinitely, and prove that no matching-based algorithm can have a bounded convergence time for this case. To circumvent both this impossibility result, and a known one for the non-integral case, we apply the method of smoothed analysis, where random perturbations are made over the worst-case choices of network topologies. We show both impossibility results do not hold under this kind of analysis, suggesting that load-balancing in real world systems might be faster than the lower bounds suggest.
We present distributed algorithms for training dynamic Graph Neural Networks (GNN) on large scale graphs spanning multi-node, multi-GPU systems. To the best of our knowledge, this is the first scaling study on dynamic GNN. We devise mechanisms for reducing the GPU memory usage and identify two execution time bottlenecks: CPU-GPU data transfer; and communication volume. Exploiting properties of dynamic graphs, we design a graph difference-based strategy to significantly reduce the transfer time. We develop a simple, but effective data distribution technique under which the communication volume remains fixed and linear in the input size, for any number of GPUs. Our experiments using billion-size graphs on a system of 128 GPUs shows that: (i) the distribution scheme achieves up to 30x speedup on 128 GPUs; (ii) the graph-difference technique reduces the transfer time by a factor of up to 4.1x and the overall execution time by up to 40%