No Arabic abstract
In this paper, we study systems of distributed entities that can actively modify their communication network. This gives rise to distributed algorithms that apart from communication can also exploit network reconfiguration in order to carry out a given task. At the same time, the distributed task itself may now require global reconfiguration from a given initial network $G_s$ to a target network $G_f$ from a family of networks having some good properties, like small diameter. With reasonably powerful computational entities, there is a straightforward algorithm that transforms any $G_s$ into a spanning clique in $O(log n)$ time. The algorithm can then compute any global function on inputs and reconfigure to any target network in one round. We argue that such a strategy is impractical for real applications. In real dynamic networks there is a cost associated with creating and maintaining connections. To formally capture such costs, we define three edge-complexity measures: the emph{total edge activations}, the emph{maximum activated edges per round}, and the emph{maximum activated degree of a node}. The clique formation strategy highlighted above, maximizes all of them. We aim at improved algorithms that achieve (poly)log$(n)$ time while minimizing the edge-complexity for the general task of transforming any $G_s$ into a $G_f$ of diameter (poly)log$(n)$. We give three distributed algorithms. The first runs in $O(log n)$ time, with at most $2n$ active edges per round, an optimal total of $O(nlog n)$ edge activations, a maximum degree $n-1$, and a target network of diameter 2. The second achieves bounded degree by paying an additional logarithmic factor in time and in total edge activations and gives a target network of diameter $O(log n)$. Our third algorithm shows that if we slightly increase the maximum degree to polylog$(n)$ then we can achieve a running time of $o(log^2 n)$.
Distributed replication systems based on the replicated state machine model have become ubiquitous as the foundation of modern database systems. To ensure availability in the presence of faults, these systems must be able to dynamically replace failed nodes with healthy ones via dynamic reconfiguration. MongoDB is a document oriented database with a distributed replication mechanism derived from the Raft protocol. In this paper, we present MongoRaftReconfig, a novel dynamic reconfiguration protocol for the MongoDB replication system. MongoRaftReconfig utilizes a logless approach to managing configuration state and decouples the processing of configuration changes from the main database operation log. The protocols design was influenced by engineering constraints faced when attempting to redesign an unsafe, legacy reconfiguration mechanism that existed previously in MongoDB. We provide a safety proof of MongoRaftReconfig, along with a formal specification in TLA+. To our knowledge, this is the first published safety proof and formal specification of a reconfiguration protocol for a Raft-based system. We also present results from model checking its safety properties on finite protocol instances. Finally, we discuss the conceptual novelties of MongoRaftReconfig, how it can be understood as an optimized and generalized version of the single server reconfiguration algorithm of Raft, and present an experimental evaluation of how its optimizations can provide performance benefits for reconfigurations.
This invited paper presents some novel ideas on how to enhance the performance of consensus algorithms in distributed wireless sensor networks, when communication costs are considered. Of particular interest are consensus algorithms that exploit the broadcast property of the wireless channel to boost the performance in terms of convergence speeds. To this end, we propose a novel clustering based consensus algorithm that exploits interference for computation, while reducing the energy consumption in the network. The resulting optimization problem is a semidefinite program, which can be solved offline prior to system startup.
This paper shows for the first time that distributed computing can be both reliable and efficient in an environment that is both highly dynamic and hostile. More specifically, we show how to maintain clusters of size $O(log N)$, each containing more than two thirds of honest nodes with high probability, within a system whose size can vary textit{polynomially} with respect to its initial size. Furthermore, the communication cost induced by each node arrival or departure is polylogarithmic with respect to $N$, the maximal size of the system. Our clustering can be achieved despite the presence of a Byzantine adversary controlling a fraction $bad leq {1}{3}-epsilon$ of the nodes, for some fixed constant $epsilon > 0$, independent of $N$. So far, such a clustering could only be performed for systems who size can vary constantly and it was not clear whether that was at all possible for polynomial variances.
We consider the problem of computing an aggregation function in a emph{secure} and emph{scalable} way. Whereas previous distributed solutions with similar security guarantees have a communication cost of $O(n^3)$, we present a distributed protocol that requires only a communication complexity of $O(nlog^3 n)$, which we prove is near-optimal. Our protocol ensures perfect security against a computationally-bounded adversary, tolerates $(1/2-epsilon)n$ malicious nodes for any constant $1/2 > epsilon > 0$ (not depending on $n$), and outputs the exact value of the aggregated function with high probability.
We consider the problem of aggregating data in a dynamic graph, that is, aggregating the data that originates from all nodes in the graph to a specific node, the sink. We are interested in giving lower bounds for this problem, under different kinds of adversaries. In our model, nodes are endowed with unlimited memory and unlimited computational power. Yet, we assume that communications between nodes are carried out with pairwise interactions, where nodes can exchange control information before deciding whether they transmit their data or not, given that each node is allowed to transmit its data at most once. When a node receives a data from a neighbor, the node may aggregate it with its own data. We consider three possible adversaries: the online adaptive adversary, the oblivious adversary , and the randomized adversary that chooses the pairwise interactions uniformly at random. For the online adaptive and the oblivious adversary, we give impossibility results when nodes have no knowledge about the graph and are not aware of the future. Also, we give several tight bounds depending on the knowledge (be it topology related or time related) of the nodes. For the randomized adversary, we show that the Gathering algorithm, which always commands a node to transmit, is optimal if nodes have no knowledge at all. Also, we propose an algorithm called Waiting Greedy, where a node either waits or transmits depending on some parameter, that is optimal when each node knows its future pairwise interactions with the sink.