Ensuring reliable communication despite possibly malicious participants is a primary objective in any distributed system or network. In this paper, we investigate the possibility of reliable broadcast in a dynamic network whose topology may evolve while the broadcast is in progress. In particular, we adapt the Certified Propagation Algorithm (CPA) to make it work on dynamic networks and we present conditions (on the underlying dynamic graph) to enable safety and liveness properties of the reliable broadcast. We furthermore explore the complexity of assessing these conditions for various classes of dynamic networks.
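As a point of reference, the classical static-network CPA can be sketched as follows. This is a minimal illustration, not the dynamic-network adaptation studied in the paper: it assumes a fixed topology, an honest source, at most f faulty neighbors per node, and Byzantine nodes that simply stay silent. The function name `cpa_broadcast` and its parameters are hypothetical.

```python
from collections import deque

def cpa_broadcast(adj, source, message, f, byzantine=frozenset()):
    """Simulate classical CPA on a static graph.

    adj: dict mapping each node to a list of its neighbors.
    byzantine: nodes assumed faulty; in this sketch they never relay anything.
    Returns a dict {node: accepted value} for the nodes that accept.
    """
    accepted = {source: message}
    votes = {}  # node -> {value: set of neighbors that relayed that value}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in byzantine or v in accepted:
                continue
            if u == source:
                # Neighbors of the source accept its message directly.
                accepted[v] = message
                queue.append(v)
                continue
            # Other nodes wait for f + 1 distinct neighbors relaying the same value.
            senders = votes.setdefault(v, {}).setdefault(accepted[u], set())
            senders.add(u)
            if len(senders) >= f + 1:
                accepted[v] = accepted[u]
                queue.append(v)
    return accepted

# Node 4 is two hops from the source 0 and has three neighbors.
adj = {0: [1, 2, 3], 1: [0, 4], 2: [0, 4], 3: [0, 4], 4: [1, 2, 3]}
all_correct = cpa_broadcast(adj, 0, "m", f=1)
two_silent = cpa_broadcast(adj, 0, "m", f=1, byzantine={1, 2})
```

In the second run, node 4 hears the message from only one neighbor and never accepts it, which illustrates why liveness depends on conditions on the underlying graph.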
In this paper, we consider the Byzantine reliable broadcast problem on authenticated and partially connected networks. The state-of-the-art method to solve this problem combines two algorithms from the literature. Asynchrony and faulty senders are typically handled with Gabriel Bracha's authenticated double-echo broadcast protocol, which assumes an asynchronous, fully connected network. Danny Dolev's algorithm can then be used to provide reliable communication between processes in the global fault model, where up to f processes among N can be faulty in a communication network that is at least (2f+1)-connected. Following recent works showing that Dolev's protocol can be made more practical through several optimizations, we show that the state-of-the-art methods to solve our problem can be improved with layer-specific and cross-layer optimizations. Our simulations with the OMNeT++ network simulator show that these optimizations can be efficiently combined to decrease the total amount of information transmitted or the protocol's latency (e.g., by 25% and 50%, respectively, with a 16B payload, N=31 and f=4) compared to the state-of-the-art combination of Bracha's and Dolev's protocols.
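The quorum logic of the double-echo layer can be sketched as follows. This is a simplified illustration using the standard textbook thresholds for N > 3f (more than (N+f)/2 ECHO messages or f+1 READY messages to send READY, and 2f+1 READY messages to deliver). It is not the optimized protocol evaluated in the paper. The class name `DoubleEcho` and its interface are hypothetical, and the sketch assumes links are already authenticated and that SEND comes from the designated sender.

```python
from collections import defaultdict

class DoubleEcho:
    """State of one process for a single broadcast instance."""

    def __init__(self, n, f):
        assert n > 3 * f, "double-echo broadcast requires N > 3f"
        self.n, self.f = n, f
        self.echoes = defaultdict(set)  # value -> processes that echoed it
        self.readys = defaultdict(set)  # value -> processes that sent READY
        self.sent_echo = False
        self.sent_ready = False
        self.delivered = None

    def handle(self, kind, sender, value):
        """Process one message; return the (kind, value) pairs to broadcast."""
        out = []
        if kind == "SEND" and not self.sent_echo:
            self.sent_echo = True
            out.append(("ECHO", value))
        elif kind == "ECHO":
            self.echoes[value].add(sender)
        elif kind == "READY":
            self.readys[value].add(sender)
        enough_echo = len(self.echoes[value]) > (self.n + self.f) / 2
        enough_ready = len(self.readys[value]) >= self.f + 1
        if not self.sent_ready and (enough_echo or enough_ready):
            self.sent_ready = True
            out.append(("READY", value))
        if self.delivered is None and len(self.readys[value]) >= 2 * self.f + 1:
            self.delivered = value
        return out

# N = 4, f = 1: three ECHOs trigger READY, three READYs trigger delivery.
p = DoubleEcho(4, 1)
first = p.handle("SEND", 0, "m")
replies = [p.handle("ECHO", s, "m") for s in (0, 1, 2)]
for s in (0, 1, 2):
    p.handle("READY", s, "m")
```

On a partially connected network, each such broadcast step would itself have to be carried by a reliable communication layer such as Dolev's, which is where the cross-layer optimizations apply.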
We revisit Byzantine-tolerant reliable broadcast algorithms with an honest dealer in multi-hop networks. To tolerate Byzantine faulty nodes arbitrarily spread over the network, previous solutions require a factorial number of messages to be sent over the network if the messages are not authenticated (e.g., digital signatures are not available). We propose modifications that preserve the safety and liveness properties of the original unauthenticated protocols while greatly decreasing their observed message complexity when simulated on several classes of graph topologies, potentially opening the way to their practical use.
The Reliable Broadcast concept allows an honest party to send a message to all other parties and to make sure that all honest parties receive this message. In addition, it allows an honest party that received a message to know that all other honest parties will also receive the same message. This technique is important for ensuring distributed consistency in the face of failures. In the current paper, we study the ability to use Reliable Broadcast to consistently transmit a sequence of input values in an asynchronous environment with a designated sender. The task can be easily achieved using counters, but cannot be achieved with bounded memory in the presence of failures. We weaken the problem and ask whether the receivers can at least share a common suffix. We prove that in a standard (lossless) asynchronous system, no bounded-memory protocol can guarantee a common suffix at all receivers for every input sequence if a single party might crash. We further study the problem under transient faults: when the problem is limited to transmitting a stream in which a single value is sent repeatedly, we present a bounded-memory self-stabilizing protocol that ensures a common suffix even in the presence of transient faults and an arbitrary number of crash faults. We further prove that this last problem is not solvable in the presence of a single Byzantine fault. Thus, this problem separates Byzantine behavior from crash faults in an asynchronous environment.
The Byzantine agreement problem requires a set of $n$ processes to agree on a value sent by a transmitter, despite a subset of $b$ processes behaving in an arbitrary, i.e. Byzantine, manner and sending corrupted messages to all processes in the system. It is well known that the problem has a solution in a synchronous (or eventually synchronous) message-passing distributed system iff the number of processes in the Byzantine subset is less than one third of the total number of processes, i.e. iff $n \geq 3b+1$. The rest of the processes are expected to be correct: they should never deviate from the algorithm assigned to them nor send corrupted messages. But what if they still do? We show in this paper that it is possible to solve Byzantine agreement even if, beyond the $b$ ($< n/3$) Byzantine processes, some of the other processes also send corrupted messages, as long as they do not send them to all. More specifically, we generalize the classical Byzantine model and consider that Byzantine failures might be partial. In each communication step, some of the processes might send corrupted messages to a subset of the processes. This subset of processes, to which corrupted messages might be sent, could change over time. We compute the exact number of processes that can commit such faults, besides those that commit classical Byzantine failures, while still solving Byzantine agreement. We present a corresponding Byzantine agreement algorithm and prove its optimality by giving resilience and complexity bounds.
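The classical one-third condition can be checked with a few lines of arithmetic. The sketch below covers only the classical bound $b < n/3$, not the paper's count of additional partially faulty processes. The helper names `max_byzantine` and `is_solvable` are hypothetical.

```python
def max_byzantine(n: int) -> int:
    """Largest b such that b < n/3, for n >= 1."""
    return (n - 1) // 3

def is_solvable(n: int, b: int) -> bool:
    """Classical condition: fewer than one third of the processes are Byzantine."""
    return 3 * b < n

# Four processes tolerate one Byzantine process; three processes tolerate none.
small_cases = [(n, max_byzantine(n)) for n in (3, 4, 7, 31)]
```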
The notion of knowledge-based program introduced by Halpern and Fagin provides a useful formalism for designing, analysing, and optimising distributed systems. This paper formulates the two-phase commit protocol as a knowledge-based program and then follows an iterative process of model checking and counter-example guided refinement to find concrete implementations of the program for the case of perfect recall semantics in the Byzantine failures context with synchronous reliable communication. We model several different kinds of Byzantine failures and verify different strategies to counter and mitigate them. We address a number of questions that have not been considered in the prior literature, viz., under what circumstances a sender can know that its transmission has been successful, and under what circumstances an agent can know that the coordinator is cheating, and we find concrete answers to these questions. The paper also describes a methodology based on temporal-epistemic model checking technology that can be followed to verify the shortest and longest execution times of a distributed protocol and the scenarios that lead to them.