Revisiting Fast Practical Byzantine Fault Tolerance


Added by Dahlia Malkhi

Publication date 2017

Field: Informatics Engineering

Language: English

Authors Ittai Abraham - Guy Gueta - Dahlia Malkhi

Distributed Parallel and Cluster Computing


Abstract

In this note, we observe a safety violation in Zyzzyva and a liveness violation in FaB. To demonstrate these issues, we require relatively simple scenarios, involving only four replicas, and one or two view changes. In all of them, the problem is manifested already in the first log slot.


Approximate Byzantine Fault-Tolerance in Distributed Optimization

Shuo Liu, Nirupam Gupta, Nitin H. Vaidya (2021)

This paper considers the problem of Byzantine fault-tolerance in distributed multi-agent optimization. In this problem, each agent has a local cost function, and in the fault-free case, the goal is to design a distributed algorithm that allows all the agents to find a minimum point of all the agents' aggregate cost function. We consider a scenario where some agents might be Byzantine faulty, which renders the original goal of computing a minimum point of all the agents' aggregate cost vacuous. A more reasonable objective for an algorithm in this scenario is to allow all the non-faulty agents to compute the minimum point of only the non-faulty agents' aggregate cost. Prior work shows that if there are up to $f$ (out of $n$) Byzantine agents, then a minimum point of the non-faulty agents' aggregate cost can be computed exactly if and only if the non-faulty agents' costs satisfy a certain redundancy property called $2f$-redundancy. However, $2f$-redundancy is an ideal property that can be satisfied only in systems free from noise or uncertainties, which can make the goal of exact fault-tolerance unachievable in some applications. Thus, we introduce the notion of $(f,\epsilon)$-resilience, a generalization of exact fault-tolerance wherein the objective is to find an approximate minimum point of the non-faulty aggregate cost, with $\epsilon$ accuracy. This approximate fault-tolerance can be achieved under a weaker condition that is easier to satisfy in practice, compared to $2f$-redundancy. We obtain necessary and sufficient conditions for achieving $(f,\epsilon)$-resilience, characterizing the correlation between relaxation in redundancy and approximation in resilience. In the case where the agents' cost functions are differentiable, we obtain conditions for $(f,\epsilon)$-resilience of the distributed gradient-descent method when equipped with robust gradient aggregation.
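To make "distributed gradient descent equipped with robust gradient aggregation" more concrete, the following is a minimal sketch of one commonly used aggregation rule, the coordinate-wise trimmed mean. The abstract does not say which rule the paper analyzes, so the choice of rule, the function names, and the step size below are all assumptions made for illustration.

```python
from typing import List, Sequence


def trimmed_mean_aggregate(gradients: Sequence[Sequence[float]], f: int) -> List[float]:
    """Coordinate-wise trimmed mean (an assumed rule, not taken from the abstract).

    For each coordinate, drop the f smallest and f largest reported values
    and average the remaining n - 2f values.
    """
    n = len(gradients)
    if n <= 2 * f:
        raise ValueError("need n > 2f gradients to trim f values from each side")
    dim = len(gradients[0])
    aggregated = []
    for k in range(dim):
        column = sorted(g[k] for g in gradients)
        kept = column[f:n - f]
        aggregated.append(sum(kept) / len(kept))
    return aggregated


def robust_gradient_step(x: Sequence[float],
                         gradients: Sequence[Sequence[float]],
                         f: int,
                         step_size: float) -> List[float]:
    """One gradient-descent update using the aggregated gradient."""
    aggregated = trimmed_mean_aggregate(gradients, f)
    return [xi - step_size * gi for xi, gi in zip(x, aggregated)]


# Hypothetical example: four agents, at most one Byzantine (f = 1).
# The fourth agent reports an arbitrary gradient.
reported = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0], [100.0, -100.0]]
aggregate = trimmed_mean_aggregate(reported, f=1)
next_x = robust_gradient_step([0.0, 0.0], reported, f=1, step_size=0.5)
```

In this toy run the outlying report is trimmed away in both coordinates, so the update follows the gradient reported by the three agreeing agents. Whether a rule of this kind achieves $(f,\epsilon)$-resilience depends on the conditions established in the paper, which this sketch does not check.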

Distributed Parallel and Cluster Computing

Stochastic Performance Modeling for Practical Byzantine Fault Tolerance Consensus in Blockchain

Fan-Qi Ma, Quan-Lin Li, Yi-Han Liu (2021)

The practical Byzantine fault tolerant (PBFT) consensus mechanism is one of the most basic consensus algorithms (or protocols) in blockchain technologies, so its performance evaluation is an interesting and challenging topic, owing to the high complexity of its consensus work in a peer-to-peer network. This paper describes a simple stochastic performance model of the PBFT consensus mechanism, which is refined as not only a queueing system with complicated service times but also a level-independent quasi-birth-and-death (QBD) process. From the level-independent QBD process, we apply the matrix-geometric solution to obtain a necessary and sufficient condition under which the PBFT consensus system is stable, and to numerically compute the stationary probability vector of the QBD process. We then provide four useful performance measures of the PBFT consensus mechanism and compute them numerically. Finally, we use some numerical examples to verify the validity of our theoretical results, and show how the four performance measures are influenced by some key parameters of the PBFT consensus. By means of the theory of multi-dimensional Markov processes, we are optimistic that the methodology and results given in this paper are applicable to a wide range of research on the PBFT consensus mechanism and even other types of consensus mechanisms.

Cryptography and Security; Databases; Performance

Dynamic Fault Tolerance Through Resource Pooling

Christian M. Fuchs, Nadia M. Murillo, Aske Plaat (2019)

Miniaturized satellites are currently not considered suitable for critical, high-priority, and complex multi-phased missions, due to their low reliability. As hardware-side fault tolerance (FT) solutions designed for larger spacecraft cannot be adopted aboard very small satellites due to budget, energy, and size constraints, we developed a hybrid FT-approach based upon only COTS components, commodity processor cores, library IP, and standard software. This approach facilitates fault detection, isolation, and recovery in software, and utilizes fault-coverage techniques across the embedded stack within a multiprocessor system-on-chip (MPSoC). This allows our FPGA-based proof-of-concept implementation to deliver strong fault-coverage even for missions with a long duration, but also to adapt to varying performance requirements during the mission. The operator of a spacecraft utilizing this approach can define performance profiles, which allow an on-board computer (OBC) to trade between processing capacity, fault coverage, and energy consumption using simple heuristics. The software-side FT approach developed also offers advantages if deployed aboard larger spacecraft through spare resource pooling, enabling an OBC to more efficiently handle permanent faults. This FT approach in part mimics a critical biological system's way of tolerating and adjusting to failures, enabling graceful ageing of an MPSoC.

Distributed Parallel and Cluster Computing; Operating Systems; Systems and Control

A Fault-Tolerance Shim for Serverless Computing

Vikram Sreekanti, Chenggang Wu, Saurav Chhatrapati (2020)

Serverless computing has grown in popularity in recent years, with an increasing number of applications being built on Functions-as-a-Service (FaaS) platforms. By default, FaaS platforms support retry-based fault tolerance, but this is insufficient for programs that modify shared state, as they can unwittingly persist partial sets of updates in case of failures. To address this challenge, we would like atomic visibility of the updates made by a FaaS application. In this paper, we present AFT, an atomic fault tolerance shim for serverless applications. AFT interposes between a commodity FaaS platform and storage engine and ensures atomic visibility of updates by enforcing the read atomic isolation guarantee. AFT supports new protocols to guarantee read atomic isolation in the serverless setting. We demonstrate that AFT introduces minimal overhead relative to existing storage engines and scales smoothly to thousands of requests per second, while preventing a significant number of consistency anomalies.

Distributed Parallel and Cluster Computing; Databases

Practical Byzantine Reliable Broadcast on Partially Connected Networks

Silvia Bonomi, Jeremie Decouchant, Giovanni Farina, Vincent Rahli (2021)

In this paper, we consider the Byzantine reliable broadcast problem on authenticated and partially connected networks. The state-of-the-art method to solve this problem consists in combining two algorithms from the literature. Handling asynchrony and faulty senders is typically done thanks to Gabriel Bracha's authenticated double-echo broadcast protocol, which assumes an asynchronous fully connected network. Danny Dolev's algorithm can then be used to provide reliable communications between processes in the global fault model, where up to f processes among N can be faulty in a communication network that is at least (2f+1)-connected. Following recent works that showed that Dolev's protocol can be made more practical thanks to several optimizations, we show that the state-of-the-art methods to solve our problem can be optimized thanks to layer-specific and cross-layer optimizations. Our simulations with the OMNeT++ network simulator show that these optimizations can be efficiently combined to decrease the total amount of information transmitted or the protocol's latency (e.g., respectively, -25% and -50% with a 16B payload, N=31 and f=4) compared to the state-of-the-art combination of Bracha's and Dolev's protocols.

Distributed Parallel and Cluster Computing; Data Structures and Algorithms; Networking and Internet Architecture
