Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

A Fault-Tolerance Shim for Serverless Computing

205 0 0.0 ( 0 )

Download Cite

Added by Vikram Sreekanti

Publication date 2020

fields Informatics Engineering

and research's language is English

Authors Vikram Sreekanti - Chenggang Wu - Saurav Chhatrapati

Distributed Parallel and Cluster Computing Databases

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Serverless computing has grown in popularity in recent years, with an increasing number of applications being built on Functions-as-a-Service (FaaS) platforms. By default, FaaS platforms support retry-based fault tolerance, but this is insufficient for programs that modify shared state, as they can unwittingly persist partial sets of updates in case of failures. To address this challenge, we would like atomic visibility of the updates made by a FaaS application. In this paper, we present AFT, an atomic fault tolerance shim for serverless applications. AFT interposes between a commodity FaaS platform and storage engine and ensures atomic visibility of updates by enforcing the read atomic isolation guarantee. AFT supports new protocols to guarantee read atomic isolation in the serverless setting. We demonstrate that aft introduces minimal overhead relative to existing storage engines and scales smoothly to thousands of requests per second, while preventing a significant number of consistency anomalies.

rate research

Bringing Fault-Tolerant GigaHertz-Computing to Space: A Multi-Stage Software-Side Fault-Tolerance Approach for Miniaturized Spacecraft

75 - Christian M. Fuchs , Todor Stefanov , Nadia Murillo 2017

Modern embedded technology is a driving factor in satellite miniaturization, contributing to a massive boom in satellite launches and a rapidly evolving new space industry. Miniaturized satellites, however, suffer from low reliability, as traditional hardware-based fault-tolerance (FT) concepts are ineffective for on-board computers (OBCs) utilizing modern systems-on-a-chip (SoC). Therefore, larger satellites continue to rely on proven processors with large feature sizes. Software-based concepts have largely been ignored by the space industry as they were researched only in theory, and have not yet reached the level of maturity necessary for implementation. We present the first integral, real-world solution to enable fault-tolerant general-purpose computing with modern multiprocessor-SoCs (MPSoCs) for spaceflight, thereby enabling their use in future high-priority space missions. The presented multi-stage approach consists of three FT stages, combining coarse-grained thread-level distributed self-validation, FPGA reconfiguration, and mixed criticality to assure long-term FT and excellent scalability for both resource constrained and critical high-priority space missions. Early benchmark results indicate a drastic performance increase over state-of-the-art radiation-hard OBC designs and considerably lower software- and hardware development costs. This approach was developed for a 4-year European Space Agency (ESA) project, and we are implementing a tiled MPSoC prototype jointly with two industrial partners.

Distributed Parallel and Cluster Computing Operating Systems

Dynamic Fault Tolerance Through Resource Pooling

256 - Christian M. Fuchs , Nadia M. Murillo , Aske Plaat 2019

Miniaturized satellites are currently not considered suitable for critical, high-priority, and complex multi-phased missions, due to their low reliability. As hardware-side fault tolerance (FT) solutions designed for larger spacecraft can not be adopted aboard very small satellites due to budget, energy, and size constraints, we developed a hybrid FT-approach based upon only COTS components, commodity processor cores, library IP, and standard software. This approach facilitates fault detection, isolation, and recovery in software, and utilizes fault-coverage techniques across the embedded stack within an multiprocessor system-on-chip (MPSoC). This allows our FPGA-based proof-of-concept implementation to deliver strong fault-coverage even for missions with a long duration, but also to adapt to varying performance requirements during the mission. The operator of a spacecraft utilizing this approach can define performance profiles, which allow an on-board computer (OBC) to trade between processing capacity, fault coverage, and energy consumption using simple heuristics. The software-side FT approach developed also offers advantages if deployed aboard larger spacecraft through spare resource pooling, enabling an OBC to more efficiently handle permanent faults. This FT approach in part mimics a critical biological systemss way of tolerating and adjusting to failures, enabling graceful ageing of an MPSoC.

Distributed Parallel and Cluster Computing Operating Systems Systems and Control

Revisiting Fast Practical Byzantine Fault Tolerance

148 - Ittai Abraham , Guy Gueta , Dahlia Malkhi 2017

In this note, we observe a safety violation in Zyzzyva and a liveness violation in FaB. To demonstrate these issues, we require relatively simple scenarios, involving only four replicas, and one or two view changes. In all of them, the problem is manifested already in the first log slot.

Distributed Parallel and Cluster Computing

SeBS: A Serverless Benchmark Suite for Function-as-a-Service Computing

106 - Marcin Copik , Grzegorz Kwasniewski , Maciej Besta 2020

Function-as-a-Service (FaaS) is one of the most promising directions for the future of cloud services, and serverless functions have immediately become a new middleware for building scalable and cost-efficient microservices and applications. However, the quickly moving technology hinders reproducibility, and the lack of a standardized benchmarking suite leads to ad-hoc solutions and microbenchmarks being used in serverless research, further complicating metaanalysis and comparison of research solutions. To address this challenge, we propose the Serverless Benchmark Suite: the first benchmark for FaaS computing that systematically covers a wide spectrum of cloud resources and applications. Our benchmark consists of the specification of representative workloads, the accompanying implementation and evaluation infrastructure, and the evaluation methodology that facilitates reproducibility and enables interpretability. We demonstrate that the abstract model of a FaaS execution environment ensures the applicability of our benchmark to multiple commercial providers such as AWS, Azure, and Google Cloud. Our work facilities experimental evaluation of serverless systems, and delivers a standardized, reliable and evolving evaluation methodology of performance, efficiency, scalability and reliability of middleware FaaS platforms.

Distributed Parallel and Cluster Computing

Stochastic Performance Modeling for Practical Byzantine Fault Tolerance Consensus in Blockchain

73 - Fan-Qi Ma , Quan-Lin Li , Yi-Han Liu 2021

The practical Byzantine fault tolerant (PBFT) consensus mechanism is one of the most basic consensus algorithms (or protocols) in blockchain technologies, thus its performance evaluation is an interesting and challenging topic due to a higher complexity of its consensus work in the peer-to-peer network. This paper describes a simple stochastic performance model of the PBFT consensus mechanism, which is refined as not only a queueing system with complicated service times but also a level-independent quasi-birth-and-death (QBD) process. From the level-independent QBD process, we apply the matrix-geometric solution to obtain a necessary and sufficient condition under which the PBFT consensus system is stable, and to be able to numerically compute the stationary probability vector of the QBD process. Thus we provide four useful performance measures of the PBFT consensus mechanism, and can numerically calculate the four performance measures. Finally, we use some numerical examples to verify the validity of our theoretical results, and show how the four performance measures are influenced by some key parameters of the PBFT consensus. By means of the theory of multi-dimensional Markov processes, we are optimistic that the methodology and results given in this paper are applicable in a wide range research of PBFT consensus mechanism and even other types of consensus mechanisms.

Cryptography and Security Databases Performance

comments

Fetching comments

Sohag University

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

A Fault-Tolerance Shim for Serverless Computing

Ask ChatGPT about the research

No Arabic abstract

Read More