On Barriers and the Gap between Active and Passive Replication (Full Version)


Added by Marco Serafini

Publication date 2013

Field: Informatics Engineering

Research language: English

Authors Flavio P. Junqueira - Marco Serafini

Distributed Parallel and Cluster Computing

Active replication is commonly built on top of the atomic broadcast primitive. Passive replication, which has been recently used in the popular ZooKeeper coordination system, can be naturally built on top of the primary-order atomic broadcast primitive. Passive replication differs from active replication in that it requires processes to cross a barrier before they become primaries and start broadcasting messages. In this paper, we propose a barrier function tau that explains and encapsulates the differences between existing primary-order atomic broadcast algorithms, namely semi-passive replication and ZooKeeper atomic broadcast (Zab), as well as the differences between Paxos and Zab. We also show that implementing primary-order atomic broadcast on top of a generic consensus primitive and tau inherently results in higher time complexity than atomic broadcast, as witnessed by existing algorithms. We overcome this problem by presenting an alternative primary-order atomic broadcast implementation that builds on top of a generic consensus primitive and uses consensus itself to form a barrier. This algorithm is modular and matches the time complexity of existing tau-based algorithms.


Efficient Replication via Timestamp Stability (Extended Version)

Vitor Enes, Carlos Baquero, Alexey Gotsman, 2021

Modern web applications replicate their data across the globe and require strong consistency guarantees for their most critical data. These guarantees are usually provided via state-machine replication (SMR). Recent advances in SMR have focused on leaderless protocols, which improve the availability and performance of traditional Paxos-based solutions. We propose Tempo - a leaderless SMR protocol that, in comparison to prior solutions, achieves superior throughput and offers predictable performance even in contended workloads. To achieve these benefits, Tempo timestamps each application command and executes it only after the timestamp becomes stable, i.e., all commands with a lower timestamp are known. Both the timestamping and stability detection mechanisms are fully decentralized, thus obviating the need for a leader replica. Our protocol furthermore generalizes to partial replication settings, enabling scalability in highly parallel workloads. We evaluate the protocol in both real and simulated geo-distributed environments and demonstrate that it outperforms state-of-the-art alternatives.
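The stability rule in the abstract above (execute a command only once every command with a lower timestamp is known) can be illustrated with a toy sketch. This is not Tempo's protocol: the `StableExecutor` class and the `watermark` argument are hypothetical names, and the sketch simply assumes that some external mechanism supplies a watermark below which no new command can appear. How that watermark is computed without a leader is the part the paper addresses and is not modeled here.

```python
import heapq


class StableExecutor:
    """Toy model: buffer timestamped commands, execute them once stable.

    Assumption (not from the source): the caller supplies a watermark w
    such that every command with timestamp <= w is already known.
    """

    def __init__(self):
        self._pending = []   # min-heap of (timestamp, command_id)
        self.executed = []   # command ids in execution order

    def submit(self, timestamp, command_id):
        """Record a command that has been assigned a timestamp."""
        heapq.heappush(self._pending, (timestamp, command_id))

    def advance(self, watermark):
        """Execute, in timestamp order, every command at or below the watermark."""
        while self._pending and self._pending[0][0] <= watermark:
            _, command_id = heapq.heappop(self._pending)
            self.executed.append(command_id)
        return list(self.executed)


executor = StableExecutor()
executor.submit(3, "c")
executor.submit(1, "a")
executor.submit(2, "b")
first = executor.advance(2)    # timestamps 1 and 2 are stable, 3 is not
second = executor.advance(5)   # now timestamp 3 is stable as well
```

Commands arrive out of order but are executed in timestamp order, and a command above the watermark stays buffered until the watermark passes it.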

Distributed Parallel and Cluster Computing

State-Machine Replication for Planet-Scale Systems (Extended Version)

Vitor Enes, Carlos Baquero, Tuanir França Rezende, 2020

Online applications now routinely replicate their data at multiple sites around the world. In this paper we present Atlas, the first state-machine replication protocol tailored for such planet-scale systems. Atlas does not rely on a distinguished leader, so clients enjoy the same quality of service independently of their geographical locations. Furthermore, client-perceived latency improves as we add sites closer to clients. To achieve this, Atlas minimizes the size of its quorums using an observation that concurrent data center failures are rare. It also processes a high percentage of accesses in a single round trip, even when these conflict. We experimentally demonstrate that Atlas consistently outperforms state-of-the-art protocols in planet-scale scenarios. In particular, Atlas is up to two times faster than Flexible Paxos with identical failure assumptions, and more than doubles the performance of Egalitarian Paxos in the YCSB benchmark.

Distributed Parallel and Cluster Computing

Modelling cytoskeletal traffic: an interplay between passive diffusion and active transport

I. Neri, N. Kern, A. Parmeggiani, 2012

We introduce the totally asymmetric exclusion process with Langmuir kinetics (TASEP-LK) on a network as a microscopic model for active motor protein transport on the cytoskeleton, immersed in the diffusive cytoplasm. We discuss how the interplay between active transport along a network and infinite diffusion in a bulk reservoir leads to a heterogeneous matter distribution on various scales. We find three regimes for steady state transport, corresponding to the scale of the network, of individual segments or local to sites. At low exchange rates strong density heterogeneities develop between different segments in the network. In this regime one has to consider the topological complexity of the whole network to describe transport. In contrast, at moderate exchange rates the transport through the network decouples, and the physics is determined by single segments and the local topology. Finally, for very high exchange rates the homogeneous Langmuir process dominates the stationary state. We introduce effective rate diagrams for the network to identify these different regimes. Based on this method we develop an intuitive but generic picture of how the stationary state of excluded volume processes on complex networks can be understood in terms of the single-segment phase diagram.

Statistical Mechanics, Soft Condensed Matter, Cellular Automata and Lattice Gases

Hermes: a Fast, Fault-Tolerant and Linearizable Replication Protocol

A. Katsarakis, 2020

Today's datacenter applications are underpinned by datastores that are responsible for providing availability, consistency, and performance. For high availability in the presence of failures, these datastores replicate data across several nodes. This is accomplished with the help of a reliable replication protocol that is responsible for keeping the replicas strongly consistent even when faults occur. Strong consistency is preferred to weaker consistency models that cannot guarantee an intuitive behavior for the clients. Furthermore, to accommodate high demand at real-time latencies, datastores must deliver high throughput and low latency. This work introduces Hermes, a broadcast-based reliable replication protocol for in-memory datastores that provides both high throughput and low latency by enabling local reads and fully-concurrent fast writes at all replicas. Hermes couples logical timestamps with cache-coherence-inspired invalidations to guarantee linearizability, avoid write serialization at a centralized ordering point, resolve write conflicts locally at each replica (hence ensuring that writes never abort) and provide fault-tolerance via replayable writes. Our implementation of Hermes over an RDMA-enabled reliable datastore with five replicas shows that Hermes consistently achieves higher throughput than state-of-the-art RDMA-based reliable protocols (ZAB and CRAQ) across all write ratios while also significantly reducing tail latency. At 5% writes, the tail latency of Hermes is 3.6X lower than that of CRAQ and ZAB.

Distributed Parallel and Cluster Computing

Synergy via Redundancy: Adaptive Replication Strategies and Fundamental Limits

Gauri Joshi, Dhruva Kaushal, 2020

The maximum possible throughput (or the rate of job completion) of a multi-server system is typically the sum of the service rates of individual servers. Recent work shows that launching multiple replicas of a job and canceling them as soon as one copy finishes can boost the throughput, especially when the service time distribution has high variability. This means that redundancy can, in fact, create synergy among servers such that their overall throughput is greater than the sum of individual servers. This work seeks to find the fundamental limit of the throughput boost achieved by job replication and the optimal replication policy to achieve it. While most previous works consider upfront replication policies, we expand the set of possible policies to delayed launch of replicas. The search for the optimal adaptive replication policy can be formulated as a Markov Decision Process, using which we propose two myopic replication policies, MaxRate and AdaRep, to adaptively replicate jobs. In order to quantify the optimality gap of these and other policies, we derive upper bounds on the service capacity, which provide fundamental limits on the throughput of queueing systems with redundancy.
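The claim that replication with cancellation can raise throughput when service times are highly variable can be checked on a small worked example. The sketch below is not the paper's MaxRate or AdaRep policy. It assumes a hypothetical two-point service time (`fast` with probability `p_fast`, otherwise `slow`), replicas that start at the same time and are independent, free and instantaneous cancellation, and servers that are always busy. Under those assumptions a job replicated r ways costs r * E[min(X_1..X_r)] of server time instead of E[X].

```python
from fractions import Fraction


def mean_service_time(fast, slow, p_fast):
    """E[X] for a service time that is `fast` w.p. p_fast, else `slow`."""
    return p_fast * fast + (1 - p_fast) * slow


def mean_min_of_replicas(fast, slow, p_fast, r):
    """E[min(X_1..X_r)] for r i.i.d. replicas: slow only if all r are slow."""
    p_all_slow = (1 - p_fast) ** r
    return (1 - p_all_slow) * fast + p_all_slow * slow


def throughput_gain(fast, slow, p_fast, r):
    """Throughput with r-way replication divided by throughput without it.

    Server time per job is E[X] without replication and
    r * E[min(X_1..X_r)] with it, so the gain is their ratio.
    """
    with_replication = r * mean_min_of_replicas(fast, slow, p_fast, r)
    return mean_service_time(fast, slow, p_fast) / with_replication


p = Fraction(9, 10)
gain_variable = throughput_gain(1, 100, p, 2)  # 545/199, about 2.74
gain_constant = throughput_gain(1, 1, p, 2)    # 1/2: replication only wastes capacity
```

With the highly variable service time, two replicas nearly triple the throughput, while with a constant service time the same policy halves it. This matches the abstract's point that the benefit depends on the variability of the service time distribution.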

Distributed Parallel and Cluster Computing, Information Theory