
Just-Right Consistency: reconciling availability and safety

Added by: Marc Shapiro
Publication date: 2018
Language: English
Authors: Marc Shapiro





By the CAP theorem, a distributed data storage system can ensure either Consistency under Partition (CP) or Availability under Partition (AP), but not both. This has led to a split between CP databases, in which updates are synchronous, and AP databases, where they are asynchronous. However, there is no inherent reason to treat all updates identically: the system should simply be as available as possible, and synchronised just enough for the application to be correct. We offer a principled Just-Right Consistency approach to designing such applications, reconciling correctness with availability and performance, based on the following insights:

(i) The Conflict-free Replicated Data Type (CRDT) data model supports asynchronous updates in an intuitive and principled way.
(ii) Invariants involving joint or mutually-ordered updates are compatible with AP and can be guaranteed by Transactional Causal Consistency, the strongest consistency model that does not compromise availability.

Regarding the remaining, CAP-sensitive invariants:

(iii) For the common pattern of Bounded Counters, we provide an encapsulated data type that is proven correct and efficient.
(iv) In the general case, static analysis can identify when synchronisation is not necessary for correctness.

Our Antidote cloud database system supports CRDTs, Transactional Causal Consistency and the Bounded Counter data type. Support tools help design applications through static analysis and proof of CAP-sensitive invariants. The system supports industrial-grade applications and has been tested experimentally with hundreds of servers across several geo-distributed data centres.
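To make insight (iii) concrete, the following is a minimal Python sketch of a bounded counter in the escrow style the paper builds on. The class name and the rule that a replica may only spend units it produced itself are illustrative simplifications; real Bounded Counters also transfer rights between replicas, which is elided here.

class BoundedCounter:
    """Sketch of a CRDT counter that preserves the invariant value >= 0
    without synchronisation, by charging decrements against local escrow."""

    def __init__(self, replica_id, replicas, initial=0):
        self.id = replica_id
        # Per-replica increment and decrement tallies (a PN-counter core).
        self.incs = {r: 0 for r in replicas}
        self.decs = {r: 0 for r in replicas}
        self.incs[replica_id] = initial  # seed rights for the initial stock

    def value(self):
        return sum(self.incs.values()) - sum(self.decs.values())

    def rights(self):
        # Escrow rule (simplified): spend only units this replica produced.
        return self.incs[self.id] - self.decs[self.id]

    def increment(self, n=1):
        self.incs[self.id] += n

    def decrement(self, n=1):
        if self.rights() < n:
            raise ValueError("insufficient local rights; synchronise or transfer")
        self.decs[self.id] += n

    def merge(self, other):
        # Join is entry-wise max: commutative, associative and idempotent,
        # so replicas converge regardless of delivery order.
        for r in self.incs:
            self.incs[r] = max(self.incs[r], other.incs[r])
            self.decs[r] = max(self.decs[r], other.decs[r])

Because every replica's decrements are bounded by its own increments, the global sum stays non-negative even when replicas merge in arbitrary orders.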




Related research

Batching is an essential technique for improving computational efficiency in deep learning frameworks. While batch processing for models with static feed-forward computation graphs is straightforward to implement, batching for dynamic computation graphs such as syntax trees or social network graphs is challenging because the computation graph structure varies across samples. Through simulation and analysis of a Tree-LSTM model, we show the key trade-off between graph analysis time and batching effectiveness in dynamic batching. Based on this finding, we propose a dynamic batching method as an extension to MXNet Gluon's just-in-time compilation (JIT) framework. We show empirically that our method yields up to a 6.25-fold speed-up on a common dynamic workload, a Tree-LSTM model for the semantic relatedness task.
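For intuition about the trade-off above, here is a minimal Python sketch of the depth-based grouping that dynamic batching relies on; the Node class and batch_by_depth helper are illustrative and are not the MXNet Gluon JIT API.

from collections import defaultdict

class Node:
    """A binary tree node; leaves carry a value, internal nodes two children."""
    def __init__(self, left=None, right=None, leaf_value=None):
        self.left, self.right, self.leaf_value = left, right, leaf_value
        self.depth = 0 if leaf_value is not None else 1 + max(left.depth, right.depth)

def batch_by_depth(roots):
    """Group nodes from many trees by depth so each level runs as one
    batched operation instead of one kernel launch per node."""
    levels = defaultdict(list)
    stack = list(roots)
    while stack:
        node = stack.pop()
        levels[node.depth].append(node)
        if node.leaf_value is None:
            stack.extend([node.left, node.right])
    # Shallow levels first: children are computed before their parents.
    return [levels[d] for d in sorted(levels)]

leaf = lambda v: Node(leaf_value=v)
trees = [Node(leaf("a"), leaf("b")), Node(Node(leaf("c"), leaf("d")), leaf("e"))]
print([len(level) for level in batch_by_depth(trees)])  # [5, 2, 1]

The graph analysis cost the paper measures corresponds to this grouping pass, which must be amortised by the larger per-level batches it enables.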
This paper introduces a unified model of consistency and isolation that minimizes the gap between how these guarantees are defined and how they are perceived. Our approach is premised on a simple observation: applications view storage systems as black boxes that transition through a series of states, a subset of which are observed by applications. For maximum clarity, isolation and consistency guarantees should be expressed as constraints on those states. Instead, these properties are currently expressed as constraints on operation histories that are not visible to the application. We show that adopting a state-based approach to expressing these guarantees brings several benefits. First, it makes it easier to focus on the anomalies that a given isolation or consistency level allows (and that applications must deal with), rather than those that it proscribes. Second, it unifies the often disparate theories of isolation and consistency and provides a structure for composing these guarantees. We leverage this modularity to apply to transactions (independently of the isolation level under which they execute) the equivalence between causal consistency and session guarantees that Chockler et al. had proved for single operations. Third, it brings clarity to the increasingly crowded field of proposed consistency and isolation properties by winnowing spurious distinctions: we find that the recently proposed parallel snapshot isolation introduced by Sovran et al. is in fact a specific implementation of an older guarantee, lazy consistency (or PL-2+), introduced by Adya et al.
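As a toy illustration of the state-based view, assume each storage state an application observes carries a single version number (a deliberate simplification; the function name and state representation below are hypothetical). A session guarantee such as monotonic reads then becomes a direct predicate over the observed states, rather than over an operation history the application never sees.

def satisfies_monotonic_reads(observed_states):
    """Monotonic reads as a constraint on observed states: each state a
    session observes is at least as recent as the previous one."""
    versions = [s["version"] for s in observed_states]
    return all(a <= b for a, b in zip(versions, versions[1:]))

# The second read observes an older state, so the guarantee is violated.
history = [{"version": 3}, {"version": 2}, {"version": 4}]
assert not satisfies_monotonic_reads(history)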
C. S. Aulakh and K. Benakli, 1997
We construct the minimal supersymmetric left-right theory and show that at the renormalizable level it requires the existence of an intermediate $B-L$ breaking scale. The subsequent symmetry breaking down to MSSM automatically preserves R-symmetry. Furthermore, unlike in the nonsupersymmetric version of the theory, the see-saw mechanism takes its canonical form. The theory predicts the existence of a triplet of Higgs scalars much lighter than the $B-L$ breaking scale.
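For reference, the canonical (type-I) see-saw form mentioned above gives the light neutrino mass matrix as

$$ m_\nu \simeq - m_D \, M_R^{-1} \, m_D^T , $$

where $m_D$ is the Dirac mass matrix and $M_R$ is the heavy Majorana mass matrix, here set by the $B-L$ breaking scale.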
Scientific computing workflows generate enormous amounts of distributed data that is short-lived, yet critical for job completion time. This class of data is called intermediate data. A common way to achieve high data availability is to replicate the data. However, the increasing scale of intermediate data generated in modern scientific applications demands new storage techniques to improve storage efficiency. Erasure codes, as an alternative, can use less storage space while maintaining similar data availability. In this paper, we adopt erasure codes for storing intermediate data and compare their performance with replication. We also use the metric of Mean Time To Data Loss (MTTDL) to estimate the lifetime of intermediate data. We propose an algorithm that proactively relocates data redundancy from vulnerable machines to reliable ones, improving data availability at the cost of some extra network overhead. Furthermore, we propose an algorithm that places the redundancy units of a data item physically close to each other on the network, reducing the network bandwidth needed to reconstruct the data when it is accessed.
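As a back-of-the-envelope check on the storage argument above, here is a short Python comparison assuming 3-way replication versus a hypothetical (k = 6, m = 3) erasure code; the parameters are illustrative, not taken from the paper.

def storage_overhead(replicas=3, k=6, m=3):
    """Raw bytes stored per logical byte of intermediate data."""
    return {"replication": replicas, "erasure_code": (k + m) / k}

print(storage_overhead())  # {'replication': 3, 'erasure_code': 1.5}

Three-way replication tolerates two lost units at 3x storage, while the (6, 3) code tolerates any three lost units at 1.5x, which is the efficiency gap the paper exploits.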
Technology-assisted review (TAR) refers to iterative active learning workflows for document review in high-recall retrieval (HRR) tasks. TAR research and most commercial TAR software have applied linear models such as logistic regression or support vector machines to lexical features. Transformer-based models with supervised tuning have been found to improve effectiveness on many text classification tasks, suggesting their use in TAR. We indeed find that the pre-trained BERT model reduces review volume by 30% in TAR workflows simulated on the RCV1-v2 newswire collection. In contrast, we find that linear models outperform BERT for simulated legal discovery topics on the Jeb Bush e-mail collection. This suggests that the match between transformer pre-training corpora and the task domain is more important than generally appreciated. Additionally, we show that just-right language model fine-tuning on the task collection before starting active learning is critical: either too little or too much fine-tuning results in performance worse than that of linear models, even on RCV1-v2.
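To sketch what one round of such a TAR workflow looks like with a linear model, here is a minimal, self-contained Python example using scikit-learn; the four documents, the seed judgments and the keyword rule that simulates human relevance feedback are all placeholders.

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

docs = ["contract breach notice", "lunch menu", "merger agreement draft", "holiday party"]
labels = {0: 1, 1: 0}  # seed judgments: document index -> relevant (1) or not (0)

X = TfidfVectorizer().fit_transform(docs)
for _ in range(2):  # two review rounds
    judged = list(labels)
    model = LogisticRegression().fit(X[judged], [labels[i] for i in judged])
    scores = model.predict_proba(X)[:, 1]
    # Route the highest-scoring unjudged document to (simulated) review.
    unjudged = [i for i in np.argsort(-scores) if i not in labels]
    if not unjudged:
        break
    nxt = unjudged[0]
    labels[nxt] = int("merger" in docs[nxt] or "contract" in docs[nxt])

Each round retrains on all judgments so far and prioritises the documents most likely to be relevant; this is the active learning loop that fine-tuned BERT either helps or hurts depending on domain match.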
