Rethinking serializable multiversion concurrency control

530 0 0.0 ( 0 )

Download Cite

Added by Jose Faleiro

Publication date 2014

fields Informatics Engineering

and research's language is English

Authors Jose M. Faleiro - Daniel J. Abadi

Databases

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Multi-versioned database systems have the potential to significantly increase the amount of concurrency in transaction processing because they can avoid read-write conflicts. Unfortunately, the increase in concurrency usually comes at the cost of transaction serializability. If a database user requests full serializability, modern multi-versioned systems significantly constrain read-write concurrency among conflicting transactions and employ expensive synchronization patterns in their design. In main-memory multi-core settings, these additional constraints are so burdensome that multi-versioned systems are often significantly outperformed by single-version systems. We propose Bohm, a new concurrency control protocol for main-memory multi-versioned database systems. Bohm guarantees serializable execution while ensuring that reads never block writes. In addition, Bohm does not require reads to perform any book-keeping whatsoever, thereby avoiding the overhead of tracking reads via contended writes to shared memory. This leads to excellent scalability and performance in multi-core settings. Bohm has all the above characteristics without performing validation based concurrency control. Instead, it is pessimistic, and is therefore not prone to excessive aborts in the presence of contention. An experimental evaluation shows that Bohm performs well in both high contention and low contention settings, and is able to dramatically outperform state-of-the-art multi-versioned systems despite maintaining the full set of serializability guarantees.

rate research

Polyjuice: High-Performance Transactions via Learned Concurrency Control

123 - Jiachen Wang , Ding Ding , Huan Wang 2021

Concurrency control algorithms are key determinants of the performance of in-memory databases. Existing algorithms are designed to work well for certain workloads. For example, optimistic concurrency control (OCC) is better than two-phase-locking (2PL) under low contention, while the converse is true under high contention. To adapt to different workloads, prior works mix or switch between a few known algorithms using manual insights or simple heuristics. We propose a learning-based framework that instead explicitly optimizes concurrency control via offline training to maximize performance. Instead of choosing among a small number of known algorithms, our approach searches in a policy space of fine-grained actions, resulting in novel algorithms that can outperform existing algorithms by specializing to a given workload. We build Polyjuice based on our learning framework and evaluate it against several existing algorithms. Under different configurations of TPC-C and TPC-E, Polyjuice can achieve throughput numbers higher than the best of existing algorithms by 15% to 56%.

Databases

Multiversion Concurrency with Bounded Delay and Precise Garbage Collection

69 - Naama Ben-David , Guy E. Blelloch , Yihan Sun 2018

In this paper we are interested in bounding the number of instructions taken to process transactions. The main result is a multiversion transactional system that supports constant delay (extra instructions beyond running in isolation) for all read-only transactions, delay equal to the number of processes for writing transactions that are not concurrent with other writers, and lock-freedom for concurrent writers. The system supports precise garbage collection in th

Distributed Parallel and Cluster Computing Data Structures and Algorithms

DGCC:A New Dependency Graph based Concurrency Control Protocol for Multicore Database Systems

908 - Chang Yao , Divyakant Agrawal , Pengfei Chang 2015

Multicore CPUs and large memories are increasingly becoming the norm in modern computer systems. However, current database management systems (DBMSs) are generally ineffective in exploiting the parallelism of such systems. In particular, contention can lead to a dramatic fall in performance. In this paper, we propose a new concurrency control protocol called DGCC (Dependency Graph based Concurrency Control) that separates concurrency control from execution. DGCC builds dependency graphs for batched transactions before executing them. Using these graphs, contentions within the same batch of transactions are resolved before execution. As a result, the execution of the transactions does not need to deal with contention while maintaining full equivalence to that of serialized execution. This better exploits multicore hardware and achieves higher level of parallelism. To facilitate DGCC, we have also proposed a system architecture that does not have certain centralized control components yielding better scalability, as well as supports a more efficient recovery mechanism. Our extensive experimental study shows that DGCC achieves up to four times higher throughput compared to that of state-of-the-art concurrency control protocols for high contention workloads.

Databases

An Analysis of Concurrency Control Protocols for In-Memory Databases with CCBench (Extended Version)

107 - Takayuki Tanabe , Takashi Hoshino , Hideyuki Kawashima 2020

This paper presents yet another concurrency control analysis platform, CCBench. CCBench supports seven protocols (Silo, TicToc, MOCC, Cicada, SI, SI with latch-free SSN, 2PL) and seven versatile optimization methods and enables the configuration of seven workload parameters. We analyzed the protocols and optimization methods using various workload parameters and a thread count of 224. Previous studies focused on thread scalability and did not explore the space analyzed here. We classified the optimization methods on the basis of three performance factors: CPU cache, delay on conflict, and version lifetime. Analyses using CCBench and 224 threads, produced six insights. (I1) The performance of optimistic concurrency control protocol for a read only workload rapidly degrades as cardinality increases even without L3 cache misses. (I2) Silo can outperform TicToc for some write-intensive workloads by using invisible reads optimization. (I3) The effectiveness of two approaches to coping with conflict (wait and no-wait) depends on the situation. (I4) OCC reads the same record two or more times if a concurrent transaction interruption occurs, which can improve performance. (I5) Mixing different implementations is inappropriate for deep analysis. (I6) Even a state-of-the-art garbage collection method cannot improve the performance of multi-version protocols if there is a single long transaction mixed into the workload. On the basis of I4, we defined the read phase extension optimization in which an artificial delay is added to the read phase. On the basis of I6, we defined the aggressive garbage collection optimization in which even visibl

Databases

Concurrency Protocol Aiming at High Performance of Execution and Replay for Smart Contracts

154 - Shuaifeng Pang , Xiaodong Qi , Zhao Zhang 2019

Although the emergence of the programmable smart contract makes blockchain systems easily embrace a wider range of industrial areas, how to execute smart contracts efficiently becomes a big challenge nowadays. Due to the existence of Byzantine nodes, the mechanism of executing smart contracts is quite different from that in database systems, so that existing successful concurrency control protocols in database systems cannot be employed directly. Moreover, even though smart contract execution follows a two-phase style, i.e, the miner node executes a batch of smart contracts in the first phase and the validators replay them in the second phase, existing parallel solutions only focus on the optimization in the first phase, but not including the second phase. In this paper, we propose a novel efficient concurrency control scheme which is the first one to do optimization in both phases. Specifically, (i) in the first phase, we give a variant of OCC (Optimistic Concurrency Control) protocol based on {em batching} feature to improve the concurrent execution efficiency for the miner and produce a schedule log with high parallelism for validators. Also, a graph partition algorithm is devised to divide the original schedule log into small pieces and further reduce the communication cost; and (ii) in the second phase, we give a deterministic OCC protocol to replay all smart contracts efficiently on multi-core validators where all cores can replay smart contracts independently. Theoretical analysis and extensive experimental results illustrate that the proposed scheme outperforms state-of-art solutions significantly.

Databases Distributed Parallel and Cluster Computing