Guaranteeing Recoverability via Partially Constrained Transaction Logs

Added by Huan Zhou

Publication date 2019

Field: Informatics Engineering

Language: English

Authors H. Zhou - J. W. Guo - H. Q. Hu

Databases

Transaction logging is essential for guaranteeing atomicity and durability in online transaction processing (OLTP) systems. It has a considerable impact on performance, especially in in-memory database systems. Conventional logging implementations rely heavily on a centralized design, which guarantees correct recovery by enforcing a total order on all operations, such as log sequence number (LSN) allocation, log persistence, transaction commit, and recovery. This strict sequential constraint seriously limits the scalability and parallelism of transaction logging and recovery, especially on multi-core hardware. In this paper, we define recoverability for transaction logging and demonstrate its correctness for crash recovery. Based on recoverability, we propose a recoverable logging scheme named Poplar, which enables scalable and parallel log processing by relaxing these ordering restrictions. Its main advantages are that (1) Poplar enables parallel log persistence on multiple storage devices; (2) it replaces centralized LSN allocation with a partially ordered sequence number computed in a distributed manner, so that log records track only read-after-write (RAW) and write-after-write (WAW) dependencies among transactions; (3) it requires only transactions with RAW dependencies to be committed in serial order; (4) Poplar can concurrently restore a consistent database state from the partially constrained logs after a crash. Experimental results show that Poplar scales well as the number of I/O devices increases and outperforms other logging approaches on both SSDs and emulated non-volatile memory.
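The idea of a sequence number that is only partially ordered can be illustrated with a minimal Python sketch. This is not Poplar's actual algorithm, which the abstract does not spell out; it assumes that each record remembers the sequence number of its last writer, and the names `Record` and `assign_seq` are hypothetical. Log buffers, persistence, and commit ordering are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """A stored tuple plus the sequence number of its last writer (hypothetical layout)."""
    value: int = 0
    seq: int = 0


def assign_seq(records, read_keys, write_keys):
    """Return a number larger than that of every transaction this one depends on via RAW or WAW."""
    # A read depends on the record's last writer (RAW); a write overwrites the last writer (WAW).
    deps = [records[k].seq for k in (*read_keys, *write_keys)]
    seq = max(deps, default=0) + 1
    # Only writes stamp the record, so a later writer is not ordered after
    # earlier readers: write-after-read dependencies are deliberately untracked.
    for k in write_keys:
        records[k].seq = seq
    return seq


records = {"x": Record(), "y": Record(), "z": Record()}
t1 = assign_seq(records, read_keys=[], write_keys=["x"])     # first writer of x
t2 = assign_seq(records, read_keys=["x"], write_keys=["y"])  # RAW dependency on t1
t3 = assign_seq(records, read_keys=[], write_keys=["x"])     # WAW on t1, only WAR on t2
t4 = assign_seq(records, read_keys=[], write_keys=["z"])     # touches nothing the others touch
```

In this sketch `t2` and `t3` both receive numbers larger than `t1`, while `t2` and `t3` are not ordered relative to each other, and `t4` is not ordered relative to any of them. No shared counter is consulted, which is the property that removes the centralized allocation bottleneck.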

Improving High Contention OLTP Performance via Transaction Scheduling

Guna Prasaad, Alvin Cheung, Dan Suciu (2018)

Research in transaction processing has made significant progress in improving the performance of multi-core in-memory transactional systems. However, the focus has mainly been on low-contention workloads. Modern transactional systems perform poorly on workloads in which transactions access a few highly contended data items. We observe that most transactional workloads, including those with high contention, can be divided into clusters of data-conflict-free transactions and a small set of residuals. In this paper, we introduce a new concurrency control protocol called Strife that leverages this observation. Strife executes transactions in batches, where each batch is partitioned into clusters of conflict-free transactions and a small set of residual transactions. The conflict-free clusters are executed in parallel without any concurrency control, and the residual cluster is then executed either serially or with concurrency control. We present a low-overhead algorithm that partitions a batch of transactions into clusters that have no cross-cluster conflicts and a small residual cluster. We evaluate Strife against the optimistic concurrency control protocol and several variants of two-phase locking, the latter being known to perform better than other concurrency protocols under high contention, and show that Strife can improve transactional throughput by up to 2x. We also perform an in-depth micro-benchmark analysis to empirically characterize the performance and quality of our clustering algorithm.
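To make the notion of clusters without cross-cluster conflicts concrete, here is a small Python sketch that groups a batch by shared data items using union-find. It is not Strife's partitioning algorithm, which the abstract describes only at a high level and which also produces a residual cluster; this simplified version treats any shared item as a conflict (reads included) and never assigns a transaction to a residual set. The function name `conflict_free_clusters` and the batch layout are assumptions made for illustration.

```python
def conflict_free_clusters(batch):
    """Group transactions so that no data item is accessed from two different groups.

    batch: dict mapping a transaction id to the set of data items it accesses.
    Returns a list of sets of transaction ids.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        parent[find(a)] = find(b)

    # Link every transaction to each item it touches; transactions that share
    # an item, directly or transitively, end up under the same root.
    for txn, items in batch.items():
        find(("txn", txn))
        for item in items:
            union(("txn", txn), ("item", item))

    clusters = {}
    for txn in batch:
        clusters.setdefault(find(("txn", txn)), set()).add(txn)
    return list(clusters.values())


batch = {"t1": {"a", "b"}, "t2": {"b"}, "t3": {"c"}}
clusters = conflict_free_clusters(batch)
```

Here `t1` and `t2` share item `b` and land in one cluster, while `t3` forms its own, so the two clusters could run in parallel without coordinating. Under high contention this plain connected-components grouping tends to collapse into one large cluster, which is the situation a residual set is meant to avoid.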

Databases

Blockchain Transaction Processing

Suyash Gupta, Mohammad Sadoghi (2021)

A blockchain is an append-only linked list of blocks, which is maintained at each participating node. Each block records a set of transactions and their associated metadata. Blockchain transactions act on the identical ledger data stored at each node. Blockchain was first conceived by Satoshi Nakamoto as a peer-to-peer digital-commodity (also known as crypto-currency) exchange system. Blockchains gained traction due to their inherent property of immutability: once a block is accepted, it cannot be reverted.
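The append-only linked list of blocks can be sketched in a few lines of Python. The abstract does not say how blocks are linked; the sketch assumes the common design in which each block stores a SHA-256 hash of its predecessor, and the field names (`index`, `prev_hash`, `transactions`) are hypothetical. Consensus among nodes is not modeled.

```python
import hashlib
import json


def block_hash(block):
    """Hash a block's full contents, including its link to the previous block."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def append_block(chain, transactions):
    """Append a block that points at the current tail of the chain."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "prev_hash": prev_hash, "transactions": transactions})


def is_valid(chain):
    """Check that every block still matches the hash stored in its successor."""
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1]) for i in range(1, len(chain)))


chain = []
append_block(chain, ["alice pays bob 5"])
append_block(chain, ["bob pays carol 2"])
append_block(chain, ["carol pays dave 1"])
valid_before = is_valid(chain)

# Rewriting an accepted block breaks the link stored in the block after it.
chain[0]["transactions"] = ["alice pays bob 500"]
valid_after = is_valid(chain)
```

In this sketch a change to any block except the last is detectable, because the following block records the old hash. Protecting the tail requires agreement among the participating nodes, which is outside the scope of the example.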

Databases Cryptography and Security Distributed Parallel and Cluster Computing

Mining Precision Interfaces From Query Logs

Qianrui Zhang, Haoci Zhang, Thibault Sellam (2019)

Interactive tools make data analysis more efficient and more accessible to end-users by hiding the underlying query complexity and exposing interactive widgets for the parts of the query that matter to the analysis. However, creating custom-tailored (i.e., precise) interfaces is very costly, and automated approaches are desirable. We propose a syntactic approach that uses queries from an analysis to generate a tailored interface. We model interface widgets as functions I(q) -> q that modify the current analysis query q, and an interface as the set of queries that its widgets can express. Our system, Precision Interfaces, analyzes structural changes between input queries from an analysis, and generates an output interface with widgets to express those changes. Our experiments on the Sloan Digital Sky Survey query log suggest that Precision Interfaces can generate useful interfaces for simple unanticipated tasks, and our optimizations can generate interfaces from logs of up to 10,000 queries in <10s.

Databases

Mining Precision Interfaces From Query Logs

Haoci Zhang, Thibault Sellam, Eugene Wu (2017)

Interactive tools make data analysis both more efficient and more accessible to a broad population. Simple interfaces such as Google Finance as well as complex visual exploration interfaces such as Tableau are effective because they are tailored to the desired user tasks. Yet, designing interactive interfaces requires technical expertise and domain knowledge. Experts are scarce and expensive, and therefore it is currently infeasible to provide tailored (or precise) interfaces for every user and every task. We envision a data-driven approach to generate tailored interactive interfaces. We observe that interactive interfaces are designed to express sets of programs; thus, samples of programs (increasingly collected by data systems) may help us build interactive interfaces. Based on this idea, Precision Interfaces is a language-agnostic system that examines an input query log, identifies how the queries structurally change, and generates interactive web interfaces to express these changes. The focus of this paper is on applying this idea to logs of structured queries. Our experiments show that Precision Interfaces can support multiple query languages (SQL and SPARQL), derive Tableau's salient interaction components from OLAP queries, analyze <75k queries in <12 minutes, and generate interaction designs that improve upon existing interfaces and are comparable to human-crafted interfaces.

Databases

80 New Packages to Mine Database Query Logs

Thibault Sellam, Martin Kersten (2017)

The query log of a DBMS is a powerful resource. It enables many practical applications, including query optimization and user experience enhancement. And yet, mining SQL queries is a difficult task. The fundamental problem is that queries are symbolic objects, not vectors of numbers. Therefore, many popular statistical concepts, such as means, regression, or decision trees, do not apply. Most authors limit themselves to ad hoc algorithms or approaches based on neighborhoods, such as k-nearest neighbors. Our project is to challenge this limitation. We introduce methods to manipulate SQL queries as if they were vectors, thereby unlocking the whole statistical toolbox. We present three families of methods: feature maps, kernel methods, and Bayesian models. The first technique directly encodes queries into vectors. The second one transforms the queries implicitly. The last one exploits probabilistic graphical models as an alternative to vector spaces. We present the benefits and drawbacks of each solution, highlight how they relate to each other, and make the case for future investigation.
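As a small illustration of the first family, the Python sketch below encodes SQL queries as token-count vectors over a shared vocabulary. It is one of the simplest possible feature maps and is not taken from the paper; the regular-expression tokenizer and the names `tokenize` and `feature_map` are assumptions chosen for the example.

```python
import re
from collections import Counter


def tokenize(sql):
    """Split a query into lowercase word tokens and single punctuation characters."""
    return re.findall(r"\w+|[^\w\s]", sql.lower())


def feature_map(queries):
    """Return a sorted vocabulary and one token-count vector per query."""
    counts = [Counter(tokenize(q)) for q in queries]
    vocab = sorted(set().union(*counts))
    vectors = [[c[token] for token in vocab] for c in counts]
    return vocab, vectors


log = ["SELECT a FROM t", "SELECT b FROM t"]
vocab, vectors = feature_map(log)

# With queries as vectors, ordinary statistics such as a mean become well defined.
mean_vector = [sum(column) / len(vectors) for column in zip(*vectors)]
```

Once the queries are vectors, the mean in the last line is an ordinary arithmetic operation, which is the kind of statistical tool the abstract says symbolic queries normally rule out. A bag of tokens ignores query structure, so richer encodings would be needed for most practical uses.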

Databases