Scheduling OLTP Transactions via Machine Learning

107 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Anthony Tomasic

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Yangjun Sheng - Anthony Tomasic - Tieying Zhang

قواعد البيانات

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Current main memory database system architectures are still challenged by high contention workloads and this challenge will continue to grow as the number of cores in processors continues to increase. These systems schedule transactions randomly across cores to maximize concurrency and to produce a uniform load across cores. Scheduling never considers potential conflicts. Performance could be improved if scheduling balanced between concurrency to maximize throughput and scheduling transactions linearly to avoid conflicts. In this paper, we present the design of several intelligent transaction scheduling algorithms that consider both potential transaction conflicts and concurrency. To incorporate reasoning about transaction conflicts, we develop a supervised machine learning model that estimates the probability of conflict. This model is incorporated into several scheduling algorithms. In addition, we integrate an unsupervised machine learning algorithm into an intelligent scheduling algorithm. We then empirically measure the performance impact of different scheduling algorithms on OLTP and social networking workloads. Our results show that, with appropriate settings, intelligent scheduling can increase throughput by 54% and reduce abort rate by 80% on a 20-core machine, relative to random scheduling. In summary, the paper provides preliminary evidence that intelligent scheduling significantly improves DBMS performance.

قيم البحث

76 - Guna Prasaad , Alvin Cheung , Dan Suciu 2018

Research in transaction processing has made significant progress in improving the performance of multi-core in-memory transactional systems. However, the focus has mainly been on low-contention workloads. Modern transactional systems perform poorly o n workloads with transactions accessing a few highly contended data items. We observe that most transactional workloads, including those with high contention, can be divided into clusters of data conflict-free transactions and a small set of residuals. In this paper, we introduce a new concurrency control protocol called Strife that leverages the above observation. Strife executes transactions in batches, where each batch is partitioned into clusters of conflict-free transactions and a small set of residual transactions. The conflict-free clusters are executed in parallel without any concurrency control, followed by executing the residual cluster either serially or with concurrency control. We present a low-overhead algorithm that partitions a batch of transactions into clusters that do not have cross-cluster conflicts and a small residual cluster. We evaluate Strife against the optimistic concurrency control protocol and several variants of two-phase locking, where the latter is known to perform better than other concurrency protocols under high contention, and show that Strife can improve transactional throughput by up to 2x. We also perform an in-depth micro-benchmark analysis to empirically characterize the performance and quality of our clustering algorithm

قواعد البيانات

Polyjuice: High-Performance Transactions via Learned Concurrency Control

123 - Jiachen Wang , Ding Ding , Huan Wang 2021

Concurrency control algorithms are key determinants of the performance of in-memory databases. Existing algorithms are designed to work well for certain workloads. For example, optimistic concurrency control (OCC) is better than two-phase-locking (2P L) under low contention, while the converse is true under high contention. To adapt to different workloads, prior works mix or switch between a few known algorithms using manual insights or simple heuristics. We propose a learning-based framework that instead explicitly optimizes concurrency control via offline training to maximize performance. Instead of choosing among a small number of known algorithms, our approach searches in a policy space of fine-grained actions, resulting in novel algorithms that can outperform existing algorithms by specializing to a given workload. We build Polyjuice based on our learning framework and evaluate it against several existing algorithms. Under different configurations of TPC-C and TPC-E, Polyjuice can achieve throughput numbers higher than the best of existing algorithms by 15% to 56%.

قواعد البيانات

L-Store: A Real-time OLTP and OLAP System

85 - Mohammad Sadoghi , Souvik Bhattacherjee , Bishwaranjan Bhattacharjee 2016

Arguably data is the new natural resource in the enterprise world with an unprecedented degree of proliferation. But to derive real-time actionable insights from the data, it is important to bridge the gap between managing the data that is being upda ted at a high velocity (i.e., OLTP) and analyzing a large volume of data (i.e., OLAP). However, there has been a divide where specialized solutions were often deployed to support either OLTP or OLAP workloads but not both; thus, limiting the analysis to stale and possibly irrelevant data. In this paper, we present Lineage-based Data Store (L-Store) that combines the real-time processing of transactional and analytical workloads within a single unified engine by introducing a novel lineage-based storage architecture. By exploiting the lineage, we develop a contention-free and lazy staging of columnar data from a write-optimized form (suitable for OLTP) into a read-optimized form (suitable for OLAP) in a transactionally consistent approach that also supports querying and retaining the current and historic data. Our working prototype of L-Store demonstrates its superiority compared to state-of-the-art approaches under a comprehensive experimental evaluation.

قواعد البيانات

Design Principles for Scaling Multi-core OLTP Under High Contention

90 - Kun Ren , Jose M. Faleiro , Daniel J. Abadi 2015

Although significant recent progress has been made in improving the multi-core scalability of high throughput transactional database systems, modern systems still fail to achieve scalable throughput for workloads involving frequent access to highly c ontended data. Most of this inability to achieve high throughput is explained by the fundamental constraints involved in guaranteeing ACID --- the addition of cores results in more concurrent transactions accessing the same contended data for which access must be serialized in order to guarantee isolation. Thus, linear scalability for contended workloads is impossible. However, there exist flaws in many modern architectures that exacerbate their poor scalability, and result in throughput that is much worse than fundamentally required by the workload. In this paper we identify two prevalent design principles that limit the multi-core scalability of many (but not all) transactional database systems on contended workloads: the multi-purpose nature of execution threads in these systems, and the lack of advanced planning of data access. We demonstrate the deleterious results of these design principles by implementing a prototype system, ORTHRUS, that is motivated by the principles of separation of database component functionality and advanced planning of transactions. We find that these two principles alone result in significantly improved scalability on high-contention workloads, and an order of magnitude increase in throughput for a non-trivial subset of these contended workloads.

قواعد البيانات النظم الموزعة والتوازية والحوسبة العنقودية

G-Tran: Making Distributed Graph Transactions Fast

98 - Hongzhi Chen , Changji Li , Chenguang Zheng 2021

Graph transaction processing raises many unique challenges such as random data access due to the irregularity of graph structures, low throughput and high abort rate due to the relatively large read/write sets in graph transactions. To address these challenges, we present G-Tran -- an RDMA-enabled distributed in-memory graph database with serializable and snapshot isolation support. First, we propose a graph-native data store to achieve good data locality and fast data access for transactional updates and queries. Second, G-Tran adopts a fully decentralized architecture that leverages RDMA to process distributed transactions with the MPP model, which can achieve high performance by utilizing all computing resources. In addition, we propose a new MV-OCC implementation with two optimizations to address the issue of large read/write sets in graph transactions. Extensive experiments show that G-Tran achieves competitive performance compared with other popular graph databases on benchmark workloads.

قواعد البيانات النظم الموزعة والتوازية والحوسبة العنقودية

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة الأندلس للعلوم الطبية

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Scheduling OLTP Transactions via Machine Learning

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً