ترغب بنشر مسار تعليمي؟ اضغط هنا

LevelHeaded: Making Worst-Case Optimal Joins Work in the Common Case

74   0   0.0 ( 0 )
 نشر من قبل Christopher Aberger
 تاريخ النشر 2017
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Pipelines combining SQL-style business intelligence (BI) queries and linear algebra (LA) are becoming increasingly common in industry. As a result, there is a growing need to unify these workloads in a single framework. Unfortunately, existing solutions either sacrifice the inherent benefits of exclusively using a relational database (e.g. logical and physical independence) or incur orders of magnitude performance gaps compared to specialized engines (or both). In this work we study applying a new type of query processing architecture to standard BI and LA benchmarks. To do this we present a new in-memory query processing engine called LevelHeaded. LevelHeaded uses worst-case optimal joins as its core execution mechanism for both BI and LA queries. With LevelHeaded, we show how crucial optimizations for BI and LA queries can be captured in a worst-case optimal query architecture. Using these optimizations, LevelHeaded outperforms other relational database engines (LogicBlox, MonetDB, and HyPer) by orders of magnitude on standard LA benchmarks, while performing on average within 31% of the best-of-breed BI (HyPer) and LA (Intel MKL) solutions on their own benchmarks. Our results show that such a single query processing architecture is capable of delivering competitive performance on both BI and LA queries.



قيم البحث

اقرأ أيضاً

We consider the problem of incrementally maintaining the triangle count query under single-tuple updates to the input relations. We introduce an approach that exhibits a space-time tradeoff such that the space-time product is quadratic in the size of the input database and the update time can be as low as the square root of this size. This lowest update time is worst-case optimal conditioned on the Online Matrix-Vector Multiplication conjecture. The classical and factorized incremental view maintenance approaches are recovered as special cases of our approach within the space-time tradeoff. In particular, they require linear-time update maintenance, which is suboptimal. Our approach also recovers the worst-case optimal time complexity for computing the triangle count in the non-incremental setting.
We derive an equality for non-equilibrium statistical mechanics in finite-dimensional quantum systems. The equality concerns the worst-case work output of a time-dependent Hamiltonian protocol in the presence of a Markovian heat bath. It has has the form worst-case work = penalty - optimum. The equality holds for all rates of changing the Hamiltonian and can be used to derive the optimum by setting the penalty to 0. The optimum term contains the max entropy of the initial state, rather than the von Neumann entropy, thus recovering recent results from single-shot statistical mechanics. Energy coherences can arise during the protocol but are assumed not to be present initially. We apply the equality to an electron box.
We provide the solution for a fundamental problem of geometric optimization by giving a complete characterization of worst-case optimal disk coverings of rectangles: For any $lambdageq 1$, the critical covering area $A^*(lambda)$ is the minimum value for which any set of disks with total area at least $A^*(lambda)$ can cover a rectangle of dimensions $lambdatimes 1$. We show that there is a threshold value $lambda_2 = sqrt{sqrt{7}/2 - 1/4} approx 1.035797ldots$, such that for $lambda<lambda_2$ the critical covering area $A^*(lambda)$ is $A^*(lambda)=3pileft(frac{lambda^2}{16} +frac{5}{32} + frac{9}{256lambda^2}right)$, and for $lambdageq lambda_2$, the critical area is $A^*(lambda)=pi(lambda^2+2)/4$; these values are tight. For the special case $lambda=1$, i.e., for covering a unit square, the critical covering area is $frac{195pi}{256}approx 2.39301ldots$. The proof uses a careful combination of manual and automatic analysis, demonstrating the power of the employed interval arithmetic technique.
We provide a tight result for a fundamental problem arising from packing disks into a circular container: The critical density of packing disks in a disk is 0.5. This implies that any set of (not necessarily equal) disks of total area $deltaleq 1/2$ can always be packed into a disk of area 1; on the other hand, for any $varepsilon>0$ there are sets of disks of area $1/2+varepsilon$ that cannot be packed. The proof uses a careful manual analysis, complemented by a minor automatic part that is based on interval arithmetic. Beyond the basic mathematical importance, our result is also useful as a blackbox lemma for the analysis of recursive packing algorithms.
Understanding the semantics of tables at scale is crucial for tasks like data integration, preparation, and search. Table understanding methods aim at detecting a tables topic, semantic column types, column relations, or entities. With the rise of de ep learning, powerful models have been developed for these tasks with excellent accuracy on benchmarks. However, we observe that there exists a gap between the performance of these models on these benchmarks and their applicability in practice. In this paper, we address the question: what do we need for these models to work in practice? We discuss three challenges of deploying table understanding models and propose a framework to address them. These challenges include 1) difficulty in customizing models to specific domains, 2) lack of training data for typical database tables often found in enterprises, and 3) lack of confidence in the inferences made by models. We present SigmaTyper which implements this framework for the semantic column type detection task. SigmaTyper encapsulates a hybrid model trained on GitTables and integrates a lightweight human-in-the-loop approach to customize the model. Lastly, we highlight avenues for future research that further close the gap towards making table understanding effective in practice.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا