ﻻ يوجد ملخص باللغة العربية
Recursive query processing has experienced a recent resurgence, as a result of its use in many modern application domains, including data integration, graph analytics, security, program analysis, networking and decision making. Due to the large volumes of data being processed, several research efforts, across multiple communities, have explored how to scale up recursive queries, typically expressed in Datalog. Our experience with these tools indicated that their performance does not translate across domains (e.g., a tool design for large-scale graph analytics does not exhibit the same performance on program-analysis tasks, and vice versa). As a result, we designed and implemented a general-purpose Datalog engine, called RecStep, on top of a parallel single-node relational system. In this paper, we outline the different techniques we use in RecStep, and the contribution of each technique to overall performance. We also present results from a detailed set of experiments comparing RecStep with a number of other Datalog systems using both graph analytics and program-analysis tasks, summarizing pros and cons of existing techniques based on the analysis of our observations. We show that RecStep generally outperforms the state-of-the-art parallel Datalog engines on complex and large-scale Datalog program evaluation, by a 4-6X margin. An additional insight from our work is that we show that it is possible to build a high-performance Datalog system on top of a relational engine, an idea that has been dismissed in past work in this area.
DGCC protocol has been shown to achieve good performance on multi-core in-memory system. However, distributed transactions complicate the dependency resolution, and therefore, an effective transaction partitioning strategy is essential to reduce expe
The scaling laws of the achievable communication rates and the corresponding upper bounds of distributed reception in the presence of an interfering signal are investigated. The scheme includes one transmitter communicating to a remote destination vi
Arguing for the need to combine declarative and probabilistic programming, Barany et al. (TODS 2017) recently introduced a probabilistic extension of Datalog as a purely declarative probabilistic programming language. We revisit this language and pro
Recursive queries have been traditionally studied in the framework of datalog, a language that restricts recursion to monotone queries over sets, which is guaranteed to converge in polynomial time in the size of the input. But modern big data systems
Materialisation is often used in RDF systems as a preprocessing step to derive all facts implied by given RDF triples and rules. Although widely used, materialisation considers all possible rule applications and can use a lot of memory for storing th