ترغب بنشر مسار تعليمي؟ اضغط هنا

Model Counting of Query Expressions: Limitations of Propositional Methods

189   0   0.0 ( 0 )
 نشر من قبل Sudeepa Roy
 تاريخ النشر 2013
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Query evaluation in tuple-independent probabilistic databases is the problem of computing the probability of an answer to a query given independent probabilities of the individual tuples in a database instance. There are two main approaches to this problem: (1) in `grounded inference one first obtains the lineage for the query and database instance as a Boolean formula, then performs weighted model counting on the lineage (i.e., computes the probability of the lineage given probabilities of its independent Boolean variables); (2) in methods known as `lifted inference or `extensional query evaluation, one exploits the high-level structure of the query as a first-order formula. Although it is widely believed that lifted inference is strictly more powerful than grounded inference on the lineage alone, no formal separation has previously been shown for query evaluation. In this paper we show such a formal separation for the first time. We exhibit a class of queries for which model counting can be done in polynomial time using extensional query evaluation, whereas the algorithms used in state-of-the-art exact model counters on their lineages provably require exponential time. Our lower bounds on the running times of these exact model counters follow from new exponential size lower bounds on the kinds of d-DNNF representations of the lineages that these model counters (either explicitly or implicitly) produce. Though some of these queries have been studied before, no non-trivial lower bounds on the sizes of these representations for these queries were previously known.



قيم البحث

اقرأ أيضاً

We study counting propositional logic as an extension of propositional logic with counting quantifiers. We prove that the complexity of the underlying decision problem perfectly matches the appropriate level of Wagners counting hierarchy, but also th at the resulting logic admits a satisfactory proof-theoretical treatment. From the latter, a type system for a probabilistic lambda-calculus is derived in the spirit of the Curry-Howard correspondence, showing the potential of counting propositional logic as a useful tool in several fields of theoretical computer science.
154 - Batya Kenig , Dan Suciu 2020
We study the $generalized~model~counting~problem$, defined as follows: given a database, and a set of deterministic tuples, count the number of subsets of the database that include all deterministic tuples and satisfy the query. This problem is compu tationally equivalent to the evaluation of the query over a tuple-independent probabilistic database where all tuples have probabilities in ${0,frac{1}{2},1}$. Previous work has established a dichotomy for Unions of Conjunctive Queries (UCQ) when the probabilities are arbitrary rational numbers, showing that, for each query, its complexity is either in polynomial time or #P-hard. The query is called $safe$ in the first case, and $unsafe$ in the second case. Here, we strengthen the hardness proof, by proving that an unsafe UCQ query remains #P-hard even if the probabilities are restricted to ${0,frac{1}{2},1}$. This requires a complete redesign of the hardness proof, using new techniques. A related problem is the $model~counting~problem$, which asks for the probability of the query when the input probabilities are restricted to ${0,frac{1}{2}}$. While our result does not extend to model counting for all unsafe UCQs, we prove that model counting is #P-hard for a class of unsafe queries called Type-I forbidden queries.
We investigate the computational complexity of minimizing the source side-effect in order to remove a given number of tuples from the output of a conjunctive query. In particular, given a multi-relational database $D$, a conjunctive query $Q$, and a positive integer $k$ as input, the goal is to find a minimum subset of input tuples to remove from D that would eliminate at least $k$ output tuples from $Q(D)$. This problem generalizes the well-studied deletion propagation problem in databases. In addition, it encapsulates the notion of intervention for aggregate queries used in data analysis with applications to explaining interesting observations on the output. We show a dichotomy in the complexity of this problem for the class of full conjunctive queries without self-joins by giving a characterization on the structure of $Q$ that makes the problem either polynomial-time solvable or NP-hard. Our proof of this dichotomy result already gives an exact algorithm in the easy cases; we complement this by giving an approximation algorithm for the hard cases of the problem.
We investigate the computational complexity of minimizing the source side-effect in order to remove a given number of tuples from the output of a conjunctive query. This is a variant of the well-studied {em deletion propagation} problem, the differen ce being that we are interested in removing the smallest subset of input tuples to remove a given number of output tuples} while deletion propagation focuses on removing a specific output tuple. We call this the {em Aggregated Deletion Propagation} problem. We completely characterize the poly-time solvability of this problem for arbitrary conjunctive queries without self-joins. This includes a poly-time algorithm to decide solvability, as well as an exact structural characterization of NP-hard instances. We also provide a practical algorithm for this problem (a heuristic for NP-hard instances) and evaluate its experimental performance on real and synthetic datasets.
Arising user-centric graph applications such as route planning and personalized social network analysis have initiated a shift of paradigms in modern graph processing systems towards multi-query analysis, i.e., processing multiple graph queries in pa rallel on a shared graph. These applications generate a dynamic number of localized queries around query hotspots such as popular urban areas. However, existing graph processing systems are not yet tailored towards these properties: The employed methods for graph partitioning and synchronization management disregard query locality and dynamism which leads to high query latency. To this end, we propose the system Q-Graph for multi-query graph analysis that considers query locality on three levels. (i) The query-aware graph partitioning algorithm Q-cut maximizes query locality to reduce communication overhead. (ii) The method for synchronization management, called hybrid barrier synchronization, allows for full exploitation of local queries spanning only a subset of partitions. (iii) Both methods adapt at runtime to changing query workloads in order to maintain and exploit locality. Our experiments show that Q-cut reduces average query latency by up to 57 percent compared to static query-agnostic partitioning algorithms.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا