ترغب بنشر مسار تعليمي؟ اضغط هنا

Aggregated Deletion Propagation for Counting Conjunctive Query Answers

256   0   0.0 ( 0 )
 نشر من قبل Xiao Hu
 تاريخ النشر 2020
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

We investigate the computational complexity of minimizing the source side-effect in order to remove a given number of tuples from the output of a conjunctive query. This is a variant of the well-studied {em deletion propagation} problem, the difference being that we are interested in removing the smallest subset of input tuples to remove a given number of output tuples} while deletion propagation focuses on removing a specific output tuple. We call this the {em Aggregated Deletion Propagation} problem. We completely characterize the poly-time solvability of this problem for arbitrary conjunctive queries without self-joins. This includes a poly-time algorithm to decide solvability, as well as an exact structural characterization of NP-hard instances. We also provide a practical algorithm for this problem (a heuristic for NP-hard instances) and evaluate its experimental performance on real and synthetic datasets.



قيم البحث

اقرأ أيضاً

We investigate the computational complexity of minimizing the source side-effect in order to remove a given number of tuples from the output of a conjunctive query. In particular, given a multi-relational database $D$, a conjunctive query $Q$, and a positive integer $k$ as input, the goal is to find a minimum subset of input tuples to remove from D that would eliminate at least $k$ output tuples from $Q(D)$. This problem generalizes the well-studied deletion propagation problem in databases. In addition, it encapsulates the notion of intervention for aggregate queries used in data analysis with applications to explaining interesting observations on the output. We show a dichotomy in the complexity of this problem for the class of full conjunctive queries without self-joins by giving a characterization on the structure of $Q$ that makes the problem either polynomial-time solvable or NP-hard. Our proof of this dichotomy result already gives an exact algorithm in the easy cases; we complement this by giving an approximation algorithm for the hard cases of the problem.
We consider here the problem of obtaining reliable, consistent information from inconsistent databases -- databases that do not have to satisfy given integrity constraints. We use the notion of consistent query answer -- a query answer which is true in every (minimal) repair of the database. We provide a complete classification of the computational complexity of consistent answers to first-order queries w.r.t. functional dependencies and denial constraints. We show how the complexity depends on the {em type} of the constraints considered, their {em number}, and the {em size} of the query. We obtain several new PTIME cases, using new algorithms.
Query containment and query answering are two important computational tasks in databases. While query answering amounts to compute the result of a query over a database, query containment is the problem of checking whether for every database, the res ult of one query is a subset of the result of another query. In this paper, we deal with unions of conjunctive queries, and we address query containment and query answering under Description Logic constraints. Every such constraint is essentially an inclusion dependencies between concepts and relations, and their expressive power is due to the possibility of using complex expressions, e.g., intersection and difference of relations, special forms of quantification, regular expressions over binary relations, in the specification of the dependencies. These types of constraints capture a great variety of data models, including the relational, the entity-relationship, and the object-oriented model, all extended with various forms of constraints, and also the basic features of the ontology languages used in the context of the Semantic Web. We present the following results on both query containment and query answering. We provide a method for query containment under Description Logic constraints, thus showing that the problem is decidable, and analyze its computational complexity. We prove that query containment is undecidable in the case where we allow inequalities in the right-hand side query, even for very simple constraints and queries. We show that query answering under Description Logic constraints can be reduced to query containment, and illustrate how such a reduction provides upper bound results with respect to both combined and data complexity.
The framework of consistent query answers and repairs has been introduced to alleviate the impact of inconsistent data on the answers to a query. A repair is a minimally different consistent instance and an answer is consistent if it is present in ev ery repair. In this article we study the complexity of consistent query answers and repair checking in the presence of universal constraints. We propose an extended version of the conflict hypergraph which allows to capture all repairs w.r.t. a set of universal constraints. We show that repair checking is in PTIME for the class of full tuple-generating dependencies and denial constraints, and we present a polynomial repair algorithm. This algorithm is sound, i.e. always produces a repair, but also complete, i.e. every repair can be constructed. Next, we present a polynomial-time algorithm computing consistent answers to ground quantifier-free queries in the presence of denial constraints, join dependencies, and acyclic full-tuple generating dependencies. Finally, we show that extending the class of constraints leads to intractability. For arbitrary full tuple-generating dependencies consistent query answering becomes coNP-complete. For arbitrary universal constraints consistent query answering is Pi_2^p-complete and repair checking coNP-complete.
154 - Batya Kenig , Dan Suciu 2020
We study the $generalized~model~counting~problem$, defined as follows: given a database, and a set of deterministic tuples, count the number of subsets of the database that include all deterministic tuples and satisfy the query. This problem is compu tationally equivalent to the evaluation of the query over a tuple-independent probabilistic database where all tuples have probabilities in ${0,frac{1}{2},1}$. Previous work has established a dichotomy for Unions of Conjunctive Queries (UCQ) when the probabilities are arbitrary rational numbers, showing that, for each query, its complexity is either in polynomial time or #P-hard. The query is called $safe$ in the first case, and $unsafe$ in the second case. Here, we strengthen the hardness proof, by proving that an unsafe UCQ query remains #P-hard even if the probabilities are restricted to ${0,frac{1}{2},1}$. This requires a complete redesign of the hardness proof, using new techniques. A related problem is the $model~counting~problem$, which asks for the probability of the query when the input probabilities are restricted to ${0,frac{1}{2}}$. While our result does not extend to model counting for all unsafe UCQs, we prove that model counting is #P-hard for a class of unsafe queries called Type-I forbidden queries.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا