No Arabic abstract
In the last decade, substantial progress has been made towards standardizing the syntax of graph query languages, and towards understanding their semantics and complexity of evaluation. In this paper, we consider temporal property graphs (TPGs) and propose temporal regular path queries (TRPQ) that incorporate time into TPGs navigation. Starting with design principles, we propose a natural syntactic extension of the MATCH clause of popular graph query languages. We then formally present the semantics of TRPQs, and study the complexity of their evaluation. We show that TRPQs can be evaluated in polynomial time if TPGs are time-stamped with time points. We also identify fragments of the TRPQ language that admit efficient evaluation over a more succinct interval-annotated representation. Our work on the syntax, and the positive complexity results, pave the way to implementations of TRPQs that are both usable and practical.
A temporal graph is a graph in which vertices communicate with each other at specific time, e.g., $A$ calls $B$ at 11 a.m. and talks for 7 minutes, which is modeled by an edge from $A$ to $B$ with starting time 11 a.m. and duration 7 mins. Temporal graphs can be used to model many networks with time-related activities, but efficient algorithms for analyzing temporal graphs are severely inadequate. We study fundamental problems such as answering reachability and time-based path queries in a temporal graph, and propose an efficient indexing technique specifically designed for processing these queries in a temporal graph. Our results show that our method is efficient and scalable in both index construction and query processing.
We study here fundamental issues involved in top-k query evaluation in probabilistic databases. We consider simple probabilistic databases in which probabilities are associated with individual tuples, and general probabilistic databases in which, additionally, exclusivity relationships between tuples can be represented. In contrast to other recent research in this area, we do not limit ourselves to injective scoring functions. We formulate three intuitive postulates that the semantics of top-k queries in probabilistic databases should satisfy, and introduce a new semantics, Global-Topk, that satisfies those postulates to a large degree. We also show how to evaluate queries under the Global-Topk semantics. For simple databases we design dynamic-programming based algorithms, and for general databases we show polynomial-time reductions to the simple cases. For example, we demonstrate that for a fixed k the time complexity of top-k query evaluation is as low as linear, under the assumption that probabilistic databases are simple and scoring functions are injective.
Structural indexing is an approach to accelerating query evaluation, whereby data objects are partitioned and indexed reflecting the precise expressive power of a given query language. Each partition block of the index holds exactly those objects that are indistinguishable with respect to queries expressible in the language. Structural indexes have proven successful for XML, RDF, and relational data management. In this paper we study structural indexing for conjunctive path queries (CPQ). CPQ forms the core of contemporary graph query languages such as SPARQL, Cypher, PGQL, and G-CORE. CPQ plays the same fundamental role with respect to contemporary graph query languages as the classic conjunctive queries play for SQL. We develop the first practical structural indexes for this important query language. In particular, we propose a structural index based on k-path-bisimulation, tightly coupled to the expressive power of CPQ, and develop algorithms for efficient query processing with our index. Furthermore, we study workload-aware structural indexes to reduce both the construction and space costs according to a given workload. We demonstrate through extensive experiments using real and synthetic graphs that our methods accelerate query processing by up to multiple orders of magnitude over the state-of-the-art methods, without increasing index size.
We give solutions to two fundamental computational problems in ontology-based data access with the W3C standard ontology language OWL 2 QL: the succinctness problem for first-order rewritings of ontology-mediated queries (OMQs), and the complexity problem for OMQ answering. We classify OMQs according to the shape of their conjunctive queries (treewidth, the number of leaves) and the existential depth of their ontologies. For each of these classes, we determine the combined complexity of OMQ answering, and whether all OMQs in the class have polynomial-size first-order, positive existential, and nonrecursive datalog rewritings. We obtain the succinctness results using hypergraph programs, a new computational model for Boolean functions, which makes it possible to connect the size of OMQ rewritings and circuit complexity.
Two-way regular path queries (2RPQs) have received increased attention recently due to their ability to relate pairs of objects by flexibly navigating graph-structured data. They are present in property paths in SPARQL 1.1, the new standard RDF query language, and in the XML query language XPath. In line with XPath, we consider the extension of 2RPQs with nesting, which allows one to require that objects along a path satisfy complex conditions, in turn expressed through (nested) 2RPQs. We study the computational complexity of answering nested 2RPQs and conjunctions thereof (CN2RPQs) in the presence of domain knowledge expressed in description logics (DLs). We establish tight complexity bounds in data and combined complexity for a variety of DLs, ranging from lightweight DLs (DL-Lite, EL) up to highly expressive ones. Interestingly, we are able to show that adding nesting to (C)2RPQs does not affect worst-case data complexity of query answering for any of the considered DLs. However, in the case of lightweight DLs, adding nesting to 2RPQs leads to a surprising jump in combined complexity, from P-complete to Exp-complete.