No Arabic abstract
XML database query languages have been studied extensively, but XML database updates have received relatively little attention, and pose many challenges to language design. We are developing an XML update language called Flux, which stands for FunctionaL Updates for XML, drawing upon ideas from functional programming languages. In prior work, we have introduced a core language for Flux with a clear operational semantics and a sound, decidable static type system based on regular expression types. Our initial proposal had several limitations. First, it lacked support for recursive types or update procedures. Second, although a high-level source language can easily be translated to the core language, it is difficult to propagate meaningful type errors from the core language back to the source. Third, certain updates are well-formed yet contain path errors, or ``dead subexpressions which never do any useful work. It would be useful to detect path errors, since they often represent errors or optimization opportunities. In this paper, we address all three limitations. Specifically, we present an improved, sound type system that handles recursion. We also formalize a source update language and give a translation to the core language that preserves and reflects typability. We also develop a path-error analysis (a form of dead-code analysis) for updates.
This thesis describes the theoretical and practical foundations of a system for the static analysis of XML processing languages. The system relies on a fixpoint temporal logic with converse, derived from the mu-calculus, where models are finite trees. This calculus is expressive enough to capture regular tree types along with multi-directional navigation in trees, while having a single exponential time complexity. Specifically the decidability of the logic is proved in time 2^O(n) where n is the size of the input formula. Major XML concepts are linearly translated into the logic: XPath navigation and node selection semantics, and regular tree languages (which include DTDs and XML Schemas). Based on these embeddings, several problems of major importance in XML applications are reduced to satisfiability of the logic. These problems include XPath containment, emptiness, equivalence, overlap, coverage, in the presence or absence of regular tree type constraints, and the static type-checking of an annotated query. The focus is then given to a sound and complete algorithm for deciding the logic, along with a detailed complexity analysis, and crucial implementation techniques for building an effective solver. Practical experiments using a full implementation of the system are presented. The system appears to be efficient in practice for several realistic scenarios. The main application of this work is a new class of static analyzers for programming languages using both XPath expressions and XML type annotations (input and output). Such analyzers allow to ensure at compile-time valuable properties such as type-safety and optimizations, for safer and more efficient XML processing.
This document describes how to use the XML static analyzer in practice. It provides informal documentation for using the XML reasoning solver implementation. The solver allows automated verification of properties that are expressed as logical formulas over trees. A logical formula may for instance express structural constraints or navigation properties (like e.g. path existence and node selection) in finite trees. Logical formulas can be expressed using the syntax of XPath expressions, DTD, XML Schemas, and Relax NG definitions.
XML database query languages such as XQuery employ regular expression types with structural subtyping. Subtyping systems typically have two presentations, which should be equivalent: a declarative version in which the subsumption rule may be used anywhere, and an algorithmic version in which the use of subsumption is limited in order to make typechecking syntax-directed and decidable. However, the XQuery standard type system circumvents this issue by using imprecise typing rules for iteration constructs and defining only algorithmic typechecking, and another extant proposal provides more precise types for iteration constructs but ignores subtyping. In this paper, we consider a core XQuery-like language with a subsumption rule and prove the completeness of algorithmic typechecking; this is straightforward for XQuery proper but requires some care in the presence of more precise iteration typing disciplines. We extend this result to an XML update language we have introduced in earlier work.
In modern application areas for software systems --- like eHealth, the Internet-of-Things, and Edge Computing --- data is encoded in heterogeneous, tree-shaped data-formats, it must be processed in real-time, and it must be ephemeral, i.e., not persist in the system. While it is preferable to use a query language to express complex data-handling logic, their typical execution engine, a database external from the main application, is unfit in scenarios of ephemeral data-handling. A better option is represented by integrated query frameworks, which benefit from existing development support tools (e.g., syntax and type checkers) and execute within the application memory. In this paper, we propose one such framework that, for the first time, targets tree-shaped, document-oriented queries. We formalise an instantiation of MQuery, a sound variant of the widely-used MongoDB query language, which we implemented in the Jolie language. Jolie programs are microservices, the building blocks of modern software systems. Moreover, since Jolie supports native tree data-structures and automatic management of heterogeneous data-encodings, we can provide a uniform way to use MQuery on any data-format supported by the language. We present a non-trivial use case from eHealth, use it to concretely evaluate our model, and to illustrate our formalism.
During the life cycle of an XML application, both schemas and queries may change from one version to another. Schema evolutions may affect query results and potentially the validity of produced data. Nowadays, a challenge is to assess and accommodate the impact of theses changes in rapidly evolving XML applications. This article proposes a logical framework and tool for verifying forward/backward compatibility issues involving schemas and queries. First, it allows analyzing relations between schemas. Second, it allows XML designers to identify queries that must be reformulated in order to produce the expected results across successive sche