No Arabic abstract
Constraint Handling Rules (CHR) is a committed-choice declarative language which has been originally designed for writing constraint solvers and which is nowadays a general purpose language. CHR programs consist of multi-headed guarded rules which allow to rewrite constraints into simpler ones until a solved form is reached. Many empirical evidences suggest that multiple heads augment the expressive power of the language, however no formal result in this direction has been proved, so far. In the first part of this paper we analyze the Turing completeness of CHR with respect to the underneath constraint theory. We prove that if the constraint theory is powerful enough then restricting to single head rules does not affect the Turing completeness of the language. On the other hand, differently from the case of the multi-headed language, the single head CHR language is not Turing powerful when the underlying signature (for the constraint theory) does not contain function symbols. In the second part we prove that, no matter which constraint theory is considered, under some reasonable assumptions it is not possible to encode the CHR language (with multi-headed rules) into a single headed language while preserving the semantics of the programs. We also show that, under some stronger assumptions, considering an increasing number of atoms in the head of a rule augments the expressive power of the language. These results provide a formal proof for the claim that multiple heads augment the expressive power of the CHR language.
This article shows that there exist two particular linear orders such that first-order logic with these two linear orders has the same expressive power as first-order logic with the Bit-predicate FO(Bit). As a corollary we obtain that there also exists a built-in permutation such that first-order logic with a linear order and this permutation is as expressive as FO(Bit).
We investigate the expressive power of the two main kinds of program logics for complex, non-regular program properties found in the literature: those extending propositional dynamic logic (PDL), and those extending the modal mu-calculus. This is inspired by the recent discovery of a decidable program logic called Visibly Pushdown Fixpoint Logic with Chop which extends both the modal mu-calculus and PDL over visibly pushdown languages, which, so far, constituted the ends of two pillars of decidable program logics. Here we show that this logic is not only more expressive than either of its two fragments, but in fact even more expressive than their union. Hence, the decidability border amongst program logics has been properly pushed up. We complete the picture by providing results separating all the PDL-based and modal fixpoint logics with regular, visibly pushdown and arbitrary context-free constructions.
Data streams occur widely in various real world applications. The research on streaming data mainly focuses on the data management, query evaluation and optimization on these data, however the work on reasoning procedures for streaming knowledge bases on both the assertional and terminological levels is very limited. Typically reasoning services on large knowledge bases are very expensive, and need to be applied continuously when the data is received as a stream. Hence new techniques for optimizing this continuous process is needed for developing efficient reasoners on streaming data. In this paper, we survey the related research on reasoning on expressive logics that can be applied to this setting, and point to further research directions in this area.
We study the decidability of termination for two CHR dialects which, similarly to the Datalog like languages, are defined by using a signature which does not allow function symbols (of arity >0). Both languages allow the use of the = built-in in the body of rules, thus are built on a host language that supports unification. However each imposes one further restriction. The first CHR dialect allows only range-restricted rules, that is, it does not allow the use of variables in the body or in the guard of a rule if they do not appear in the head. We show that the existence of an infinite computation is decidable for this dialect. The second dialect instead limits the number of atoms in the head of rules to one. We prove that in this case, the existence of a terminating computation is decidable. These results show that both dialects are strictly less expressive than Turing Machines. It is worth noting that the language (without function symbols) without these restrictions is as expressive as Turing Machines.
Transformer networks are able to capture patterns in data coming from many domains (text, images, videos, proteins, etc.) with little or no change to architecture components. We perform a theoretical analysis of the core component responsible for signal propagation between elements, i.e. the self-attention matrix. In practice, this matrix typically exhibits two properties: (1) it is sparse, meaning that each token only attends to a small subset of other tokens; and (2) it changes dynamically depending on the input to the module. With these considerations in mind, we ask the following question: Can a fixed self-attention module approximate arbitrary sparse patterns depending on the input? How small is the hidden size $d$ required for such approximation? We make progress in answering this question and show that the self-attention matrix can provably approximate sparse matrices, where sparsity is in terms of a bounded number of nonzero elements in each row and column. While the parameters of self-attention are fixed, various sparse matrices can be approximated by only modifying the inputs. Our proof is based on the random projection technique and uses the seminal Johnson-Lindenstrauss lemma. Our proof is constructive, enabling us to propose an algorithm for finding adaptive inputs and fixed self-attention parameters in order to approximate a given matrix. In particular, we show that, in order to approximate any sparse matrix up to a given precision defined in terms of preserving matrix element ratios, $d$ grows only logarithmically with the sequence length $L$ (i.e. $d = O(log L)$).