No Arabic abstract
There is increasing interest in applying verification tools to programs that have bitvector operations (eg., binaries). SMT solvers, which serve as a foundation for these tools, have thus increased support for bitvector reasoning through bit-blasting and linear arithmetic approximations. In this paper we show that similar linear arithmetic approximation of bitvector operations can be done at the source level through transformations. Specifically, we introduce new paths that over-approximate bitvector operations with linear conditions/constraints, increasing branching but allowing us to better exploit the well-developed integer reasoning and interpolation of verification tools. We show that, for reachability of bitvector programs, increased branching incurs negligible overhead yet, when combined with integer interpolation optimizations, enables more programs to be verified. We further show this exploitation of integer interpolation in the common case also enables competitive termination verification of bitvector programs and leads to the first effective technique for LTL verification of bitvector programs. Finally, we provide an in-depth case study of decompiled (lifted) binary programs, which emulate X86 execution through frequent use of bitvector operations. We present a new tool DarkSea, the first tool capable of verifying reachability, termination, and LTL of lifted binaries.
The extension of classical imperative programs with real-valued random variables and random branching gives rise to probabilistic programs. The termination problem is one of the most fundamental liveness properties for such programs. The qualitative (aka almost-sure) termination problem asks whether a given program terminates with probability 1. Ranking functions provide a sound and complete approach for termination of non-probabilistic programs, and their extension to probabilistic programs is achieved via ranking supermartingales (RSMs). RSMs have been extended to lexicographic RSMs to handle programs with involved control-flow structure, as well as for compositional approach. There are two key limitations of the existing RSM-based approaches: First, the lexicographic RSM-based approach requires a strong nonnegativity assumption, which need not always be satisfied. The second key limitation of the existing RSM-based algorithmic approaches is that they rely on pre-computed invariants. The main drawback of relying on pre-computed invariants is the insufficiency-inefficiency trade-off: weak invariants might be insufficient for RSMs to prove termination, while using strong invariants leads to inefficiency in computing them. Our contributions are twofold: First, we show how to relax the strong nonnegativity condition and still provide soundness guarantee for almost-sure termination. Second, we present an incremental approach where the process of computing lexicographic RSMs proceeds by iterative pruning of parts of the program that were already shown to be terminating, in cooperation with a safety prover. In particular, our technique does not rely on strong pre-computed invariants. We present experimental results to show the applicability of our approach to examples of probabilistic programs from the literature.
The notion of program sensitivity (aka Lipschitz continuity) specifies that changes in the program input result in proportional changes to the program output. For probabilistic programs the notion is naturally extended to expected sensitivity. A previous approach develops a relational program logic framework for proving expected sensitivity of probabilistic while loops, where the number of iterations is fixed and bounded. In this work, we consider probabilistic while loops where the number of iterations is not fixed, but randomized and depends on the initial input values. We present a sound approach for proving expected sensitivity of such programs. Our sound approach is martingale-based and can be automated through existing martingale-synthesis algorithms. Furthermore, our approach is compositional for sequential composition of while loops under a mild side condition. We demonstrate the effectiveness of our approach on several classical examples from Gamblers Ruin, stochastic hybrid systems and stochastic gradient descent. We also present experimental results showing that our automated approach can handle various probabilistic programs in the literature.
Reactive synthesis is a key technique for the design of correct-by-construction systems and has been thoroughly investigated in the last decades. It consists in the synthesis of a controller that reacts to environments inputs satisfying a given temporal logic specification. Common approaches are based on the explicit construction of automata and on their determinization, which limit their scalability. In this paper, we introduce a new fragment of Linear Temporal Logic, called Extended Bounded Response LTL (LTLEBR), that allows one to combine bounded and universal unbounded temporal operators (thus covering a large set of practical cases), and we show that reactive synthesis from LTLEBR specifications can be reduced to solving a safety game over a deterministic symbolic automaton built directly from the specification. We prove the correctness of the proposed approach and we successfully evaluate it on various benchmarks.
Semantic understanding of programs is a fundamental problem for programming language processing (PLP). Recent works that learn representations of code based on pre-training techniques in NLP have pushed the frontiers in this direction. However, the semantics of PL and NL have essential differences. These being ignored, we believe it is difficult to build a model to better understand programs, by either directly applying off-the-shelf NLP pre-training techniques to the source code, or adding features to the model by the heuristic. In fact, the semantics of a program can be rigorously defined by formal semantics in PL theory. For example, the operational semantics, describes the meaning of a valid program as updating the environment (i.e., the memory address-value function) through fundamental operations, such as memory I/O and conditional branching. Inspired by this, we propose a novel program semantics learning paradigm, that the model should learn from information composed of (1) the representations which align well with the fundamental operations in operational semantics, and (2) the information of environment transition, which is indispensable for program understanding. To validate our proposal, we present a hierarchical Transformer-based pre-training model called OSCAR to better facilitate the understanding of programs. OSCAR learns from intermediate representation (IR) and an encoded representation derived from static analysis, which are used for representing the fundamental operations and approximating the environment transitions respectively. OSCAR empirically shows the outstanding capability of program semantics understanding on many practical software engineering tasks.
Quantum computation is a topic of significant recent interest, with practical advances coming from both research and industry. A major challenge in quantum programming is dealing with errors (quantum noise) during execution. Because quantum resources (e.g., qubits) are scarce, classical error correction techniques applied at the level of the architecture are currently cost-prohibitive. But while this reality means that quantum programs are almost certain to have errors, there as yet exists no principled means to reason about erroneous behavior. This paper attempts to fill this gap by developing a semantics for erroneous quantum while-programs, as well as a logic for reasoning about them. This logic permits proving a property we have identified, called $epsilon$-robustness, which characterizes possible distance between an ideal program and an erroneous one. We have proved the logic sound, and showed its utility on several case studies, notably: (1) analyzing the robustness of noi