No Arabic abstract
Since the proof of the four color theorem in 1976, computer-generated proofs have become a reality in mathematics and computer science. During the last decade, we have seen formal proofs using verified proof assistants being used to verify the validity of such proofs. In this paper, we describe a formalized theory of size-optimal sorting networks. From this formalization we extract a certified checker that successfully verifies computer-generated proofs of optimality on up to 8 inputs. The checker relies on an untrusted oracle to shortcut the search for witnesses on more than 1.6 million NP-complete subproblems.
In recent work, we formalized the theory of optimal-size sorting networks with the goal of extracting a verified checker for the large-scale computer-generated proof that 25 comparisons are optimal when sorting 9 inputs, which required more than a decade of CPU time and produced 27 GB of proof witnesses. The checker uses an untrusted oracle based on these witnesses and is able to verify the smaller case of 8 inputs within a couple of days, but it did not scale to the full proof for 9 inputs. In this paper, we describe several non-trivial optimizations of the algorithm in the checker, obtained by appropriately changing the formalization and capitalizing on the symbiosis with an adequate implementation of the oracle. We provide experimental evidence of orders of magnitude improvements to both runtime and memory footprint for 8 inputs, and actually manage to check the full proof for 9 inputs.
This extended abstract reports on current progress of SMTCoq, a communication tool between the Coq proof assistant and external SAT and SMT solvers. Based on a checker for generic first-order certificates implemented and proved correct in Coq, SMTCoq offers facilities both to check external SAT and SMT answers and to improve Coqs automation using such solvers, in a safe way. Currently supporting the SAT solver zChaff, and the SMT solver veriT for the combination of the theories of congruence closure and linear integer arithmetic, SMTCoq is meant to be extendable with a reasonable amount of effort: we present work in progress to support the SMT solver CVC4 and the theory of bit vectors.
Using multisets, we develop novel techniques for mechanizing the proofs of the synthesis conjectures for list-sorting algorithms, and we demonstrate them in the Theorema system. We use the classical principle of extracting the algorithm as a set of rewrite rules based on the witnesses found in the proof of the synthesis conjecture produced from the specification of the desired function (input and output conditions). The proofs are in natural style, using standard rules, but most importantly domain specific inference rules and strategies. In particular the use of multisets allows us to develop powerful strategies for the synthesis of arbitrarily structured recursive algorithms by general Noetherian induction, as well as for the automatic generation of the specifications of all necessary auxiliary functions (insert, merge, split), whose synthesis is performed using the same method.
Fault-tolerant distributed systems move the trust in a single party to a majority of parties participating in the protocol. This makes blockchain based crypto-currencies possible: they allow parties to agree on a total order of transactions without a trusted third party. To trust a distributed system, the security of the protocol and the correctness of the implementation must be indisputable. We present the first machine checked proof that guarantees both safety and liveness for a consensus algorithm. We verify a Proof of Stake (PoS) Nakamoto-style blockchain (NSB) protocol, using the foundational proof assistant Coq. In particular, we consider a PoS NSB in a synchronous network with a static set of corrupted parties. We define execution semantics for this setting and prove chain growth, chain quality, and common prefix which together imply both safety and liveness.
Ensemble trees are a popular machine learning model which often yields high prediction performance when analysing structured data. Although individual small decision trees are deemed explainable by nature, an ensemble of large trees is often difficult to understand. In this work, we propose an approach called optimised explanation (OptExplain) that faithfully extracts global explanations of ensemble trees using a combination of logical reasoning, sampling and optimisation. Building on top of this, we propose a method called the profile of equivalent classes (ProClass), which uses MAX-SAT to simplify the explanation even further. Our experimental study on several datasets shows that our approach can provide high-quality explanations to large ensemble trees models, and it betters recent top-performers.