No Arabic abstract
Over the last two decades, we have seen a dramatic improvement in the efficiency of conflict-driven clause-learning Boolean satisfiability (CDCL SAT) solvers on industrial problems from a variety of domains. The availability of such powerful general-purpose search tools as SAT solvers has led many researchers to propose SAT-based methods for cryptanalysis, including techniques for finding collisions in hash functions and breaking symmetric encryption schemes. Most of the previously proposed SAT-based cryptanalysis approaches are blackbox techniques, in the sense that the cryptanalysis problem is encoded as a SAT instance and then a CDCL SAT solver is invoked to solve the said instance. A weakness of this approach is that the encoding thus generated may be too large for any modern solver to solve efficiently. Perhaps a more important weakness of this approach is that the solver is in no way specialized or tuned to solve the given instance. To address these issues, we propose an approach called CDCL(Crypto) (inspired by the CDCL(T) paradigm in Satisfiability Modulo Theory solvers) to tailor the internal subroutines of the CDCL SAT solver with domain-specific knowledge about cryptographic primitives. Specifically, we extend the propagation and conflict analysis subroutines of CDCL solvers with specialized codes that have knowledge about the cryptographic primitive being analyzed by the solver. We demonstrate the power of this approach in the differential path and algebraic fault analysis of hash functions. Our initial results are very encouraging and reinforce the notion that this approach is a significant improvement over blackbox SAT-based cryptanalysis.
Current implementations of pseudo-Boolean (PB) solvers working on native PB constraints are based on the CDCL architecture which empowers highly efficient modern SAT solvers. In particular, such PB solvers not only implement a (cutting-planes-based) conflict analysis procedure, but also complementary strategies for components that are crucial for the efficiency of CDCL, namely branching heuristics, learned constraint deletion and restarts. However, these strategies are mostly reused by PB solvers without considering the particular form of the PB constraints they deal with. In this paper, we present and evaluate different ways of adapting CDCL strategies to take the specificities of PB constraints into account while preserving the behavior they have in the clausal setting. We implemented these strategies in two different solvers, namely Sat4j (for which we consider three configurations) and RoundingSat. Our experiments show that these dedicated strategies allow to improve, sometimes significantly, the performance of these solvers, both on decision and optimization problems.
How crypto flows among Bitcoin users is an important question for understanding the structure and dynamics of the cryptoasset at a global scale. We compiled all the blockchain data of Bitcoin from its genesis to the year 2020, identified users from anonymous addresses of wallets, and constructed monthly snapshots of networks by focusing on regular users as big players. We apply the methods of bow-tie structure and Hodge decomposition in order to locate the users in the upstream, downstream, and core of the entire crypto flow. Additionally, we reveal principal components hidden in the flow by using non-negative matrix factorization, which we interpret as a probabilistic model. We show that the model is equivalent to a probabilistic latent semantic analysis in natural language processing, enabling us to estimate the number of such hidden components. Moreover, we find that the bow-tie structure and the principal components are quite stable among those big players. This study can be a solid basis on which one can further investigate the temporal change of crypto flow, entry and exit of big players, and so forth.
Symmetry breaking is a popular technique to reduce the search space for SAT solving by exploiting the underlying symmetry over variables and clauses in a formula. The key idea is to first identify sets of assignments which fall in the same symmetry class, and then impose ordering constraints, called Symmetry Breaking Predicates (SBPs), such that only one (or a small subset) of these assignments is allowed to be a solution of the original SAT formula. While this technique has been exploited extensively in the SAT literature, there is little work on using symmetry breaking for SAT Modulo Theories (SMT). In SMT, logical constraints in SAT theories are combined with another set of theory operations defined over non-Boolean variables such as integers, reals, etc. SMT solvers typically use a combination of SAT solving techniques augmented with calls to the theory solver. In this work, we take up the advances in SAT symmetry breaking and apply them to the domain of SMT. Our key technical contribution is the formulation of symmetry breaking over the Boolean skeleton variables, which are placeholders for actual theory operations in SMT solving. These SBPs are then applied over the SAT solving part of the SMT solver. We implement our SBP ideas on top of CVC4, which is a state-of-the-art SMT solver. Our approach can result in significantly faster solutions on several benchmark problems compared to the state-of-the-art. Our final solver is a hybrid of the original CVC4 solver, and an SBP based solver, and can solve up to 3.8% and 3.1% more problems in the QF_NIA category of 2018 and 2019 SMT benchmarks, respectively, compared to CVC4, the top performer in this category.
CNF-based SAT and MaxSAT solvers are central to logic synthesis and verification systems. The increasing popularity of these constraint problems in electronic design automation encourages studies on different SAT problems and their properties for further computational efficiency. There has been both theoretical and practical success of modern Conflict-driven clause learning SAT solvers, which allows solving very large industrial instances in a relatively short amount of time. Recently, machine learning approaches provide a new dimension to solving this challenging problem. Neural symbolic models could serve as generic solvers that can be specialized for specific domains based on data without any changes to the structure of the model. In this work, we propose a one-shot model derived from the Transformer architecture to solve the MaxSAT problem, which is the optimization version of SAT where the goal is to satisfy the maximum number of clauses. Our model has a scale-free structure which could process varying size of instances. We use meta-path and self-attention mechanism to capture interactions among homogeneous nodes. We adopt cross-attention mechanisms on the bipartite graph to capture interactions among heterogeneous nodes. We further apply an iterative algorithm to our model to satisfy additional clauses, enabling a solution approaching that of an exact-SAT problem. The attention mechanisms leverage the parallelism for speedup. Our evaluation indicates improved speedup compared to heuristic approaches and improved completion rate compared to machine learning approaches.
Background: Previous studies have shown that up to 99.59 % of the Java apps using crypto APIs misuse the API at least once. However, these studies have been conducted on Java and C, while empirical studies for other languages are missing. For example, a controlled user study with crypto tasks in Python has shown that 68.5 % of the professional developers write a secure solution for a crypto task. Aims: To understand if this observation holds for real-world code, we conducted a study of crypto misuses in Python. Method: We developed a static analysis tool that covers common misuses of 5 different Python crypto APIs. With this analysis, we analyzed 895 popular Python projects from GitHub and 51 MicroPython projects for embedded devices. Further, we compared our results with the findings of previous studies. Results: Our analysis reveals that 52.26 % of the Python projects have at least one misuse. Further, some Python crypto libraries API design helps developers from misusing crypto functions, which were much more common in studies conducted with Java and C code. Conclusion: We conclude that we can see a positive impact of the good API design on crypto misuses for Python applications. Further, our analysis of MicroPython projects reveals the importance of hybrid analyses.