So You Want to Analyze Scheme Programs With Datalog?

90 0 0.0 ( 0 )

Download Cite

Added by Yihao Sun

Publication date 2021

fields Informatics Engineering

and research's language is English

Authors Davis Ross Silverman - Yihao Sun - Kristopher Micinski

Programming Languages

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Static analysis approximates the results of a program by examining only its syntax. For example, control-flow analysis (CFA) determines which syntactic lambdas (for functional languages) or (for object-oriented) methods may be invoked at each call site within a program. Rich theoretical results exist studying control flow analysis for Scheme-like languages, but implementations are often complex and specialized. By contrast, object-oriented languages (Java in particular) enjoy high-precision control-flow analyses that scale to thousands (or more) of lines of code. State-of-the-art implementations (such as DOOP on Souffle) structure the analysis using Horn-SAT (Datalog) to enable compilation of the analysis to efficient implementations such as high-performance relational algebra kernels. In this paper, we present an implementation of control-flow analysis for a significant subset of Scheme (including set!, call/cc, and primitive operations) using the Souffle Datalog engine. We present an evaluation on a worst-case term demonstrating the polynomial complexity of our m-CFA and remark upon scalability results using Souffle.

rate research

Variability-aware Datalog

364 - Ramy Shahin , Marsha Chechik 2019

Variability-aware computing is the efficient application of programs to different sets of inputs that exhibit some variability. One example is program analyses applied to Software Product Lines (SPLs). In this paper we present the design and development of a variability-aware version of the Souffl{e} Datalog engine. The engine can take facts annotated with Presence Conditions (PCs) as input, and compute the PCs of its inferred facts, eliminating facts that do not exist in any valid configuration. We evaluate our variability-aware Souffl{e} implementation on several fact sets annotated with PCs to measure the associated overhead in terms of processing time and database size.

Programming Languages Databases Software Engineering

Magic Sets for Disjunctive Datalog Programs

408 - Mario Alviano , Wolfgang Faber , Gianluigi Greco 2012

In this paper, a new technique for the optimization of (partially) bound queries over disjunctive Datalog programs with stratified negation is presented. The technique exploits the propagation of query bindings and extends the Magic Set (MS) optimization technique. An important feature of disjunctive Datalog is nonmonotonicity, which calls for nondeterministic implementations, such as backtracking search. A distinguishing characteristic of the new method is that the optimization can be exploited also during the nondeterministic phase. In particular, after some assumptions have been made during the computation, parts of the program may become irrelevant to a query under these assumptions. This allows for dynamic pruning of the search space. In contrast, the effect of the previously defined MS methods for disjunctive Datalog is limited to the deterministic portion of the process. In this way, the potential performance gain by using the proposed method can be exponential, as could be observed empirically. The correctness of MS is established thanks to a strong relationship between MS and unfounded sets that has not been studied in the literature before. This knowledge allows for extending the method also to programs with stratified negation in a natural way. The proposed method has been implemented in DLV and various experiments have been conducted. Experimental results on synthetic data confirm the utility of MS for disjunctive Datalog, and they highlight the computational gain that may be obtained by the new method w.r.t. the previously proposed MS methods for disjunctive Datalog programs. Further experiments on real-world data show the benefits of MS within an application scenario that has received considerable attention in recent years, the problem of answering user queries over possibly inconsistent databases originating from integration of autonomous sources of information.

Artificial Intelligence Logic in Computer Science

Extending programs with debug-related features, with application to hardware development

144 - Nik Sultana , Salvator Galea , David Greaves 2017

The capacity and programmability of reconfigurable hardware such as FPGAs has improved steadily over the years, but they do not readily provide any mechanisms for monitoring or debugging running programs. Such mechanisms need to be written into the program itself. This is done using ad hoc methods and primitive tools when compared to CPU programming. This complicates the programming and debugging of reconfigurable hardware. We introduce Program-hosted Directability (PhD), the extension of programs to interpret direction commands at runtime to enable debugging, monitoring and profiling. Normally in hardware development such features are fixed at compile time. We present a language of directing commands, specify its semantics in terms of a simple controller that is embedded with programs, and implement a prototype for directing network programs running in hardware. We show that this approach affords significant flexibility with low impact on hardware utilisation and performance.

Programming Languages

Remember What You Want to Forget: Algorithms for Machine Unlearning

339 - Ayush Sekhari , Jayadev Acharya , Gautam Kamath 2021

We study the problem of unlearning datapoints from a learnt model. The learner first receives a dataset $S$ drawn i.i.d. from an unknown distribution, and outputs a model $widehat{w}$ that performs well on unseen samples from the same distribution. However, at some point in the future, any training datapoint $z in S$ can request to be unlearned, thus prompting the learner to modify its output model while still ensuring the same accuracy guarantees. We initiate a rigorous study of generalization in machine unlearning, where the goal is to perform well on previously unseen datapoints. Our focus is on both computational and storage complexity. For the setting of convex losses, we provide an unlearning algorithm that can unlearn up to $O(n/d^{1/4})$ samples, where $d$ is the problem dimension. In comparison, in general, differentially private learning (which implies unlearning) only guarantees deletion of $O(n/d^{1/2})$ samples. This demonstrates a novel separation between differential privacy and machine unlearning.

Machine Learning Artificial Intelligence

From Concurrent Programs to Simulating Sequential Programs: Correctness of a Transformation

115 - Allan Blanchard 2017

Frama-C is a software analysis framework that provides a common infrastructure and a common behavioral specification language to plugins that implement various static and dynamic analyses of C programs. Most plugins do not support concurrency. We have proposed Conc2Seq, a Frama-C plugin based on program transformation, capable to leverage the existing huge code base of plugins and to handle concurrent C programs. In this paper we formalize and sketch the proof of correctness of the program transformation principle behind Conc2Seq, and present an effort towards the full mechanization of both the formalization and proofs with the proof assistant Coq.

Programming Languages