No Arabic abstract
Domain-specific optimizing compilers have demonstrated significant performance and portability benefits, but require programs to be represented in their specialized IRs. Existing frontends to these compilers suffer from the language subset problem where some host language features are unsupported in the subset of the users program that interacts with the domain-specific compiler. By contrast, define-by-run ML frameworks-colloquially called eager mode-are popular due to their ease of use and expressivity, where the full power of the host programming language can be used. LazyTensor is a technique to target domain specific compilers without sacrificing define-by-run ergonomics. Initially developed to support PyTorch on Cloud TPUs, the technique, along with a substantially shared implementation, has been used by Swift for TensorFlow across CPUs, GPUs, and TPUs, demonstrating the generality of the approach across (1) Tensor implementations, (2) hardware accelerators, and (3) programming languages.
Program synthesis from input-output examples has been a long-standing challenge, and recent works have demonstrated some success in designing deep neural networks for program synthesis. However, existing efforts in input-output neural program synthesis have been focusing on domain-specific languages, thus the applicability of previous approaches to synthesize code in full-fledged popular programming languages, such as C, remains a question. The main challenges lie in two folds. On the one hand, the program search space grows exponentially when the syntax and semantics of the programming language become more complex, which poses higher requirements on the synthesis algorithm. On the other hand, increasing the complexity of the programming language also imposes more difficulties on data collection, since building a large-scale training set for input-output program synthesis require random program generators to sample programs and input-output examples. In this work, we take the first step to synthesize C programs from input-output examples. In particular, we propose LaSynth, which learns the latent representation to approximate the execution of partially generated programs, even if their semantics are not well-defined. We demonstrate the possibility of synthesizing elementary C code from input-output examples, and leveraging learned execution significantly improves the prediction performance over existing approaches. Meanwhile, compared to the randomly generated ground-truth programs, LaSynth synthesizes more concise programs that resemble human-written code. We show that training on these synthesized programs further improves the prediction performance for both Karel and C program synthesis, indicating the promise of leveraging the learned program synthesizer to improve the dataset quality for input-output program synthesis.
TensorFlow Eager is a multi-stage, Python-embedded domain-specific language for hardware-accelerated machine learning, suitable for both interactive research and production. TensorFlow, which TensorFlow Eager extends, requires users to represent computations as dataflow graphs; this permits compiler optimizations and simplifies deployment but hinders rapid prototyping and run-time dynamism. TensorFlow Eager eliminates these usability costs without sacrificing the benefits furnished by graphs: It provides an imperative front-end to TensorFlow that executes operations immediately and a JIT tracer that translates Python functions composed of TensorFlow operations into executable dataflow graphs. TensorFlow Eager thus offers a multi-stage programming model that makes it easy to interpolate between imperative and staged execution in a single package.
Symbolic execution is a powerful technique for program analysis. However, it has many limitations in practical applicability: the path explosion problem encumbers scalability, the need for language-specific implementation, the inability to handle complex dependencies, and the limited expressiveness of theories supported by underlying satisfiability checkers. Often, relationships between variables of interest are not expressible directly as purely symbolic constraints. To this end, we present a new approach -- neuro-symbolic execution -- which learns an approximation of the relationship as a neural net. It features a constraint solver that can solve mixed constraints, involving both symbolic expressions and neural network representation. To do so, we envision such constraint solving as procedure combining SMT solving and gradient-based optimization. We demonstrate the utility of neuro-symbolic execution in constructing exploits for buffer overflows. We report success on 13/14 programs which have difficult constraints, known to require specialized extensions to symbolic execution. In addition, our technique solves $100$% of the given neuro-symbolic constraints in $73$ programs from standard verification and invariant synthesis benchmarks.
The SPaCIFY project, which aims at bringing advances in MDE to the satellite flight software industry, advocates a top-down approach built on a domain-specific modeling language named Synoptic. In line with previous approaches to real-time modeling such as Statecharts and Simulink, Synoptic features hierarchical decomposition of application and control modules in synchronous block diagrams and state machines. Its semantics is described in the polychronous model of computation, which is that of the synchronous language Signal.
At the workshop on Trends in Functional Programming in Education (TFPIE) in 2015 Ionescu and Jansson presented the approach underlying the Domain Specific Languages of Mathematics (DSLsofMath) course even before the first course instance. We were then encouraged to come back to present our experience and the student results. Now, three years later, we have seen three groups of learners attend the course, and the first two groups have also continued on to take challenging courses in the subsequent year. In this paper we present three examples from the course material to set the scene, and we present an evaluation of the student results showing improvements in the pass rates and grades in later courses.