We present Calyx, a new intermediate language (IL) for compiling high-level programs into hardware designs. Calyx combines a hardware-like structural language with a software-like control flow representation with loops and conditionals. This split representation enables a new class of hardware-focused optimizations that require both structural and control flow information and that are crucial for high-level programming models for hardware design. The Calyx compiler lowers control flow constructs using finite-state machines and generates synthesizable hardware descriptions. We have implemented Calyx in an optimizing compiler that translates high-level programs to hardware. We demonstrate Calyx using two DSL-to-RTL compilers, a systolic array generator and one for a recent imperative accelerator language, and compare them to equivalent designs generated using high-level synthesis (HLS). The systolic arrays are $4.6\times$ faster and $1.1\times$ larger on average than HLS implementations, and the HLS-like imperative language compiler is within a few factors of a highly optimized commercial HLS toolchain. We also describe three optimizations implemented in the Calyx compiler.
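As a rough illustration of what lowering control flow to a finite-state machine can look like, the sketch below turns a small control tree into a table of FSM states and then steps through it. The tuple-based tree, the state dictionaries, and the names `lower`, `simulate`, and `DONE` are assumptions made for this example; they are not Calyx's syntax or the compiler's actual algorithm.

```python
DONE = -1  # hypothetical sentinel: the FSM has finished


def lower(node, nxt, states):
    """Return the entry state for `node`; control continues at `nxt` afterwards."""
    kind = node[0]
    if kind == "enable":
        # One state per enabled group; move on once the group has run.
        states.append({"enable": node[1], "next": nxt})
        return len(states) - 1
    if kind == "seq":
        # Lower children back to front so each child knows its successor.
        for child in reversed(node[1]):
            nxt = lower(child, nxt, states)
        return nxt
    if kind == "while":
        # Reserve the condition state first so the body can jump back to it.
        states.append(None)
        cond = len(states) - 1
        body = lower(node[2], cond, states)
        states[cond] = {"test": node[1], "if_true": body, "if_false": nxt}
        return cond
    raise ValueError(f"unknown control node: {kind}")


def simulate(states, entry, conditions):
    """Step through the FSM and record the order in which groups are enabled."""
    trace = []
    state = entry
    while state != DONE:
        current = states[state]
        if "enable" in current:
            trace.append(current["enable"])
            state = current["next"]
        else:
            taken = conditions[current["test"]]()
            state = current["if_true"] if taken else current["if_false"]
    return trace


# Assumed example program: init; while (lt) { body }; finish
program = ("seq", [("enable", "init"),
                   ("while", "lt", ("enable", "body")),
                   ("enable", "finish")])
states = []
entry = lower(program, DONE, states)
```

Because the loop and the sequence stay explicit until `lower` runs, a pass could still inspect or rewrite the control tree before it is flattened into states, which is the kind of opportunity the split representation is meant to preserve.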
Deep learning software demands reliability and performance. However, many of the existing deep learning frameworks are software libraries that act as an unsafe DSL in Python and a computation graph interpreter. We present DLVM, a design and implementation of a compiler infrastructure with a linear algebra intermediate representation, algorithmic differentiation by adjoint code generation, domain-specific optimizations and a code generator targeting GPU via LLVM. Designed as a modern compiler infrastructure inspired by LLVM, DLVM is more modular and more generic than existing deep learning compiler frameworks, and supports tensor DSLs with high expressivity. With our prototypical staged DSL embedded in Swift, we argue that the DLVM system enables a form of modular, safe and performant frameworks for deep learning.
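To make "algorithmic differentiation by adjoint" concrete, here is a minimal sketch of reverse-mode differentiation on scalars. It records the computation at run time and then propagates adjoints backwards; this is a swapped-in, simplified technique for illustration, not DLVM's compile-time adjoint code generation, and the names `Var` and `backward` are assumptions.

```python
class Var:
    """A scalar value that remembers how it was computed."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (parent Var, local partial derivative)
        self.grad = 0.0         # adjoint, filled in by backward()

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def backward(output):
    """Propagate adjoints from `output` back to every input it depends on."""
    order, visited = [], set()

    def visit(node):
        if id(node) in visited:
            return
        visited.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)  # parents are appended before their children

    visit(output)
    output.grad = 1.0
    # Walk in reverse topological order so each adjoint is complete before use.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += local * node.grad


x, y = Var(2.0), Var(3.0)
z = x * y + x
backward(z)
```

For `z = x * y + x`, the adjoints accumulated in `x.grad` and `y.grad` are the partial derivatives `y + 1` and `x`.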
This work presents MLIR, a novel approach to building reusable and extensible compiler infrastructure. MLIR aims to address software fragmentation, improve compilation for heterogeneous hardware, significantly reduce the cost of building domain-specific compilers, and aid in connecting existing compilers together. MLIR facilitates the design and implementation of code generators, translators, and optimizers at different levels of abstraction and also across application domains, hardware targets, and execution environments. The contributions of this work include (1) a discussion of MLIR as a research artifact, built for extension and evolution, identifying the challenges and opportunities posed by this novel design point in design, semantics, optimization specification, system, and engineering; and (2) an evaluation of MLIR as a generalized infrastructure that reduces the cost of building compilers, describing diverse use cases to show research and educational opportunities for future programming languages, compilers, execution environments, and computer architecture. The paper also presents the rationale for MLIR, its original design principles, structures, and semantics.
Field-programmable gate arrays (FPGAs) provide an opportunity to co-design applications with hardware accelerators, yet they remain difficult to program. High-level synthesis (HLS) tools promise to raise the level of abstraction by compiling C or C++ to accelerator designs. Repurposing legacy software languages, however, requires complex heuristics to map imperative code onto hardware structures. We find that the black-box heuristics in HLS can be unpredictable: changing parameters in the program that should improve performance can counterintuitively yield slower and larger designs. This paper proposes a type system that restricts HLS to programs that can predictably compile to hardware accelerators. The key idea is to model consumable hardware resources with a time-sensitive affine type system that prevents simultaneous uses of the same hardware structure. We implement the type system in Dahlia, a language that compiles to HLS C++, and show that it can reduce the size of HLS parameter spaces while accepting Pareto-optimal designs.
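The key idea of treating hardware structures as consumable resources can be sketched as a small checker: within one time step each resource may be used at most once, and resources become available again in the next step. This is a simplified model written for illustration, not Dahlia's type system; `check_affine`, `ResourceError`, and the list-of-steps encoding are all assumptions.

```python
class ResourceError(Exception):
    """Raised when a hardware resource is used twice in one time step."""


def check_affine(steps):
    """Check that no resource is consumed more than once per time step.

    `steps` is a list of time steps; each step is a list of resource
    names (for example memory ports) used during that step. Resources
    are replenished between steps, which models the time-sensitive part.
    """
    for index, uses in enumerate(steps):
        consumed = set()
        for resource in uses:
            if resource in consumed:
                raise ResourceError(
                    f"step {index}: resource {resource!r} already consumed"
                )
            consumed.add(resource)
    return True


# Reading memory A in two different steps is accepted.
sequential_ok = check_affine([["A"], ["A", "B"]])
```

Under this model, a program that reads the same single-ported memory twice in one step, such as `check_affine([["A", "A"]])`, is rejected with a `ResourceError`, while spreading the two reads over consecutive steps is accepted.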
REC (Regular Expression Compiler) is a programming language of simple structure developed originally for the PDP-8 computer of the Digital Equipment Corporation, but readily adaptable to any other general-purpose computer. It has been used extensively in teaching Algebra and Numerical Analysis in the Escuela Superior de Fisica y Matematicas of the Instituto Politecnico Nacional. Moreover, the fact that the same control language, REC, is equally applicable and equally efficient over the whole range of computer facilities available to the students gives a very welcome coherence to the entire teaching program, including the course of Mathematical Logic, which is devoted to the theoretical aspects of such matters. REC derives its appeal from the fact that computers can be regarded reasonably well as Turing Machines. The REC notation is simply a manner of writing regular expressions, somewhat more amenable to programming the Turing Machine which they control. If one does not wish to think so strictly in terms of Turing Machines, REC expressions still provide a means of defining the flow of control in a program which is quite convenient for many applications.
We report on updates to the accelerator controls for the Neutralized Drift Compression Experiment II, a pulsed induction-type accelerator for heavy ions. The control infrastructure is built around a LabVIEW interface combined with an Apache Cassandra backend for data archiving. Recent upgrades added the storing and retrieving of device settings in the database, as well as ZeroMQ as a message broker that replaces LabVIEW's shared variables. Converting to ZeroMQ also allows easy access from other programming languages, such as Python.