ESP4ML: Platform-Based Design of Systems-on-Chip for Embedded Machine Learning


Added by Davide Giri

Publication date 2020

Field: Informatics Engineering

Language: English

Authors Davide Giri - Kuan-Lin Chiu - Giuseppe Di Guglielmo

Hardware Architecture


We present ESP4ML, an open-source system-level design flow to build and program SoC architectures for embedded applications that require the hardware acceleration of machine learning and signal processing algorithms. We realized ESP4ML by combining two established open-source projects (ESP and HLS4ML) into a new, fully-automated design flow. For the SoC integration of accelerators generated by HLS4ML, we designed a set of new parameterized interface circuits synthesizable with high-level synthesis. For accelerator configuration and management, we developed an embedded software runtime system on top of Linux. With this HW/SW layer, we addressed the challenge of dynamically shaping the data traffic on a network-on-chip to activate and support the reconfigurable pipelines of accelerators that are needed by the application workloads currently running on the SoC. We demonstrate our vertically-integrated contributions with the FPGA-based implementations of complete SoC instances booting Linux and executing computer-vision applications that process images taken from the Google Street View database.


TensorFlow Lite Micro: Embedded Machine Learning on TinyML Systems

Robert David, Jared Duke, Advait Jain, 2020

Deep learning inference on embedded devices is a burgeoning field with myriad applications because tiny embedded devices are omnipresent. But we must overcome major challenges before we can benefit from this opportunity. Embedded processors are severely resource constrained. Their nearest mobile counterparts exhibit at least a 100x to 1,000x difference in compute capability, memory availability, and power consumption. As a result, the machine-learning (ML) models and associated ML inference framework must not only execute efficiently but also operate in a few kilobytes of memory. Also, the embedded-device ecosystem is heavily fragmented. To maximize efficiency, system vendors often omit many features that commonly appear in mainstream systems, including dynamic memory allocation and virtual memory, that allow for cross-platform interoperability. The hardware comes in many flavors (e.g., instruction-set architecture and FPU support, or lack thereof). We introduce TensorFlow Lite Micro (TF Micro), an open-source ML inference framework for running deep-learning models on embedded systems. TF Micro tackles the efficiency requirements imposed by embedded-system resource constraints and the fragmentation challenges that make cross-platform interoperability nearly impossible. The framework adopts a unique interpreter-based approach that provides flexibility while overcoming these challenges. This paper explains the design decisions behind TF Micro and describes its implementation details. Also, we present an evaluation to demonstrate its low resource requirement and minimal run-time performance overhead.

Machine Learning, Artificial Intelligence

A Cloud-Based Collaboration Platform for Model-Based Design of Cyber-Physical Systems

Peter Gorm Larsen, Hugo Daniel Macedo, John Fitzgerald and Holger Pfeifer, 2020

Businesses, particularly small and medium-sized enterprises, aiming to start up in Model-Based Design (MBD) face difficult choices from a wide range of methods, notations and tools before making the significant investments in planning, procurement and training necessary to deploy new approaches successfully. In the development of Cyber-Physical Systems (CPSs) this is exacerbated by the diversity of formalisms covering computation, physical and human processes. In this paper, we propose the use of a cloud-enabled and open collaboration platform that allows businesses to offer models, tools and other assets, and permits others to access these on a pay-per-use basis as a means of lowering barriers to the adoption of MBD technology, and to promote experimentation in a sandbox environment.

Systems and Control, Software Engineering

On the Optimal Design of Triple Modular Redundancy Logic for SRAM-based FPGAs

F. Lima Kastensmidt, L. Sterpone, L. Carro, 2007

Triple Modular Redundancy (TMR) is a suitable fault tolerant technique for SRAM-based FPGA. However, one of the main challenges in achieving 100% robustness in designs protected by TMR running on programmable platforms is to prevent upsets in the routing from provoking undesirable connections between signals from distinct redundant logic parts, which can generate an error in the output. This paper investigates the optimal design of the TMR logic (e.g., by cleverly inserting voters) to ensure robustness. Four differe

Hardware Architecture

Open Tiled Manycore System-on-Chip

Stefan Wallentowitz, Philipp Wagner, Michael Tempelmeier, 2013

Manycore Systems-on-Chip include an increasing number of processing elements and have become an important research topic for improvements in both hardware and software. While research can be conducted using system simulators, prototyping requires a variety of components and is very time consuming. With the Open Tiled Manycore System-on-Chip (OpTiMSoC) we aim to build such an environment for use as a prototyping platform in our own and other research projects. This paper describes the project goals and aspects of OpTiMSoC and summarizes the current status and ideas.

Hardware Architecture

An atomic Boltzmann machine capable of on-chip learning

Brian Kiraly, Elze J. Knol, Hilbert J. Kappen, 2020

The Boltzmann Machine (BM) is a neural network composed of stochastically firing neurons that can learn complex probability distributions by adapting the synaptic interactions between the neurons. BMs represent a very generic class of stochastic neural networks that can be used for data clustering, generative modelling and deep learning. A key drawback of software-based stochastic neural networks is the required Monte Carlo sampling, which scales intractably with the number of neurons. Here, we realize a physical implementation of a BM directly in the stochastic spin dynamics of a gated ensemble of coupled cobalt atoms on the surface of semiconducting black phosphorus. Implementing the concept of orbital memory utilizing scanning tunnelling microscopy, we demonstrate the bottom-up construction of atomic ensembles whose stochastic current noise is defined by a reconfigurable multi-well energy landscape. Exploiting the anisotropic behaviour of black phosphorus, we build ensembles of atoms with two well-separated intrinsic time scales that represent neurons and synapses. By characterizing the conditional steady-state distribution of the neurons for given synaptic configurations, we illustrate that an ensemble can represent many distinct probability distributions. By probing the intrinsic synaptic dynamics, we reveal an autonomous reorganization of the synapses in response to external electrical stimuli. This self-adaptive architecture paves the way for on-chip learning directly in atomic-scale machine learning hardware.

Mesoscale and Nanoscale Physics, Disordered Systems and Neural Networks, Materials Science