Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

TensorFlow Eager: A Multi-Stage, Python-Embedded DSL for Machine Learning

80 0 0.0 ( 0 )

Download Cite

Added by Akshay Agrawal

Publication date 2019

fields Informatics Engineering

and research's language is English

Authors Akshay Agrawal - Akshay Naresh Modi - Alexandre Passos

Programming Languages Machine Learning

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

TensorFlow Eager is a multi-stage, Python-embedded domain-specific language for hardware-accelerated machine learning, suitable for both interactive research and production. TensorFlow, which TensorFlow Eager extends, requires users to represent computations as dataflow graphs; this permits compiler optimizations and simplifies deployment but hinders rapid prototyping and run-time dynamism. TensorFlow Eager eliminates these usability costs without sacrificing the benefits furnished by graphs: It provides an imperative front-end to TensorFlow that executes operations immediately and a JIT tracer that translates Python functions composed of TensorFlow operations into executable dataflow graphs. TensorFlow Eager thus offers a multi-stage programming model that makes it easy to interpolate between imperative and staged execution in a single package.

rate research

TensorFlow Lite Micro: Embedded Machine Learning on TinyML Systems

121 - Robert David , Jared Duke , Advait Jain 2020

Deep learning inference on embedded devices is a burgeoning field with myriad applications because tiny embedded devices are omnipresent. But we must overcome major challenges before we can benefit from this opportunity. Embedded processors are severely resource constrained. Their nearest mobile counterparts exhibit at least a 100 -- 1,000x difference in compute capability, memory availability, and power consumption. As a result, the machine-learning (ML) models and associated ML inference framework must not only execute efficiently but also operate in a few kilobytes of memory. Also, the embedded devices ecosystem is heavily fragmented. To maximize efficiency, system vendors often omit many features that commonly appear in mainstream systems, including dynamic memory allocation and virtual memory, that allow for cross-platform interoperability. The hardware comes in many flavors (e.g., instruction-set architecture and FPU support, or lack thereof). We introduce TensorFlow Lite Micro (TF Micro), an open-source ML inference framework for running deep-learning models on embedded systems. TF Micro tackles the efficiency requirements imposed by embedded-system resource constraints and the fragmentation challenges that make cross-platform interoperability nearly impossible. The framework adopts a unique interpreter-based approach that provides flexibility while overcoming these challenges. This paper explains the design decisions behind TF Micro and describes its implementation details. Also, we present an evaluation to demonstrate its low resource requirement and minimal run-time performance overhead.

Machine Learning Artificial Intelligence

LazyTensor: combining eager execution with domain-specific compilers

93 - Alex Suhan , Davide Libenzi , Ailing Zhang 2021

Domain-specific optimizing compilers have demonstrated significant performance and portability benefits, but require programs to be represented in their specialized IRs. Existing frontends to these compilers suffer from the language subset problem where some host language features are unsupported in the subset of the users program that interacts with the domain-specific compiler. By contrast, define-by-run ML frameworks-colloquially called eager mode-are popular due to their ease of use and expressivity, where the full power of the host programming language can be used. LazyTensor is a technique to target domain specific compilers without sacrificing define-by-run ergonomics. Initially developed to support PyTorch on Cloud TPUs, the technique, along with a substantially shared implementation, has been used by Swift for TensorFlow across CPUs, GPUs, and TPUs, demonstrating the generality of the approach across (1) Tensor implementations, (2) hardware accelerators, and (3) programming languages.

Programming Languages Machine Learning

Relay: A New IR for Machine Learning Frameworks

198 - Jared Roesch , Steven Lyubomirsky , Logan Weber 2018

Machine learning powers diverse services in industry including search, translation, recommendation systems, and security. The scale and importance of these models require that they be efficient, expressive, and portable across an array of heterogeneous hardware devices. These constraints are often at odds; in order to better accommodate them we propose a new high-level intermediate representation (IR) called Relay. Relay is being designed as a purely-functional, statically-typed language with the goal of balancing efficient compilation, expressiveness, and portability. We discuss the goals of Relay and highlight its important design constraints. Our prototype is part of the open source NNVM compiler framework, which powers Amazons deep learning framework MxNet.

Programming Languages Machine Learning

CausalML: Python Package for Causal Machine Learning

73 - Huigang Chen , Totte Harinen , Jeong-Yoon Lee 2020

CausalML is a Python implementation of algorithms related to causal inference and machine learning. Algorithms combining causal inference and machine learning have been a trending topic in recent years. This package tries to bridge the gap between theoretical work on methodology and practical applications by making a collection of methods in this field available in Python. This paper introduces the key concepts, scope, and use cases of this package.

Computers and Society Machine Learning Computation

TensorFlow: A system for large-scale machine learning

144 - Martin Abadi , Paul Barham , Jianmin Chen 2016

TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of a dataflow graph across many machines in a cluster, and within a machine across multiple computational devices, including multicore CPUs, general-purpose GPUs, and custom designed ASICs known as Tensor Processing Units (TPUs). This architecture gives flexibility to the application developer: whereas in previous parameter server designs the management of shared state is built into the system, TensorFlow enables developers to experiment with novel optimizations and training algorithms. TensorFlow supports a variety of applications, with particularly strong support for training and inference on deep neural networks. Several Google services use TensorFlow in production, we have released it as an open-source project, and it has become widely used for machine learning research. In this paper, we describe the TensorFlow dataflow model in contrast to existing systems, and demonstrate the compelling performance that TensorFlow achieves for several real-world applications.

Distributed Parallel and Cluster Computing Artificial Intelligence

TensorFlow Eager: A Multi-Stage, Python-Embedded DSL for Machine Learning

Ask ChatGPT about the research

No Arabic abstract

Read More

suggested questions