Memory-Aware Partitioning of Machine Learning Applications for Optimal Energy Use in Batteryless Systems

Added by Andres Gomez

Publication date 2021

Fields Informatics Engineering

Language English

Authors Andres Gomez - Andreas Tretter - Pascal Alexander Hager

Sensing systems powered by energy harvesting have traditionally been designed to tolerate long periods without energy. As the Internet of Things (IoT) evolves towards a more transient and opportunistic execution paradigm, reducing energy storage costs will be key for its economic and ecologic viability. However, decreasing energy storage in harvesting systems introduces reliability issues. Transducers only produce intermittent energy at low voltage and current levels, making guaranteed task completion a challenge. Existing ad hoc methods overcome this by buffering enough energy either for single tasks, incurring large data-retention overheads, or for one full application cycle, requiring a large energy buffer. We present Julienning: an automated method for optimizing the total energy cost of batteryless applications. Using a custom specification model, developers can describe transient applications as a set of atomically executed kernels with explicit data dependencies. Our optimization flow can partition data- and energy-intensive applications into multiple execution cycles with bounded energy consumption. By leveraging interkernel data dependencies, these energy-bounded execution cycles minimize the number of system activations and nonvolatile data transfers, and thus the total energy overhead. We validate our methodology with two batteryless cameras running energy-intensive machine learning applications. Results demonstrate that compared to ad hoc solutions, our method can reduce the required energy storage by over 94% while only incurring a 0.12% energy overhead.
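The partitioning idea in the abstract can be illustrated with a small sketch. This is not the Julienning optimization flow itself: it assumes a simplified linear chain of kernels (the abstract describes general data dependencies), and the names `Kernel`, `partition`, `e_max`, `e_startup`, and `e_per_byte`, as well as all energy figures, are hypothetical. Each execution cycle pays a fixed activation cost plus the cost of loading and storing boundary data in nonvolatile memory, and a dynamic program picks the cuts that minimize total overhead while keeping every cycle within the energy bound.

```python
from dataclasses import dataclass


@dataclass
class Kernel:
    name: str
    energy: float   # energy to run the kernel atomically (hypothetical units)
    out_bytes: int  # data handed to the next kernel in the chain


def partition(kernels, e_max, e_startup, e_per_byte):
    """Split a linear kernel chain into energy-bounded execution cycles.

    Overhead per cycle = activation cost + nonvolatile load of the input
    + nonvolatile store of the output. Returns (total_overhead, cycles),
    where cycles is a list of lists of kernel names.
    """
    n = len(kernels)
    inf = float("inf")
    best = [inf] * (n + 1)  # best[j]: minimum overhead to finish kernels[:j]
    cut = [0] * (n + 1)     # cut[j]: start index of the last cycle in that plan
    best[0] = 0.0
    for j in range(1, n + 1):
        compute = 0.0
        for i in range(j - 1, -1, -1):  # candidate cycle covers kernels[i:j]
            compute += kernels[i].energy
            if compute > e_max:
                break  # growing the cycle further can only cost more
            load = kernels[i - 1].out_bytes * e_per_byte if i > 0 else 0.0
            store = kernels[j - 1].out_bytes * e_per_byte if j < n else 0.0
            overhead = e_startup + load + store
            if compute + overhead > e_max:
                continue  # this cycle would exceed the energy buffer
            if best[i] + overhead < best[j]:
                best[j] = best[i] + overhead
                cut[j] = i
    if best[n] == inf:
        raise ValueError("no feasible partition for this energy bound")
    cycles, j = [], n
    while j > 0:  # walk the recorded cuts backwards to recover the plan
        i = cut[j]
        cycles.append([k.name for k in kernels[i:j]])
        j = i
    return best[n], cycles[::-1]


# Hypothetical four-kernel camera pipeline with made-up energy numbers.
chain = [
    Kernel("capture", 2.0, 100),
    Kernel("conv", 3.0, 400),
    Kernel("pool", 1.0, 50),
    Kernel("classify", 2.0, 0),
]
overhead, cycles = partition(chain, e_max=5.0, e_startup=0.5, e_per_byte=0.001)
print(cycles)  # [['capture'], ['conv', 'pool'], ['classify']]
```

In this toy run the cut falls after `pool` rather than after `conv`, because `conv` produces the largest intermediate result and keeping it inside one cycle avoids a costly nonvolatile transfer. This mirrors the abstract's point about leveraging interkernel data dependencies to reduce overhead.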

A Machine Learning Accelerator In-Memory for Energy Harvesting

122 - Salonik Resch , S. Karen Khatamifard , Zamshed Iqbal Chowdhury 2019

There is increasing demand to bring machine learning capabilities to low power devices. By integrating the computational power of machine learning with the deployment capabilities of low power devices, a number of new applications become possible. In some applications, such devices will not even have a battery, and must rely solely on energy harvesting techniques. This puts extreme constraints on the hardware, which must be energy efficient and capable of tolerating interruptions due to power outages. Here, as a representative example, we propose an in-memory support vector machine learning accelerator utilizing non-volatile spintronic memory. The combination of processing-in-memory and non-volatility provides a key advantage in that progress is effectively saved after every operation. This enables instant shut down and restart capabilities with minimal overhead. Additionally, the operations are highly energy efficient leading to low power consumption.

Emerging Technologies Hardware Architecture Distributed Parallel and Cluster Computing

Partitioning SKA Dataflows for Optimal Graph Execution

117 - Chen Wu , Andreas Wicenec , Rodrigo Tobar 2018

Optimizing data-intensive workflow execution is essential to many modern scientific projects such as the Square Kilometre Array (SKA), which will be the largest radio telescope in the world, collecting terabytes of data per second for the next few decades. At the core of the SKA Science Data Processor is the graph execution engine, scheduling tens of thousands of algorithmic components to ingest and transform millions of parallel data chunks in order to solve a series of large-scale inverse problems within the power budget. To tackle this challenge, we have developed the Data Activated Liu Graph Engine (DALiuGE) to manage data processing pipelines for several SKA pathfinder projects. In this paper, we discuss the DALiuGE graph scheduling sub-system. By extending previous studies on graph scheduling and partitioning, we lay the foundation on which we can develop polynomial time optimization methods that minimize both workflow execution time and resource footprint while satisfying resource constraints imposed by individual algorithms. We show preliminary results obtained from three radio astronomy data pipelines.

Distributed Parallel and Cluster Computing

Machine Learning for Performance Prediction of Spark Cloud Applications

83 - Alexandre Maros , Fabricio Murai , Ana Paula Couto da Silva and Jussara M. Almeida 2021

Big data applications and analytics are employed in many sectors for a variety of goals: improving customer satisfaction, predicting market behavior or improving processes in public health. These applications consist of complex software stacks that are often run on cloud systems. Predicting execution times is important for estimating the cost of cloud services and for effectively managing the underlying resources at runtime. Machine Learning (ML), providing black box solutions to model the relationship between application performance and system configuration without requiring in-detail knowledge of the system, has become a popular way of predicting the performance of big data applications. We investigate the cost-benefits of using supervised ML models for predicting the performance of applications on Spark, one of today's most widely used frameworks for big data analysis. We compare our approach with Ernest (an ML-based technique proposed in the literature by the Spark inventors) on a range of scenarios, application workloads, and cloud system configurations. Our experiments show that Ernest can accurately estimate the performance of very regular applications, but it fails when applications exhibit more irregular patterns and/or when extrapolating on bigger data set sizes. Results show that our models match or exceed Ernest's performance, sometimes enabling us to reduce the prediction error from 126-187% to only 5-19%.

Distributed Parallel and Cluster Computing Performance

Towards Optimal Kinetic Energy Harvesting for the Batteryless IoT

102 - Muhammad Moid Sandhu , Kai Geissdoerfer , Sara Khalifa 2020

Traditional Internet of Things (IoT) sensors rely on batteries that need to be replaced or recharged frequently which impedes their pervasive deployment. A promising alternative is to employ energy harvesters that convert the environmental energy into electrical energy. Kinetic Energy Harvesting (KEH) converts the ambient motion/vibration energy into electrical energy to power the IoT sensor nodes. However, most previous works employ KEH without dynamically tracking the optimal operating point of the transducer for maximum power output. In this paper, we systematically analyse the relation between the operating point of the transducer and the corresponding energy yield. To this end, we explore the voltage-current characteristics of the KEH transducer to find its Maximum Power Point (MPP). We show how this operating point can be approximated in a practical energy harvesting circuit. We design two hardware circuit prototypes to evaluate the performance of the proposed mechanism and analyse the harvested energy using a precise load shaker under a wide set of controlled conditions typically found in human-centric applications. We analyse the dynamic current-voltage characteristics and specify the relation between the MPP sampling rate and harvesting efficiency which outlines the need for dynamic MPP tracking. The results show that the proposed energy harvesting mechanism outperforms the conventional method in terms of generated power and offers at least one order of magnitude higher power than the latter.

Signal Processing

Optimal Resilience in Systems that Mix Shared Memory and Message Passing

85 - Hagit Attiya , Sweta Kumari , 2020

We investigate the minimal number of failures that can partition a system where processes communicate both through shared memory and by message passing. We prove that this number precisely captures the resilience that can be achieved by algorithms that implement a variety of shared objects, like registers and atomic snapshots, and solve common tasks, like randomized consensus, approximate agreement and renaming. This has implications for the m&m-model and for the hybrid, cluster-based model.

Distributed Parallel and Cluster Computing
