
Cross Architectural Power Modelling

Posted by Blesson Varghese
Publication date: 2020
Research field: Informatics Engineering
Paper language: English





Existing power modelling research focuses on the model rather than on the process for developing models. This work develops an automated power modelling process that can be deployed on different processors to build power models with high accuracy. To this end, three components are proposed and developed: (i) an automated hardware performance counter selection method that selects the counters best correlated to power on both ARM and Intel processors; (ii) a clustering-based noise filter that reduces the mean error of power models; and (iii) a two-stage power model that surmounts the challenges of using existing power models across multiple architectures. The key results are: (i) the automated counter selection method achieves selections comparable to the manual method reported in the literature; (ii) the noise filter reduces the mean error in power models by up to 55%; and (iii) the two-stage power model predicts dynamic power with less than 8% error on both ARM and Intel processors, an improvement over classic models.
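As a rough illustration of the counter selection step, the sketch below ranks counters by absolute Pearson correlation with measured power and fits a linear model on the top candidates. The column names and synthetic data are assumptions for illustration only; the paper's clustering-based noise filter and two-stage model are not reproduced here.

```python
# Illustrative sketch: correlation-based counter selection plus a simple
# linear power model. Column names and data layout are assumptions.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

def select_counters(df: pd.DataFrame, power_col: str = "power", k: int = 5) -> list[str]:
    """Rank performance counters by absolute Pearson correlation
    with measured power and keep the top k."""
    counters = [c for c in df.columns if c != power_col]
    corr = df[counters].corrwith(df[power_col]).abs()
    return corr.sort_values(ascending=False).head(k).index.tolist()

def fit_power_model(df: pd.DataFrame, counters: list[str], power_col: str = "power"):
    """Fit a linear model: power ~ selected counters."""
    model = LinearRegression()
    model.fit(df[counters].to_numpy(), df[power_col].to_numpy())
    return model

# Hypothetical usage with synthetic counter readings:
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "instructions": rng.random(200),
    "cache_misses": rng.random(200),
    "branch_misses": rng.random(200),
})
df["power"] = 3.0 * df["instructions"] + 0.5 * df["cache_misses"] + rng.normal(0, 0.05, 200)
top = select_counters(df, k=2)          # picks the counters that track power
model = fit_power_model(df, top)
```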


Read also

The widespread application of deep learning has changed the landscape of computation in the data center. In particular, personalized recommendation for content ranking is now largely accomplished by leveraging deep neural networks. However, despite the importance of these models and the amount of compute cycles they consume, relatively little research attention has been devoted to systems for recommendation. To facilitate research and to advance the understanding of these workloads, this paper presents a set of real-world, production-scale DNNs for personalized recommendation coupled with relevant performance metrics for evaluation. In addition to releasing a set of open-source workloads, we conduct in-depth analysis that underpins future system design and optimization for at-scale recommendation: inference latency varies by 60% across three Intel server generations; batching and co-location of inferences can drastically improve latency-bounded throughput; and the diverse composition of recommendation models leads to different optimization strategies.
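A toy cost model (not from the paper) shows why batching improves latency-bounded throughput: if latency grows only mildly with batch size, the largest batch that still meets the latency target serves far more items per unit time than single-item inference. The fixed and per-item costs below are illustrative assumptions.

```python
# Toy illustration: latency-bounded throughput under an assumed
# linear latency model latency(b) = fixed + per_item * b.
def latency_ms(batch: int, fixed: float = 2.0, per_item: float = 0.1) -> float:
    return fixed + per_item * batch   # assumed cost model, not measured data

def best_batch(sla_ms: float = 10.0, max_batch: int = 1024) -> tuple[int, float]:
    """Largest batch meeting the SLA, and its throughput (items/ms)."""
    feasible = [b for b in range(1, max_batch + 1) if latency_ms(b) <= sla_ms]
    b = max(feasible)
    return b, b / latency_ms(b)

print(best_batch())   # batch 80 meets a 10 ms SLA: 8 items/ms vs ~0.48 at batch 1
```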
Modern electric power systems have witnessed rapidly increasing penetration of renewable energy, storage, electric vehicles and various demand response resources. Electric infrastructure planning thus faces more challenges due to the variability and uncertainties arising from these diverse new resources. This study develops a multistage and multiscale stochastic mixed integer programming (MM-SMIP) model for the power system capacity expansion problem that captures both coarse-temporal-scale uncertainties, such as investment cost and long-run demand stochasticity, and fine-temporal-scale uncertainties, such as hourly renewable energy output and electricity demand. Applied to a real power system, the resulting model leads to extremely large-scale mixed integer programming problems, which suffer not only from the well-known curse of dimensionality but also from computational difficulties caused by the vast number of integer variables at each stage. To address these challenges, we propose a nested cross decomposition algorithm that consists of two layers of decomposition: Dantzig-Wolfe decomposition and L-shaped decomposition. The algorithm exhibits promising computational performance in our numerical study and is especially amenable to parallel computing, which we also demonstrate through the computational results.
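For intuition on the L-shaped layer, the sketch below runs a single-cut Benders loop on a toy two-stage problem (a first-stage purchase x with per-scenario shortfall recourse). It illustrates only the optimality-cut idea under assumed costs and demands; it does not implement the paper's nested MM-SMIP algorithm or the Dantzig-Wolfe layer.

```python
# Minimal L-shaped (Benders) sketch for an assumed toy problem:
#   min  c*x + E_s[ q * max(d_s - x, 0) ],  0 <= x <= 200.
import numpy as np
from scipy.optimize import linprog

c, q = 1.0, 1.5                      # first- and second-stage unit costs (assumed)
d = np.array([80.0, 100.0, 120.0])   # scenario demands (assumed)
p = np.array([0.3, 0.4, 0.3])        # scenario probabilities

def recourse(x):
    """Exact expected recourse cost Q(x) and one subgradient."""
    short = np.maximum(d - x, 0.0)
    Q = q * float(p @ short)
    g = -q * float(p[d > x].sum())   # dQ/dx
    return Q, g

cuts = []                            # (intercept a, slope b): theta >= a + b*x
x_k = 0.0
for it in range(50):
    Q, g = recourse(x_k)
    cuts.append((Q - g * x_k, g))    # cut: theta >= Q(x_k) + g*(x - x_k)
    # Master LP over (x, theta): min c*x + theta subject to all cuts.
    A = [[b, -1.0] for (a, b) in cuts]
    ub = [-a for (a, b) in cuts]
    res = linprog([c, 1.0], A_ub=A, b_ub=ub,
                  bounds=[(0, 200), (None, None)])
    x_new, theta = res.x
    if c * x_k + Q <= res.fun + 1e-6:   # upper bound meets lower bound
        break
    x_k = x_new

print(f"x* = {x_k:.1f}, cost = {c * x_k + recourse(x_k)[0]:.2f}")  # 100.0, 109.00
```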
The end of Dennard scaling, combined with stagnation in architectural and compiler optimizations, makes it challenging to achieve significant performance deltas. Solutions based solely in hardware or software are no longer sufficient to maintain the pace of improvements seen during the past few decades. In hardware, the end of single-core scaling resulted in the proliferation of multi-core system architectures; however, this has forced complex parallel programming techniques into the mainstream. To further exploit physical resources, systems are becoming increasingly heterogeneous, with specialized computing elements and accelerators. Programming across a range of disparate architectures requires a new level of abstraction that programming languages will have to adapt to. In software, emerging complex applications, from domains such as Big Data and computer vision, run on multi-layered software stacks targeting hardware with a variety of constraints and resources. Hence, optimizing for the power-performance (and resiliency) space requires experimentation platforms that offer quick and easy prototyping of hardware/software co-designed techniques. To that end, we present Project Beehive: a hardware/software co-designed stack for runtime and architectural research. Project Beehive utilizes various state-of-the-art software and hardware components along with novel and extensible co-design techniques. Its objective is to provide a modern platform for experimentation on emerging applications, programming languages, compilers, runtimes, and low-power heterogeneous many-core architectures in a full-system co-designed manner.
This paper introduces the MIP Platform architecture model, a novel AI-based cognitive computing platform architecture. The goal of the proposed application of MIP is to reduce the implementation burden of applying AI algorithms to cognitive computing and fluent HMI interactions within the manufacturing process in a cyber-physical production system. The cognitive inferencing engine of MIP is a deterministic cognitive module that processes declarative goals, identifies Intents and Entities, selects suitable actions and associated algorithms, and invokes for execution a processing logic (Function) configured in the internal Function-as-a-Service or Connectivity Engine. Constant observation and evaluation against performance criteria assess the performance of the Lambdas across many and varying scenarios. The modular design with well-defined interfaces enables the reusability and extensibility of FaaS components. An integrated Big Data platform implements this modular design, supported by technologies such as Docker and Kubernetes for virtualization and orchestration of the individual components and their communication. The implementation of the architecture is evaluated using a real-world use case discussed later in this paper.
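Below is a minimal sketch of the intent-to-function dispatch pattern described above, assuming a simple in-process registry; all names are hypothetical and do not reflect the MIP Platform's actual API.

```python
# Hypothetical sketch of intent -> Function dispatch; illustrative only.
from typing import Callable

REGISTRY: dict[str, Callable[[dict], str]] = {}

def register(intent: str):
    """Decorator: bind a handler function to an intent name."""
    def wrap(fn: Callable[[dict], str]):
        REGISTRY[intent] = fn
        return fn
    return wrap

@register("inspect_part")
def inspect_part(entities: dict) -> str:
    return f"running visual inspection on {entities.get('part_id', '?')}"

def dispatch(intent: str, entities: dict) -> str:
    """Select the configured function for an intent and invoke it."""
    if intent not in REGISTRY:
        raise KeyError(f"no function configured for intent '{intent}'")
    return REGISTRY[intent](entities)

print(dispatch("inspect_part", {"part_id": "A-42"}))
```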
Blockchain and general-purpose distributed ledgers are foundational technologies that bring significant innovation to the infrastructures and other underpinnings of our socio-economic systems. These P2P technologies are able to securely diffuse information within and across networks, without the need for trustees or central authorities to enforce consensus. In this contribution, we propose a minimalistic stochastic model to understand the dynamics of blockchain-based consensus. By leveraging random-walk theory, we model block propagation delay on different network topologies and provide a classification of blockchain systems in terms of two emergent properties. First, we identify two regimes: a functional regime corresponding to optimal system function, and a non-functional regime characterised by a congested or branched state of sub-optimal blockchains. Second, we discover a phase transition during the emergence of consensus and numerically investigate the corresponding critical point. Our results provide important insights into the consensus mechanism and sub-optimal states in decentralised systems.
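As a toy illustration of the two regimes, the sketch below proxies block propagation delay by hop count on two topologies and compares it with an assumed block interval; the per-hop delay, threshold, and topologies are assumptions, not the paper's model.

```python
# Toy regime illustration: if typical propagation time approaches the
# block interval, forks (branching) become likely. Parameters are assumed.
import random
import networkx as nx

def mean_propagation_delay(G: nx.Graph, per_hop: float = 0.5) -> float:
    """Average delay (s) for a block from a random node to reach the
    other nodes, assuming a fixed per-hop relay delay."""
    origin = random.choice(list(G.nodes))
    dists = nx.single_source_shortest_path_length(G, origin)
    return per_hop * sum(dists.values()) / (len(G) - 1)

random.seed(1)
for name, G in [("random", nx.erdos_renyi_graph(500, 0.02, seed=1)),
                ("ring", nx.cycle_graph(500))]:
    delay = mean_propagation_delay(G)
    regime = "functional" if delay < 10.0 else "congested/branched"  # 10 s block interval (assumed)
    print(f"{name:6s}: mean delay = {delay:5.1f} s -> {regime}")
```

A well-connected random topology spreads blocks in a few hops and stays in the functional regime, while the ring's long paths push delay past the block interval, the congested case.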