Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Large Scale Low Power Computing System - Status of Network Design in ExaNeSt and EuroExa Projects

113 0 0.0 ( 0 )

Download Cite

Added by Andrea Biagioni

Publication date 2018

fields Informatics Engineering

and research's language is English

Authors Roberto Ammendola - Andrea Biagioni - Fabrizio Capuani

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

The deployment of the next generation computing platform at ExaFlops scale requires to solve new technological challenges mainly related to the impressive number (up to 10^6) of compute elements required. This impacts on system power consumption, in terms of feasibility and costs, and on system scalability and computing efficiency. In this perspective analysis, exploration and evaluation of technologies characterized by low power, high efficiency and high degree of customization is strongly needed. Among the various European initiative targeting the design of ExaFlops system, ExaNeSt and EuroExa are EU-H2020 funded initiatives leveraging on high end MPSoC FPGAs. Last generation MPSoC FPGAs can be seen as non-mainstream but powerful HPC Exascale enabling components thanks to the integration of embedded multi-core, ARM-based low power CPUs and a huge number of hardware resources usable to co-design application oriented accelerators and to develop a low latency high bandwidth network architecture. In this paper we introduce ExaNet the FPGA-based, scalable, direct network architecture of ExaNeSt system. ExaNet allow us to explore different interconnection topologies, to evaluate advanced routing functions for congestion control and fault tolerance and to design specific hardware components for acceleration of collective operations. After a brief introduction of the motivations and goals of ExaNeSt and EuroExa projects, we will report on the status of network architecture design and its hardware/software testbed adding preliminary bandwidth and latency achievements.

rate research

A Non-anchored Unified Naming System for Ad Hoc Computing Environments

89 - Yoo Chul Chung , Dongman Lee 2006

A ubiquitous computing environment consists of many resources that need to be identified by users and applications. Users and developers require some way to identify resources by human readable names. In addition, ubiquitous computing environments impose additional requirements such as the ability to work well with ad hoc situations and the provision of names that depend on context. The Non-anchored Unified Naming (NUN) system was designed to satisfy these requirements. It is based on relative naming among resources and provides the ability to name arbitrary types of resources. By having resources themselves take part in naming, resources are able to able contribute their specialized knowledge into the name resolution process, making context-dependent mapping of names to resources possible. The ease of which new resource types can be added makes it simple to incorporate new types of contextual information within names. In this paper, we describe the naming system and evaluate its use.

Distributed Parallel and Cluster Computing Hardware Architecture Networking and Internet Architecture

Flare: Flexible In-Network Allreduce

91 - Daniele De Sensi , Salvatore Di Girolamo , Saleh Ashkboos 2021

The allreduce operation is one of the most commonly used communication routines in distributed applications. To improve its bandwidth and to reduce network traffic, this operation can be accelerated by offloading it to network switches, that aggregate the data received from the hosts, and send them back the aggregated result. However, existing solutions provide limited customization opportunities and might provide suboptimal performance when dealing with custom operators and data types, with sparse data, or when reproducibility of the aggregation is a concern. To deal with these problems, in this work we design a flexible programmable switch by using as a building block PsPIN, a RISC-V architecture implementing the sPIN programming model. We then design, model, and analyze different algorithms for executing the aggregation on this architecture, showing performance improvements compared to state-of-the-art approaches.

Distributed Parallel and Cluster Computing Hardware Architecture Networking and Internet Architecture

Application Checkpoint and Power Study on Large Scale Systems

107 - Yuping Fan 2021

Power efficiency is critical in high performance computing (HPC) systems. To achieve high power efficiency on application level, it is vital importance to efficiently distribute power used by application checkpoints. In this study, we analyze the relation of application checkpoints and their power consumption. The observations could guide the design of power management.

Software Engineering Hardware Architecture

Meta-level issues in Offloading: Scoping, Composition, Development, and their Automation

123 - Andre DeHon , Hans Giesen , Nik Sultana 2021

This paper argues for an accelerator development toolchain that takes into account the whole system containing the accelerator. With whole-system visibility, the toolchain can better assist accelerator scoping and composition in the context of the expected workloads and intended performance objectives. Despite being focused on the meta-level of accelerators, this would build on existing and ongoing DSLs and toolchains for accelerator design. Basing this on our experience in programmable networking and reconfigurable-hardware programming, we propose an integrative approach that relies on three activities: (i) generalizing the focus of acceleration to offloading to accommodate a broader variety of non-functional needs -- such as security and power use -- while using similar implementation approaches, (ii) discovering what to offload, and to what hardware, through semi-automated analysis of a whole system that might compose different offload choices that changeover time, (iii) connecting with research and state-of-the-art approaches for using domain-specific languages (DSLs) and high-level synthesis (HLS) systems for custom offload development. We outline how this integration can drive new development tooling that accepts models of programs and resources to assist system designers through design-space exploration for the accelerated system.

Distributed Parallel and Cluster Computing Hardware Architecture Networking and Internet Architecture

Introducing Neuromorphic Computing and Engineering

283 - Giacomo Indiveri 2021

The standard nature of computing is currently being challenged by a range of problems that start to hinder technological progress. One of the strategies being proposed to address some of these problems is to develop novel brain-inspired processing methods and technologies, and apply them to a wide range of application scenarios. This is an extremely challenging endeavor that requires researchers in multiple disciplines to combine their efforts and co-design at the same time the processing methods, the supporting computing architectures, and their underlying technologies. The journal ``Neuromorphic Computing and Engineering (NCE) has been launched to support this new community in this effort and provide a forum and repository for presenting and discussing its latest advances. Through close collaboration with our colleagues on the editorial team, the scope and characteristics of NCE have been designed to ensure it serves a growing transdisciplinary and dynamic community across academia and industry.

Distributed Parallel and Cluster Computing Hardware Architecture Computational Engineering

comments

Fetching comments

Hama University

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Large Scale Low Power Computing System - Status of Network Design in ExaNeSt and EuroExa Projects

Ask ChatGPT about the research

No Arabic abstract

Read More