This work presents Shoal, a heterogeneous communication library for clusters of processors and FPGAs. Shoal supports the Partitioned Global Address Space (PGAS) memory model, a shared-memory model for clusters that distinguishes between local and remote memory accesses. Because Shoal exposes a common application programming interface to hardware and software, applications can be migrated more freely to the optimal platform and deployed onto dynamic cluster topologies. We evaluate the library with a suite of microbenchmarks that establish its latency and throughput. We also present an implementation of the Jacobi iterative method that demonstrates how easily applications can be moved between platforms to obtain faster run times. Together, these results demonstrate the feasibility of a PGAS programming model for multi-node heterogeneous platforms.
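To make the local/remote distinction of PGAS concrete, the following is a minimal single-process sketch in Python of a Jacobi iteration over a partitioned array. It is not Shoal's API: the names `PartitionedArray`, `jacobi_sweep`, and `solve` are hypothetical, the "nodes" are simulated in one process, and the problem (a 1D Laplace equation with fixed end values) is an assumed stand-in for the Jacobi application mentioned above.

```python
class PartitionedArray:
    """Toy PGAS array: one global index space split into per-node partitions.

    Hypothetical illustration only; it does not model Shoal's interface.
    """

    def __init__(self, size, num_nodes):
        self.size = size
        self.chunk = -(-size // num_nodes)  # ceiling division
        self.parts = [
            [0.0] * max(0, min(self.chunk, size - node * self.chunk))
            for node in range(num_nodes)
        ]
        self.remote_accesses = 0  # accesses that cross a partition boundary

    def owner(self, index):
        """Return the node whose partition holds the global index."""
        return index // self.chunk

    def read(self, index, caller):
        node = self.owner(index)
        if node != caller:
            self.remote_accesses += 1  # would be a network get in a real cluster
        return self.parts[node][index - node * self.chunk]

    def write(self, index, value, caller):
        node = self.owner(index)
        if node != caller:
            self.remote_accesses += 1  # would be a network put in a real cluster
        self.parts[node][index - node * self.chunk] = value


def jacobi_sweep(src, dst):
    """One Jacobi sweep: each owner updates its own points from src into dst."""
    for i in range(src.size):
        me = src.owner(i)
        if i == 0 or i == src.size - 1:
            new = src.read(i, me)  # boundary values stay fixed
        else:
            # Neighbours in another partition are counted as remote reads.
            new = 0.5 * (src.read(i - 1, me) + src.read(i + 1, me))
        dst.write(i, new, me)


def solve(size=8, num_nodes=2, sweeps=500):
    """Iterate towards the steady state with end values 0 and 1."""
    a = PartitionedArray(size, num_nodes)
    b = PartitionedArray(size, num_nodes)
    a.write(size - 1, 1.0, a.owner(size - 1))
    for _ in range(sweeps):
        jacobi_sweep(a, b)
        a, b = b, a  # double buffering: swap source and destination
    values = [a.read(i, a.owner(i)) for i in range(size)]
    return values, a.remote_accesses + b.remote_accesses


if __name__ == "__main__":
    values, remote = solve()
    print([round(v, 4) for v in values], "remote accesses:", remote)
```

In this sketch every point is addressed through the same global index, yet only the reads that cross the partition boundary are counted as remote. That is the distinction a PGAS model exposes to the application, and it is independent of whether a given partition lives on a processor or an FPGA.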