ﻻ يوجد ملخص باللغة العربية
Some important problems, such as semantic graph analysis, require large-scale irregular applications composed of many coordinating tasks that operate on a shared data set so big it has to be stored on many physical devices. In these cases, it may be more efficient to dynamically choose where code runs as the applications progresses. Many programming environments provide task migration or remote function calls, but they have sharp trade-offs between flexible composition, portability, performance, and code complexity. We developed Two-Chains, a high performance framework inspired by active message communication semantics. We use the GNU Binutils, the ELF binary format, and the RDMA network protocol to provide ultra-low granularity distributed function composition at runtime in user space at HPC performance levels using C libraries. Our framework allows the direct injection of function binaries and data to a remote machine cache using the RDMA network. It interoperates seamlessly with existing C libraries using standard dynamic linking and load symbol resolution. We analyze function delivery and execution on cache stashing-enabled hardware and show that stashing decreases latency, increases message rates, and improves noise tolerance. This demonstrates one way this method is suited to increasingly network-oriented hardware architectures.
Graphics Processing Units (GPUs) have been widely used to accelerate artificial intelligence, physics simulation, medical imaging, and information visualization applications. To improve GPU performance, GPU hardware designers need to identify perform
Advances in sequencing techniques have led to exponential growth in biological data, demanding the development of large-scale bioinformatics experiments. Because these experiments are computation- and data-intensive, they require high-performance com
To harness the potential of advanced computing technologies, efficient (real time) analysis of large amounts of data is as essential as are front-line simulations. In order to optimise this process, experts need to be supported by appropriate tools t
Ad-hoc networks, a promising trend in wireless technology, fail to work properly in a global setting. In most cases, self-organization and cost-free local communication cannot compensate the need for being connected, gathering urgent information just
Programming current supercomputers efficiently is a challenging task. Multiple levels of parallelism on the core, on the compute node, and between nodes need to be exploited to make full use of the system. Heterogeneous hardware architectures with ac