ﻻ يوجد ملخص باللغة العربية
Modern high-end machines feature multiple processor packages, each of which contains multiple independent cores and integrated memory controllers connected directly to dedicated physical RAM. These packages are connected via a shared bus, creating a system with a heterogeneous memory hierarchy. Since this shared bus has less bandwidth than the sum of the links to memory, aggregate memory bandwidth is higher when parallel threads all access memory local to their processor package than when they access memory attached to a remote package. This bandwidth limitation has traditionally limited the scalability of modern functional language implementations, which seldom scale well past 8 cores, even on small benchmarks. This work presents a garbage collector integrated with our strict, parallel functional language implementation, Manticore, and shows that it scales effectively on both a 48-core AMD Opteron machine and a 32-core Intel Xeon machine.
We present a general technique for garbage collecting o
In this paper we are interested in bounding the number of instructions taken to process transactions. The main result is a multiversion transactional system that supports constant delay (extra instructions beyond running in isolation) for all read-on
Based on the two observations that diverse applications perform better on different multicore architectures, and that different phases of an application may have vastly different resource requirements, Pal et al. proposed a novel reconfigurable hardw
We demonstrate that general-purpose memory allocation involving many threads on many cores can be done with high performance, multicore scalability, and low memory consumption. For this purpose, we have designed and implemented scalloc, a concurrent
The functional correspondence is a manual derivation technique transforming higher-order evaluators into the semantically equivalent abstract machines. The transformation consists of two well-known program transformations: translation to continuation