This paper describes the development of PALS, an implementation of Prolog capable of efficiently exploiting or-parallelism on distributed-memory platforms, specifically Beowulf clusters. PALS makes use of a novel technique called incremental stack-splitting. The proposed technique builds on the stack-splitting approach, previously described by the authors and experimentally validated on shared-memory systems, which is in turn an evolution of the stack-copying method used in a variety of parallel logic and constraint systems (e.g., MUSE, YAP, and Penny). PALS is the first distributed or-parallel implementation of Prolog based on the stack-splitting method. The results presented confirm the superiority of this method as a simple yet effective technique for transitioning from shared-memory to distributed-memory systems. PALS extends stack-splitting by combining it with incremental copying; the paper provides a description of the implementation of PALS, including details of how distributed scheduling is handled. We also investigate methodologies to effectively support order-sensitive predicates (e.g., side effects) in the context of the stack-splitting scheme. Experimental results obtained from running PALS on both shared-memory and Beowulf systems are presented and analyzed.
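To make the contrast between stack-copying and stack-splitting concrete, the following is a minimal conceptual sketch, not PALS code: choice points are modeled as Python dictionaries with a list of unexplored alternatives, and the split policy shown (alternating alternatives within each choice point) is only one illustrative possibility. The key point it illustrates is that after splitting, the two workers own disjoint sets of alternatives and need no further synchronization on these choice points, which is what makes the scheme attractive on distributed-memory platforms.

```python
import copy

def stack_copying_share(busy_stack):
    """Stack-copying: the idle worker receives a copy of the busy worker's
    stack; both copies keep all alternatives, so the workers must later
    coordinate (via a shared scheduler) to avoid exploring the same branch."""
    idle_stack = copy.deepcopy(busy_stack)
    return busy_stack, idle_stack

def stack_splitting_share(busy_stack):
    """Stack-splitting: after copying, the unexplored alternatives of each
    choice point are partitioned between the two workers (here: even indices
    stay with the original worker, odd indices go to the new worker)."""
    idle_stack = copy.deepcopy(busy_stack)
    for cp_busy, cp_idle in zip(busy_stack, idle_stack):
        cp_busy['alternatives'] = cp_busy['alternatives'][0::2]
        cp_idle['alternatives'] = cp_idle['alternatives'][1::2]
    return busy_stack, idle_stack

if __name__ == '__main__':
    # Two choice points, each with pending alternative clauses (illustrative names).
    stack = [
        {'goal': 'p(X)', 'alternatives': ['p/1 clause 1', 'p/1 clause 2', 'p/1 clause 3']},
        {'goal': 'q(Y)', 'alternatives': ['q/1 clause 1', 'q/1 clause 2']},
    ]
    w1, w2 = stack_splitting_share(copy.deepcopy(stack))
    print('worker 1:', w1)   # disjoint alternatives: no scheduler contention
    print('worker 2:', w2)
```

Incremental copying, as combined with stack-splitting in PALS, further reduces the cost of the sharing operation by transferring only the portion of the stacks that the idle worker does not already hold; the sketch above omits that optimization for brevity.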