ترغب بنشر مسار تعليمي؟ اضغط هنا

Specifying and Testing GPU Workgroup Progress Models

320   0   0.0 ( 0 )
 نشر من قبل Lucas Salvador
 تاريخ النشر 2021
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

As GPU availability has increased and programming support has matured, a wider variety of applications are being ported to these platforms. Many parallel applications contain fine-grained synchronization idioms; as such, their correct execution depends on a degree of relative forward progress between threads (or thread groups). Unfortunately, many GPU programming specifications say almost nothing about relative forward progress guarantees between workgroups. Although prior work has proposed a spectrum of plausible progress models for GPUs, cross-vendor specifications have yet to commit to any model. This work is a collection of tools experimental data to aid specification designers when considering forward progress guarantees in programming frameworks. As a foundation, we formalize a small parallel programming language that captures the essence of fine-grained synchronization. We then provide a means of formally specifying a progress model, and develop a termination oracle that decides whether a given program is guaranteed to eventually terminate with respect to a given progress model. Next, we formalize a constraint for concurrent programs that require relative forward progress to terminate. Using this constraint, we synthesize a large set of 483 progress litmus tests. Combined with the termination oracle, this allows us to determine the expected status of each litmus test -- i.e. whether it is guaranteed eventual termination -- under various progress models. We present a large experimental campaign running the litmus tests across 8 GPUs from 5 different vendors. Our results highlight that GPUs have significantly different termination behaviors under our test suite. Most notably, we find that Apple and ARM GPUs do not support the linear occupancy-bound model, an intuitive progress model defined by prior work and hypothesized to describe the workgroup schedulers of existing GPUs.

قيم البحث

اقرأ أيضاً

The growing adoption of smart contracts on blockchains poses new security risks that can lead to significant monetary loss, while existing approaches either provide no (or partial) security guarantees for smart contracts or require huge proof effort. To address this challenge, we present SciviK, a versatile framework for specifying and verifying industrial-grade smart contracts. SciviKs versatile approach extends previous efforts with three key contributions: (i) an expressive annotation system enabling built-in directives for vulnerability pattern checking, neural-based loop invariant inference, and the verification of rich properties of real-world smart contracts (ii) a fine-grained model for the Ethereum Virtual Machine (EVM) that provides low-level execution semantics, (iii) an IR-level verification framework integrating both SMT solvers and the Coq proof assistant. We use SciviK to specify and verify security properties for 12 benchmark contracts and a real-world Decentralized Finance (DeFi) smart contract. Among all 158 specified security properties (in six types), 151 properties can be automatically verified within 2 seconds, five properties can be automatically verified after moderate modifications, and two properties are manually proved with around 200 lines of Coq code.
With the rapid development of scientific computation, more and more researchers and developers are committed to implementing various workloads/operations on different devices. Among all these devices, NVIDIA GPU is the most popular choice due to its comprehensive documentation and excellent development tools. As a result, there are abundant resources for hand-writing high-performance CUDA codes. However, CUDA is mainly supported by only commercial products and there has been no support for open-source H/W platforms. RISC-V is the most popular choice for hardware ISA, thanks to its elegant design and open-source license. In this project, we aim to utilize these existing CUDA codes with RISC-V devices. More specifically, we design and implement a pipeline that can execute CUDA source code on an RISC-V GPU architecture. We have succeeded in executing CUDA kernels with several important features, like multi-thread and atomic instructions, on an RISC-V GPU architecture.
This paper proposes a new logic RoCTL* to model robustness in concurrent systems. RoCTL* extends CTL* with the addition of Obligatory and Robustly operators, which quantify over failure-free paths and paths with one more failure respectively. We pres ent a number of examples of problems to which RoCTL* can be applied. The core result of this paper is to show that RoCTL* is expressively equivalent to CTL* but is non-elementarily more succinct. We present a translation from RoCTL* into CTL* that preserves truth but may result in non-elementary growth in the length of the translated formula as each nested Robustly operator may result in an extra exponential blowup. However, we show that this translation is optimal in the sense that any equivalence preserving translation will require an extra exponential growth per nested Robustly. We also compare RoCTL* to Quantified CTL* (QCTL*) and hybrid logics.
Software testing is one of the most popular validation techniques in the software industry. Surprisingly, we can only find a few approaches to testing in the context of logic programming. In this paper, we introduce a systematic approach for dynamic testing that combines both concrete and symbolic execution. Our approach is fully automatic and guarantees full path coverage when it terminates. We prove some basic properties of our technique and illustrate its practical usefulness through a prototype implementation.
The Weak Equivalence Principle (WEP) and the local Lorentz invariance (LLI) are two major assumptions of General Relativity (GR). The MICROSCOPE mission, currently operating, will perform a test of the WEP with a precision of $10^{-15}$. The data wil l also be analysed at SYRTE for the purposes of a LLI test realised in collaboration with J. Tasson (Carleton College, Minnesota) and Q. Bailey (Embry-Riddle Aeronautical University, Arizona). This study will be performed in a general framework, called the Standard Model Extension (SME), describing Lorentz violations that could appear at Planck scale ($10^{19}$ GeV). The SME allows us to derive a Lorentz violating observable designed for the MICROSCOPE experiment and to search for possible deviations from LLI in the differential acceleration of the test masses.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا