ترغب بنشر مسار تعليمي؟ اضغط هنا

Efficient SIMD RNG for Varying-Parameter Streams: C++ Class BatchRNG

203   0   0.0 ( 0 )
 نشر من قبل Alireza Mahani
 تاريخ النشر 2014
والبحث باللغة English




اسأل ChatGPT حول البحث

Single-Instruction, Multiple-Data (SIMD) random number generators (RNGs) take advantage of vector units to offer significant performance gain over non-vectorized libraries, but they often rely on batch production of deviates from distributions with fixed parameters. In many statistical applications such as Gibbs sampling, parameters of sampled distributions change from one iteration to the next, requiring that random deviates be generated one-at-a-time. This situation can render vectorized RNGs inefficient, and even inferior to their scalar counterparts. The C++ class BatchRNG uses buffers of base distributions such uniform, Gaussian and exponential to take advantage of vector units while allowing for sequences of deviates to be generated with varying parameters. These small buffers are consumed and replenished as needed during a program execution. Performance tests using Intel Vector Statistical Library (VSL) on various probability distributions illustrates the effectiveness of the proposed batching strategy.

قيم البحث

اقرأ أيضاً

The SIMT execution model is commonly used for general GPU development. CUDA and OpenCL developers write scalar code that is implicitly parallelized by compiler and hardware. On Intel GPUs, however, this abstraction has profound performance implicatio ns as the underlying ISA is SIMD and important hardware capabilities cannot be fully utilized. To close this performance gap we introduce C-For-Metal (CM), an explicit SIMD programming framework designed to deliver close-to-the-metal performance on Intel GPUs. The CM programming language and its vector/matrix types provide an intuitive interface to exploit the underlying hardware features, allowing fine-grained register management, SIMD size control and cross-lane data sharing. Experimental results show that CM applications from different domains outperform the best-known SIMT-based OpenCL implementations, achieving up to 2.7x speedup on the latest Intel GPU.
We introduce hyppo, a unified library for performing multivariate hypothesis testing, including independence, two-sample, and k-sample testing. While many multivariate independence tests have R packages available, the interfaces are inconsistent and most are not available in Python. hyppo includes many state of the art multivariate testing procedures. The package is easy-to-use and is flexible enough to enable future extensions. The documentation and all releases are available at https://hyppo.neurodata.io.
Random variables and their distributions are a central part in many areas of statistical methods. The Distributions.jl package provides Julia users and developers tools for working with probability distributions, leveraging Julia features for their i ntuitive and flexible manipulation, while remaining highly efficient through zero-cost abstractions.
In this paper we describe the research and development activities in the Center for Efficient Exascale Discretization within the US Exascale Computing Project, targeting state-of-the-art high-order finite-element algorithms for high-order application s on GPU-accelerated platforms. We discuss the GPU developments in several components of the CEED software stack, including the libCEED, MAGMA, MFEM, libParanumal, and Nek projects. We report performance and capability improvements in several CEED-enabled applications on both NVIDIA and AMD GPU systems.
Motivated by the evidence that real-world networks evolve in time and may exhibit non-stationary features, we propose an extension of the Exponential Random Graph Models (ERGMs) accommodating the time variation of network parameters. Within the ERGM framework, a network realization is sampled from a static probability distribution defined parametrically in terms of network statistics. Inspired by the fast growing literature on Dynamic Conditional Score-driven models, in our approach, each parameter evolves according to an updating rule driven by the score of the conditional distribution. We demonstrate the flexibility of the score-driven ERGMs, both as data generating processes and as filters, and we prove the advantages of the dynamic version with respect to the static one. Our method captures dynamical network dependencies, that emerge from the data, and allows for a test discriminating between static or time-varying parameters. Finally, we corroborate our findings with the application to networks from real financial and political systems exhibiting non stationary dynamics.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا