ﻻ يوجد ملخص باللغة العربية
In several fields such as statistics, machine learning, and bioinformatics, categorical variables are frequently represented as one-hot encoded vectors. For example, given 8 distinct values, we map each value to a byte where only a single bit has been set. We are motivated to quickly compute statistics over such encodings. Given a stream of k-bit words, we seek to compute k distinct sums corresponding to bit values at indexes 0, 1, 2, ..., k-1. If the k-bit words are one-hot encoded then the sums correspond to a frequency histogram. This multiple-sum problem is a generalization of the population-count problem where we seek the sum of all bit values. Accordingly, we refer to the multiple-sum problem as a positional population-count. Using SIMD (Single Instruction, Multiple Data) instructions from recent Intel processors, we describe algorithms for computing the 16-bit position population count using less than half of a CPU cycle per 16-bit word. Our best approach uses up to 400 times fewer instructions and is up to 50 times faster than baseline code using only regular (non-SIMD) instructions, for sufficiently large inputs.
Counting the number of ones in a binary stream is a common operation in database, information-retrieval, cryptographic and machine-learning applications. Most processors have dedicated instructions to count the number of ones in a word (e.g., popcnt
A visibility algorithm maps time series into complex networks following a simple criterion. The resulting visibility graph has recently proven to be a powerful tool for time series analysis. However its straightforward computation is time-consuming a
Network reliability is an important metric to evaluate the connectivity among given vertices in uncertain graphs. Since the network reliability problem is known as #P-complete, existing studies have used approximation techniques. In this paper, we pr
This work describes the SIMD vectorization of the force calculation of the Lennard-Jones potential with Intel AVX2 and AVX-512 instruction sets. Since the force-calculation kernel of the molecular dynamics method involves indirect access to memory, t
We present a new algorithm for computing balanced flows in equality networks arising in market equilibrium computations. The current best time bound for computing balanced flows in such networks requires $O(n)$ maxflow computations, where $n$ is the