RooFit and RooStats, the toolkits for statistical modelling in ROOT, are used in most searches and measurements at the Large Hadron Collider as well as at $B$ factories. Larger datasets to be collected at, for example, the High-Luminosity LHC will enable measurements with higher precision, but will require faster data processing to keep fitting times stable. In this work, a simplification of RooFit's interfaces and a redesign of its internal dataflow are presented. The interfaces are being extended to look and feel more STL-like, which makes them more accessible from both C++ and Python and improves interoperability and ease of use while maintaining compatibility with old code. The redesigned dataflow improves cache locality and data loading, and can be used to process batches of data with vectorised SIMD computations. This reduces the time for computing unbinned likelihoods by a factor of 4 to 16, which will make it possible to fit the larger datasets of the future in the same time as today's fits, or faster.
RooFit and RooStats, the toolkits for statistical modelling in ROOT, are used in most searches and measurements at the Large Hadron Collider. The data to be collected in Run 3 will enable measurements with higher precision and models with larger complexity, but will also require faster data processing. In this work, first results on modernising RooFit's collections, restructuring its data flow and vectorising likelihood fits in RooFit are discussed. Without such improvements, fitting times would increase significantly with the large datasets expected in Run 3; with them, the LHC experiments will be able to process larger datasets without compromising on model complexity.
We present algorithms for real and complex dot product and matrix multiplication in arbitrary-precision floating-point and ball arithmetic. A low-overhead dot product is implemented on the level of GMP limb arrays; it is about twice as fast as previous code in MPFR and Arb at precision up to several hundred bits. Up to 128 bits, it is 3-4 times as fast, costing 20-30 cycles per term for floating-point evaluation and 40-50 cycles per term for balls. We handle large matrix multiplications even more efficiently via blocks of scaled integer matrices. The new methods are implemented in Arb and significantly speed up polynomial operations and linear algebra.
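To illustrate what a dot product in ball arithmetic has to guarantee, the following is a minimal sketch in Python. It is not the limb-level implementation described above and does not use the Arb API. It assumes each ball is a hypothetical `(midpoint, radius)` pair and uses exact rational arithmetic, so no rounding-error term is needed; a fixed-precision implementation would also add a bound for rounding the midpoint to the radius.

```python
from fractions import Fraction


def ball_dot(xs, ys):
    """Enclose sum(x_i * y_i) for balls given as (midpoint, radius) pairs.

    Illustrative only: midpoints and radii are converted to exact rationals,
    so the only error to track is the one propagated from the input radii.
    """
    mid = Fraction(0)
    rad = Fraction(0)
    for (xm, xr), (ym, yr) in zip(xs, ys):
        xm, xr, ym, yr = (Fraction(v) for v in (xm, xr, ym, yr))
        # Midpoint of the product ball.
        mid += xm * ym
        # |x*y - xm*ym| <= |xm|*yr + |ym|*xr + xr*yr for x, y inside the balls.
        rad += abs(xm) * yr + abs(ym) * xr + xr * yr
    return mid, rad
```

For example, `ball_dot([(1, Fraction(1, 10)), (2, 0)], [(3, 0), (4, Fraction(1, 2))])` returns a ball that contains every value the exact dot product can take when each input ranges over its own ball.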
On common processors, integer multiplication is many times faster than integer division. Dividing a numerator n by a divisor d is mathematically equivalent to multiplying by the inverse of the divisor (n / d = n × 1/d). If the divisor is known in advance, or if repeated integer divisions will be performed with the same divisor, it can be beneficial to substitute a less costly multiplication for an expensive division. Currently, the remainder of the division by a constant is computed from the quotient by a multiplication and a subtraction. If only the remainder is needed and the quotient is not, this may be suboptimal. We present a generally applicable algorithm to compute the remainder more directly. Specifically, we use the fractional portion of the product of the numerator and the inverse of the divisor. On this basis, we also present a new, simpler divisibility algorithm to detect nonzero remainders. We also derive new tight bounds on the precision required when representing the inverse of the divisor. Furthermore, we present simple C implementations that beat the optimized code produced by state-of-the-art C compilers on recent x64 processors (e.g., Intel Skylake and AMD Ryzen), sometimes by more than 25%. On all tested platforms, including 64-bit ARM and POWER8, our divisibility-test functions are faster than state-of-the-art Granlund-Montgomery divisibility-test functions, sometimes by more than 50%.
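The idea of reading the remainder off the fractional part of the product can be sketched as follows. This is a Python emulation of fixed-width arithmetic, not the C code referred to above. The choice of unsigned 32-bit operands with a 64-bit fractional part, and the names `compute_magic`, `fastmod_u32` and `is_divisible_u32`, are assumptions made for the illustration.

```python
MASK64 = (1 << 64) - 1  # emulates truncation to 64 bits


def compute_magic(d):
    """Precompute ceil(2**64 / d), a scaled approximation of 1/d."""
    if not 0 < d < (1 << 32):
        raise ValueError("d must be a nonzero unsigned 32-bit integer")
    return MASK64 // d + 1


def fastmod_u32(n, magic, d):
    """Return n % d for an unsigned 32-bit n without dividing."""
    # The low 64 bits of magic * n hold the fractional part of n / d,
    # scaled by 2**64.
    lowbits = (magic * n) & MASK64
    # Multiplying that fraction by d and keeping the integer part
    # recovers the remainder.
    return (lowbits * d) >> 64


def is_divisible_u32(n, magic):
    """Return True when d divides n, using a multiply and a compare."""
    # The scaled fractional part is small exactly when the remainder is zero.
    return ((magic * n) & MASK64) <= magic - 1


magic_3 = compute_magic(3)
remainder = fastmod_u32(7, magic_3, 3)    # same value as 7 % 3
divisible = is_divisible_u32(6, magic_3)  # same value as 6 % 3 == 0
```

The constant is computed once per divisor and then reused, which is where the saving comes from when many remainders share the same divisor. In this sketch the constant is kept as an unbounded Python integer, so the case d = 1 needs no special handling; a fixed-width implementation has to account for the wrap-around there.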
Hydra is a header-only, templated and C++11-compliant framework designed to perform the typical bottleneck calculations found in common HEP data analyses on massively parallel platforms. The framework is implemented on top of the C++11 Standard Library and a variadic version of the Thrust library, and is designed to run on Linux systems using OpenMP-, CUDA- and TBB-enabled devices. This contribution summarizes the main features of Hydra. A basic description of the overall design, functionality and user interface is provided, along with some code examples and performance measurements.
There is a nationwide drive to get more girls into physics and coding, and some educators believe gaming could be a way to get girls interested in coding and STEM topics. This project, sponsored by NSF, aims to create a QCD game that will raise public interest in QCD, especially among K-12 girls, and increase interest in coding among girls. Through the immersive framework of interactive gameplay, this QCD phone game will allow the public to peek into the QCD research world. The game design falls into the Match 3 genre, which typically attracts a higher ratio of female players. The game is implemented initially as a phone app, and the gameplay requires learning simple QCD rules to progress. Because players are willing to engage with the rules of an entertaining game, they can easily learn a few principles of physics along the way. The game is now available to download from the Google Play store (https://play.google.com/store/apps/details?id=com.gellab.quantum3) and the Apple App Store (https://itunes.apple.com/gb/app/quantum-3/id1406630529). We formed a development team of MSU undergraduate students to make the game and provided them with a QCD curriculum. The game will be tested at MSU outreach activities, as well as among local K-12 girls through school activities, and feedback will be used to improve the design. The final game can be easily distributed through various app stores, and its impact will be measured through a follow-up survey. If this new direction succeeds in attracting more girls to coding and physics, more games should be developed to engage more girls in STEM.