We have developed a flexible radio-frequency readout system suitable for a variety of superconducting detectors commonly used in millimeter and submillimeter astrophysics, including Kinetic Inductance Detectors (KIDs), Thermal KID bolometers (TKIDs), and Quantum Capacitance Detectors (QCDs). Our system avoids custom FPGA-based readouts and instead uses commercially available software radio hardware for ADC/DAC and a GPU to handle real-time signal processing. Because the system is written in standard C++/CUDA, a wide range of algorithms can be implemented quickly, which makes it suitable for reading out many other cryogenic detectors and for testing different and possibly more effective data acquisition schemes.
We present the newly developed code GAMER (GPU-accelerated Adaptive MEsh Refinement code), which adopts a novel approach to improve the performance of adaptive mesh refinement (AMR) astrophysical simulations by a large factor using the graphics processing unit (GPU). The AMR implementation is based on a hierarchy of grid patches with an oct-tree data structure. We adopt a three-dimensional relaxing TVD scheme for the hydrodynamic solver and a multi-level relaxation scheme for the Poisson solver. Both solvers have been implemented on the GPU, so that hundreds of patches can be advanced in parallel. The computational overhead associated with data transfer between CPU and GPU is carefully reduced by using asynchronous memory copies, and the time spent computing the ghost-zone values for each patch is hidden by overlapping it with the GPU computations. We demonstrate the accuracy of the code by performing several standard test problems in astrophysics. GAMER is a parallel code that can be run on a multi-GPU cluster system. We measure the performance of the code by performing purely baryonic cosmological simulations on different hardware configurations, in which detailed timing analyses compare the computations with and without GPU acceleration. Maximum speed-up factors of 12.19 and 10.47 are demonstrated using 1 GPU with 4096^3 effective resolution and 16 GPUs with 8192^3 effective resolution, respectively.
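The relaxation idea behind such a Poisson solver can be illustrated with a minimal sketch. The code below is not GAMER's solver: it is a single-level, one-dimensional Gauss-Seidel relaxation written for illustration, and the function name `relax_poisson_1d`, its parameters, and the fixed sweep count are all assumptions. It solves d²φ/dx² = rhs on a uniform grid with the boundary values held fixed.

```python
def relax_poisson_1d(rhs, dx, phi_left=0.0, phi_right=0.0, sweeps=2000):
    """Gauss-Seidel relaxation for d^2(phi)/dx^2 = rhs on a uniform 1D grid.

    Hypothetical single-level illustration; rhs holds one value per grid
    point, and the two end points are treated as fixed boundary values.
    """
    n = len(rhs)
    phi = [0.0] * n
    phi[0], phi[-1] = phi_left, phi_right
    for _ in range(sweeps):
        # Update each interior point from its neighbours, reusing the
        # values already refreshed earlier in the same sweep.
        for i in range(1, n - 1):
            phi[i] = 0.5 * (phi[i - 1] + phi[i + 1] - dx * dx * rhs[i])
    return phi
```

For example, with `rhs = [-2.0] * 17` and `dx = 1.0 / 16`, the relaxed solution approaches φ(x) = x(1 - x). A multi-level scheme would apply sweeps like this on each refinement level, and a GPU version would update many patches concurrently.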
Rotation measure (RM) synthesis is a widely used polarization processing algorithm for reconstructing polarized structures along the line of sight. Performing RM synthesis on large datasets produced by telescopes like the Low Frequency Array (LOFAR) can be computationally intensive, as the computational cost is proportional to the product of the number of input frequency channels, the number of output Faraday depth values to be evaluated, and the number of lines of sight present in the data cube. The required computational cost is likely to grow with the planned large-area sky surveys with telescopes like LOFAR, the Murchison Widefield Array (MWA), and eventually the Square Kilometre Array (SKA). Massively parallel General-Purpose Graphics Processing Units (GPGPUs) can be used to execute some of the computationally intensive astronomical image processing algorithms, including RM synthesis. In this paper, we present a GPU-accelerated code, called cuFFS (CUDA-accelerated Fast Faraday Synthesis), to perform Faraday rotation measure synthesis. Compared to a fast single-threaded and vectorized CPU implementation, our code achieves an increase in speed of up to two orders of magnitude, depending on the structure and format of the data cubes. During testing, we noticed that disk I/O is a major bottleneck when using the Flexible Image Transport System (FITS) data format. To reduce the time spent on disk I/O, our code supports the faster HDFITS format in addition to the standard FITS format. The code is written in C, with GPU acceleration achieved using NVIDIA's CUDA parallel computing platform. The code is available at https://github.com/sarrvesh/cuFFS.
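The cost scaling described above can be seen in a minimal sketch of direct RM synthesis for one line of sight. This is not the cuFFS implementation: the function name `rm_synthesis`, the uniform channel weights, and the choice of the mean λ² as reference are assumptions made for illustration. The two nested loops run over Faraday depths and frequency channels, and a data cube repeats them once per line of sight.

```python
import cmath

C = 299792458.0  # speed of light in m/s


def rm_synthesis(freqs_hz, q, u, phis):
    """Direct RM synthesis for one line of sight, uniform weights.

    Hypothetical illustration: evaluates
        F(phi) = (1/N) * sum_k P_k * exp(-2i * phi * (lam2_k - lam2_0))
    with P_k = Q_k + i U_k and lam2_0 the mean wavelength squared.
    """
    lam2 = [(C / f) ** 2 for f in freqs_hz]
    lam2_0 = sum(lam2) / len(lam2)
    pol = [complex(qk, uk) for qk, uk in zip(q, u)]
    spectrum = []
    for phi in phis:                     # loop over output Faraday depths
        acc = 0j
        for l2, p in zip(lam2, pol):     # loop over input channels
            acc += p * cmath.exp(-2j * phi * (l2 - lam2_0))
        spectrum.append(acc / len(lam2))
    return spectrum
```

Each (line of sight, Faraday depth) pair is independent of the others, which is what makes the problem a natural fit for one GPU thread per output value.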
While the radio detection of cosmic rays has advanced to a standard method in astroparticle physics, the radio detection of neutrinos is only now coming into full bloom. The successes of pilot arrays have to be accompanied by the development of modern and flexible software tools to ensure rapid progress in reconstruction algorithms and data processing. We present NuRadioReco as such a modern Python-based data analysis tool. It includes a suitable data structure, a database implementation of a time-dependent detector, modern browser-based data visualization tools, and fully separated analysis modules. We describe the framework and examples, as well as new reconstruction algorithms to obtain the full three-dimensional electric field from distributed antennas, which is needed for high-precision energy reconstruction of particle showers.
We present GAMER-2, a GPU-accelerated adaptive mesh refinement (AMR) code for astrophysics. It provides a rich set of features, including adaptive time-stepping, several hydrodynamic schemes, magnetohydrodynamics, self-gravity, particles, star formation, chemistry and radiative processes with GRACKLE, data analysis with yt, and a memory pool for efficient object allocation. GAMER-2 is fully bitwise reproducible. For performance optimization, it adopts hybrid OpenMP/MPI/GPU parallelization and overlaps CPU computation, GPU computation, and CPU-GPU communication. Load balancing is achieved using a Hilbert space-filling curve on a level-by-level basis without the need to duplicate the entire AMR hierarchy on each MPI process. To provide convincing demonstrations of the accuracy and performance of GAMER-2, we directly compare with Enzo on isolated disk galaxy simulations and with FLASH on galaxy cluster merger simulations. We show that the physical results obtained by the different codes are in very good agreement, and that GAMER-2 outperforms Enzo and FLASH by nearly one and two orders of magnitude, respectively, on the Blue Waters supercomputer using 1 to 256 nodes. More importantly, GAMER-2 exhibits similar or even better parallel scalability compared to the other two codes. We also demonstrate good weak and strong scaling using up to 4096 GPUs and 65,536 CPU cores, and achieve a uniform resolution as high as $10{,}240^3$ cells. Furthermore, GAMER-2 can be adopted as an AMR+GPU framework and has been extensively used for wave dark matter ($\psi$DM) simulations. GAMER-2 is open source (available at https://github.com/gamer-project/gamer) and new contributions are welcome.
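Load balancing along a Hilbert curve can be sketched as follows. This is a two-dimensional simplification written for illustration, not GAMER-2's three-dimensional implementation; the function names `hilbert_index` and `partition_by_load`, and the representation of a patch as an `(x, y, weight)` tuple, are assumptions. Patches on one level are sorted by their position along the curve and the sorted list is cut into contiguous segments of roughly equal total weight, one per MPI rank.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along a Hilbert curve on an n x n grid.

    n must be a power of two. This is the standard iterative 2D mapping.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect the quadrant so the sub-curve has a standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def partition_by_load(patches, n, n_ranks):
    """Split (x, y, weight) patches into n_ranks contiguous curve segments."""
    ordered = sorted(patches, key=lambda p: hilbert_index(n, p[0], p[1]))
    total = sum(p[2] for p in ordered)
    ranks = [[] for _ in range(n_ranks)]
    acc = 0.0
    for p in ordered:
        # Assign by the midpoint of the patch's interval in cumulative weight.
        r = min(int((acc + 0.5 * p[2]) * n_ranks / total), n_ranks - 1)
        ranks[r].append(p)
        acc += p[2]
    return ranks
```

Because consecutive points on the curve are spatial neighbours, each rank receives a compact region, which tends to keep inter-rank ghost-zone exchange small. Doing this per level means each rank only needs the keys of the patches on that level rather than the whole hierarchy.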
Microwave Kinetic Inductance Detectors (MKIDs) have great potential for large, very sensitive detector arrays for use in, for example, sub-mm imaging. Being intrinsically read out in the frequency domain, they are particularly suited for frequency domain multiplexing, allowing $\sim$1000s of devices to be read out with one pair of coaxial cables. However, this moves the complexity of the detector from the cryogenics to the warm electronics. We present here the concept and experimental demonstration of the use of Fast Fourier Transform Spectrometer (FFTS) readout, showing no deterioration of the noise performance compared to low-noise analog mixing while allowing high multiplexing ratios.
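The principle of frequency domain multiplexing with a Fourier-transform readout can be sketched in a few lines. This is a toy model, not the FFTS hardware described above: the function names `tone_comb` and `read_tones` are assumptions, each probe tone is assumed to sit exactly on an integer frequency bin below the Nyquist bin, and only the bins of interest are evaluated with a direct DFT, whereas a real spectrometer computes all bins with an FFT.

```python
import cmath
import math


def tone_comb(n_samples, tones):
    """Real time stream made of several tones sharing one line.

    tones maps an integer frequency bin to a complex amplitude whose
    magnitude and phase stand in for one resonator's transmission.
    """
    return [
        sum(abs(a) * math.cos(2 * math.pi * k * n / n_samples + cmath.phase(a))
            for k, a in tones.items())
        for n in range(n_samples)
    ]


def read_tones(samples, bins):
    """Recover the complex amplitude of each tone from the shared stream."""
    n = len(samples)
    out = {}
    for k in bins:
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        out[k] = 2 * acc / n  # factor 2: a real tone splits between bins +k and -k
    return out
```

Because tones on distinct integer bins are orthogonal over the transform length, each detector's amplitude and phase are recovered independently even though all tones travel on the same cable.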