Memory-efficient w-projection with the fast Gauss transform

Published by Keith Bannister

Publication date: 2013

Research field: Physics

Language: English

Authors: Keith W. Bannister, Tim J. Cornwell

Instrumentation and Methods for Astrophysics

Abstract

We describe a method for performing w-projection using the fast Gauss transform of Strain (1991). We derive the theoretical performance, and simulate the actual performance for a range of w for a canonical array. While our implementation is dominated by overheads, we argue that this approach could form the basis of higher-performing algorithms, with particular application to the Square Kilometer Array.
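The abstract gives no implementation details, so the following is only background: a minimal Python sketch of the discrete Gauss transform, G(y) = sum_i q_i exp(-(y - x_i)^2 / delta), which fast Gauss transform algorithms accelerate. It compares the direct sum with a single truncated Hermite expansion about one centre, the basic building block of such algorithms. The function names, the 1D setting, the real-valued `delta`, and the single expansion centre are all assumptions made for illustration; this is not the authors' code and does not reproduce the w-projection kernel or the variable-scale scheme of Strain (1991).

```python
import math


def gauss_transform_direct(sources, weights, targets, delta):
    """Direct O(N*M) evaluation of G(y) = sum_i q_i * exp(-(y - x_i)^2 / delta)."""
    return [
        sum(q * math.exp(-((y - x) ** 2) / delta) for x, q in zip(sources, weights))
        for y in targets
    ]


def hermite_functions(t, order):
    """Return h_n(t) = H_n(t) * exp(-t^2) for n = 0 .. order-1.

    Uses the physicists' Hermite recurrence H_{n+1} = 2 t H_n - 2 n H_{n-1}.
    """
    h = [math.exp(-t * t)]
    if order > 1:
        h.append(2.0 * t * h[0])
    for n in range(1, order - 1):
        h.append(2.0 * t * h[n] - 2.0 * n * h[n - 1])
    return h


def gauss_transform_hermite(sources, weights, targets, delta, center, order):
    """Approximate G(y) with one Hermite expansion about `center`.

    exp(-(y - x)^2 / delta) = sum_n (s^n / n!) * h_n(t),
    with s = (x - center) / sqrt(delta) and t = (y - center) / sqrt(delta).
    The source moments are computed once (O(N * order)) and reused for
    every target (O(M * order)).
    """
    scale = math.sqrt(delta)
    moments = [
        sum(q * ((x - center) / scale) ** n for x, q in zip(sources, weights))
        / math.factorial(n)
        for n in range(order)
    ]
    result = []
    for y in targets:
        h = hermite_functions((y - center) / scale, order)
        result.append(sum(a * hn for a, hn in zip(moments, h)))
    return result


# Illustrative data: sources clustered near the expansion centre.
sources = [-0.4, -0.1, 0.0, 0.25, 0.5]
weights = [1.0, -0.5, 2.0, 0.75, 1.5]
targets = [-3.0 + 0.5 * k for k in range(13)]

exact = gauss_transform_direct(sources, weights, targets, delta=1.0)
approx = gauss_transform_hermite(sources, weights, targets, delta=1.0,
                                 center=0.0, order=20)
max_error = max(abs(a - b) for a, b in zip(exact, approx))
```

The expansion converges quickly only while the sources lie within roughly sqrt(delta) of the centre, which is why practical fast Gauss transforms divide the domain into boxes and keep one set of moments per box.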

242 - Guy Nir , Eran O. Ofek , Sagi Ben-Ami 2021

A relatively unexplored phase space of transients and stellar variability is that of second and sub-second time-scales. We describe a new optical observatory operating in the Negev desert in Israel, with a 55 cm aperture, a field of view of 2.6x2.6 deg (~7 deg^2) equipped with a high frame rate, low read noise, CMOS camera. The system can observe at a frame rate of up to 90 Hz (full frame), while nominally observations are conducted at 10-25 Hz. The data, generated at a rate of over 6 Gbits/s at a frame rate of 25 Hz, are analyzed in real time. The observatory is fully robotic and capable of autonomously collecting data on a few thousand stars in each field each night. We present the system overview, performance metrics, science objectives, and some first results, e.g., the detection of a high rate of glints from geosynchronous satellites, reported in Nir et al. 2020.

Instrumentation and Methods for Astrophysics

OCTAD-S: Digital Fast Fourier Transform Spectrometers by FPGA

69 - Kazumasa Iwai , Yûki Kubo , Hiromitsu Ishibashi 2017

We have developed a digital fast Fourier transform (FFT) spectrometer made of an analog-to-digital converter (ADC) and a field-programmable gate array (FPGA). The base instrument has independent ADC and FPGA modules, which allow us to implement different spectrometers in a relatively easy manner. Two types of spectrometers have been instrumented, one with 4.096 GS/s sampling speed and 2048 frequency channels and the other with 2.048 GS/s sampling speed and 32768 frequency channels. The signal processing in these spectrometers has no dead time and the accumulated spectra are recorded in external media every 8 ms. A direct sampling spectroscopy up to 8 GHz is achieved by a microwave track-and-hold circuit, which can reduce the analog receiver in front of the spectrometer. Highly stable spectroscopy with a wide dynamic range was demonstrated in a series of laboratory experiments and test observations of solar radio bursts.

Instrumentation and Methods for Astrophysics, Solar and Stellar Astrophysics

Revisiting the spread spectrum effect in radio interferometric imaging: a sparse variant of the w-projection algorithm

773 - L. Wolz , J. D. McEwen , F. B. Abdalla 2013

Next-generation radio interferometric telescopes will exhibit non-coplanar baseline configurations and wide fields of view, inducing a w-modulation of the sky image, which in turn induces the spread spectrum effect. We revisit the impact of this effect on imaging quality and study a new algorithmic strategy to deal with the associated operator in the image reconstruction process. In previous studies it has been shown that image recovery in the framework of compressed sensing is improved due to the spread spectrum effect, where the w-modulation can act to increase the incoherence between measurement and sparsifying signal representations. For the purpose of computational efficiency, idealised experiments were performed, where only a constant baseline component w in the pointing direction of the telescope was considered. We extend this analysis to the more realistic setting where the w-component varies for each visibility measurement. Firstly, incorporating varying w-components into imaging algorithms is a computationally demanding task. We propose a variant of the w-projection algorithm for this purpose, which is based on an adaptive sparsification procedure, and incorporate it in compressed sensing imaging methods. This sparse matrix variant of the w-projection algorithm is generic and adapts to the support of each kernel. Consequently, it is applicable for all types of direction-dependent effects. Secondly, we show that for w-modulation with varying w-components, reconstruction quality is significantly improved compared to the setting where there is no w-modulation (i.e. w=0), reaching levels comparable to the quality of a constant, maximal w-component. This finding confirms that one may seek to optimise future telescope configurations to promote large w-components, thus enhancing the spread spectrum effect and consequently the fidelity of image reconstruction.

Instrumentation and Methods for Astrophysics

Crystalline: Fast and Memory Efficient Wait-Free Reclamation

131 - Ruslan Nikolaev , Binoy Ravindran 2021

Historically, memory management based on lock-free reference counting was very inefficient, especially for read-dominated workloads. Thus, approaches such as epoch-based reclamation (EBR), hazard pointers (HP), or a combination thereof have received significant attention. EBR exhibits excellent performance but is blocking due to potentially unbounded memory usage. In contrast, HP are non-blocking and achieve good memory efficiency but are much slower. Moreover, HP are only lock-free in the general case. Recently, several new memory reclamation approaches such as WFE and Hyaline have been proposed. WFE achieves wait-freedom, but is less memory efficient and suffers from suboptimal performance in oversubscribed scenarios; Hyaline achieves higher performance and memory efficiency, but lacks wait-freedom. We present a new wait-free memory reclamation scheme, Crystalline, that simultaneously addresses the challenges of high performance, high memory efficiency, and wait-freedom. Crystalline guarantees complete wait-freedom even when threads are dynamically recycled, asynchronously reclaims memory in the sense that any thread can reclaim memory retired by any other thread, and ensures (an almost) balanced reclamation workload across all threads. The latter two properties result in Crystalline's high performance and high memory efficiency. Simultaneously ensuring all three properties requires overcoming unique challenges which we discuss in the paper. Crystalline's implementation relies on specialized instructions which are widely available on commodity hardware such as x86-64 or ARM64. Our experimental evaluations show that Crystalline exhibits outstanding scalability and memory efficiency, and achieves higher throughput than typical reclamation schemes such as EBR as the number of threads grows.

Distributed, Parallel, and Cluster Computing

The Fast Kernel Transform

132 - John Paul Ryan , Sebastian Ament , Carla P. Gomes 2021

Kernel methods are a highly effective and widely used collection of modern machine learning algorithms. A fundamental limitation of virtually all such methods is that computations involving the kernel matrix naively scale quadratically (e.g., constructing the kernel matrix and matrix-vector multiplication) or cubically (solving linear systems) with the size of the data set $N$. We propose the Fast Kernel Transform (FKT), a general algorithm to compute matrix-vector multiplications (MVMs) for datasets in moderate dimensions with quasilinear complexity. Typically, analytically grounded fast multiplication methods require specialized development for specific kernels. In contrast, our scheme is based on auto-differentiation and automated symbolic computations that leverage the analytical structure of the underlying kernel. This allows the FKT to be easily applied to a broad class of kernels, including Gaussian, Matérn, and Rational Quadratic covariance functions and physically motivated Green's functions, including those of the Laplace and Helmholtz equations. Furthermore, the FKT maintains a high, quantifiable, and controllable level of accuracy -- properties that many acceleration methods lack. We illustrate the efficacy and versatility of the FKT by providing timing and accuracy benchmarks and by applying it to scale the stochastic neighborhood embedding (t-SNE) and Gaussian processes to large real-world data sets.

Machine Learning, Numerical Analysis
