
A robust principal component analysis for outlier identification in messy microcalorimeter data

Published by Joseph Fowler
Publication date: 2019
Research field: Physics
Language: English





A principal component analysis (PCA) of clean microcalorimeter pulse records can be a first step beyond statistically optimal linear filtering of pulses towards a fully non-linear analysis. For PCA to be practical on spectrometers with hundreds of sensors, an automated identification of clean pulses is required. Robust forms of PCA are the subject of active research in machine learning. We examine a version known as coherence pursuit that is simple, fast, and well matched to the automatic identification of outlier records, as needed for microcalorimeter pulse analysis.
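As a rough illustration of the idea, the following is a minimal sketch of coherence pursuit in its commonly described form: normalize each record, compute the pairwise inner products, and score each record by the norm of its inner products with all other records. Records that resemble many others score high; outliers score low. The function names `coherence_scores` and `robust_basis`, the row-per-record layout, and the choice of the 2-norm are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def coherence_scores(records):
    """Score each record by its mutual coherence with all other records.

    records: array of shape (n_records, n_samples), one pulse record per row
    (layout assumed for this sketch). Low scores suggest outliers.
    """
    X = np.asarray(records, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X = X / np.where(norms > 0, norms, 1.0)   # unit-normalize each record
    G = X @ X.T                                # pairwise inner products
    np.fill_diagonal(G, 0.0)                   # ignore self-coherence
    return np.linalg.norm(G, axis=1)           # 2-norm of each row of G

def robust_basis(records, n_keep, rank):
    """Build a rank-`rank` basis from the `n_keep` most coherent records.

    Hypothetical helper: n_keep and rank are user choices, not values
    prescribed by the source.
    """
    X = np.asarray(records, dtype=float)
    scores = coherence_scores(X)
    keep = np.argsort(scores)[::-1][:n_keep]   # highest coherence first
    _, _, Vt = np.linalg.svd(X[keep], full_matrices=False)
    return Vt[:rank], scores
```

The cost is dominated by one matrix product and one small SVD, which is consistent with the abstract's description of the method as simple and fast. How many records to keep and what rank to use would have to be tuned for real detector data.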




Read also

CUPID-Mo is a cryogenic detector array designed to search for neutrinoless double-beta decay ($0\nu\beta\beta$) of $^{100}$Mo. It uses 20 scintillating $^{100}$Mo-enriched Li$_2$MoO$_4$ bolometers instrumented with Ge light detectors to perform active suppression of $\alpha$ backgrounds, drastically reducing the expected background in the $0\nu\beta\beta$ signal region. As a result, pileup events and small detector instabilities that mimic normal signals become non-negligible potential backgrounds. These types of events can in principle be eliminated based on their signal shapes, which are different from those of regular bolometric pulses. We show that a purely data-driven principal component analysis based approach is able to filter out these anomalous events, without the aid of detector response simulations.
B. Alpert, E. Ferri, D. Bennett, 2015
For experiments with high arrival rates, reliable identification of nearly-coincident events can be crucial. For calorimetric measurements to directly measure the neutrino mass such as HOLMES, unidentified pulse pile-ups are expected to be a leading source of experimental error. Although Wiener filtering can be used to recognize pile-up, it suffers errors due to pulse-shape variation from detector nonlinearity, readout dependence on sub-sample arrival times, and stability issues from the ill-posed deconvolution problem of recovering Dirac delta-functions from smooth data. Due to these factors, we have developed a processing method that exploits singular value decomposition to (1) separate single-pulse records from piled-up records in training data and (2) construct a model of single-pulse records that accounts for varying pulse shape with amplitude, arrival time, and baseline level, suitable for detecting nearly-coincident events. We show that the resulting processing advances can reduce the required performance specifications of the detectors and readout system or, equivalently, enable larger sensor arrays and better constraints on the neutrino mass.
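The first of the two steps, separating single-pulse records from piled-up ones, can be illustrated with a much simpler stand-in: project each record onto a low-rank basis obtained by SVD and flag records with a large residual. This is only a sketch of the general idea, not the procedure of the abstract above, which also models pulse-shape variation with amplitude, arrival time, and baseline level. The helper names `toy_pulse` and `svd_residuals`, the two-exponential pulse shape, and all parameter values are hypothetical.

```python
import numpy as np

def toy_pulse(n, t0, amp, rise=5.0, fall=40.0):
    """Hypothetical pulse: difference of two exponentials starting at sample t0."""
    t = np.clip(np.arange(n) - t0, 0, None)    # zero before the arrival time
    return amp * (np.exp(-t / fall) - np.exp(-t / rise))

def svd_residuals(train, records, rank):
    """Residual norm of each record after projection onto the top-`rank`
    right singular vectors of the training records (one record per row)."""
    train = np.asarray(train, dtype=float)
    records = np.asarray(records, dtype=float)
    _, _, Vt = np.linalg.svd(train, full_matrices=False)
    basis = Vt[:rank]                          # orthonormal rows
    fitted = records @ basis.T @ basis         # projection onto the subspace
    return np.linalg.norm(records - fitted, axis=1)

# Usage: single pulses of varying amplitude versus one piled-up record.
singles = np.array([toy_pulse(256, 50, a) for a in np.linspace(0.5, 2.0, 20)])
pileup = toy_pulse(256, 50, 1.0) + toy_pulse(256, 120, 0.5)
residuals = svd_residuals(singles, np.vstack([singles, pileup]), rank=1)
```

In this toy setup the single pulses differ only in amplitude, so a rank-1 basis fits them essentially exactly and the piled-up record stands out by its residual. Real records with nonlinearity and arrival-time jitter would need a higher rank and a threshold chosen from the data.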
D. J. Mikkelson, 2002
The overall design of the Integrated Spectral Analysis Workbench (ISAW), being developed at Argonne, provides for an extensible, highly interactive, collaborating set of viewers for neutron scattering data. Large arbitrary collections of spectra from multiple detectors can be viewed as an image, a scrolled list of individual graphs, or using a 3D representation of the instrument showing the detector positions. Data from an area detector can be displayed using a contour or intensity map as well as an interactive table. Selected spectra can be displayed in tables or on a conventional graph. A unique characteristic of these viewers is their interactivity and coordination. The position pointed at by the user in one viewer is sent to other viewers of the same DataSet so they can track the position and display relevant information. Specialized viewers for single crystal neutron diffractometers are being developed. A proof-of-concept viewer that directly displays the 3D reciprocal lattice from a complete series of runs on a single crystal diffractometer has been implemented.
Functional principal component analysis is essential in functional data analysis, but the inferences will become unconvincing when some non-Gaussian characteristics occur, such as heavy tail and skewness. The focus of this paper is to develop a robust functional principal component analysis methodology in dealing with non-Gaussian longitudinal data, for which sparsity and irregularity along with non-negligible measurement errors must be considered. We introduce a Kendall's $\tau$ function whose particular properties make it a nice proxy for the covariance function in the eigenequation when handling non-Gaussian cases. Moreover, the estimation procedure is presented and the asymptotic theory is also established. We further demonstrate the superiority and robustness of our method through simulation studies and apply the method to the longitudinal CD4 cell count data in an AIDS study.
Fan et al. [$\mathit{Annals}$ $\mathit{of}$ $\mathit{Statistics}$ $\textbf{47}$(6) (2019) 3009-3031] proposed a distributed principal component analysis (PCA) algorithm to significantly reduce the communication cost between multiple servers. In this paper, we robustify their distributed algorithm by using robust covariance matrix estimators respectively proposed by Minsker [$\mathit{Annals}$ $\mathit{of}$ $\mathit{Statistics}$ $\textbf{46}$(6A) (2018) 2871-2903] and Ke et al. [$\mathit{Statistical}$ $\mathit{Science}$ $\textbf{34}$(3) (2019) 454-471] instead of the sample covariance matrix. We extend the deviation bound of robust covariance estimators with bounded fourth moments to the case of the heavy-tailed distribution under only bounded $2+\epsilon$ moments assumption. The theoretical results show that after the shrinkage or truncation treatment for the sample covariance matrix, the statistical error rate of the final estimator produced by the robust algorithm is the same as that of sub-Gaussian tails, when $\epsilon \geq 2$ and the sampling distribution is symmetric innovation. While $2 > \epsilon > 0$, the rate with respect to the sample size of each server is slower than that of the bounded fourth moment assumption. Extensive numerical results support the theoretical analysis, and indicate that the algorithm performs better than the original distributed algorithm and is robust to heavy-tailed data and outliers.