A robust principal component analysis for outlier identification in messy microcalorimeter data

Added by Joseph Fowler

Publication date 2019

Field Physics

Language English

Authors J.W. Fowler, B. K. Alpert, Y.-I. Joe

Data Analysis Statistics and Probability Instrumentation and Detectors

A principal component analysis (PCA) of clean microcalorimeter pulse records can be a first step beyond statistically optimal linear filtering of pulses towards a fully non-linear analysis. For PCA to be practical on spectrometers with hundreds of sensors, an automated identification of clean pulses is required. Robust forms of PCA are the subject of active research in machine learning. We examine a version known as coherence pursuit that is simple, fast, and well matched to the automatic identification of outlier records, as needed for microcalorimeter pulse analysis.
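The core idea of coherence pursuit can be sketched in a few lines. The sketch below is a minimal illustration, not the authors' implementation: it unit-normalizes each record, forms the matrix of pairwise inner products, and scores each record by the 2-norm of its row with the diagonal removed. Records that lie near the common low-dimensional subspace of clean pulses are mutually coherent and score high, while outliers score low. The function names `coherence_scores` and `clean_subspace`, the synthetic pulse shapes, the noise level, and the choice of the 2-norm are all assumptions made for this example.

```python
import numpy as np

def coherence_scores(records):
    """Return one mutual-coherence score per record (row); low scores suggest outliers."""
    x = np.asarray(records, dtype=float)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    x = x / np.where(norms > 0, norms, 1.0)   # unit-normalize each record
    gram = x @ x.T                            # pairwise inner products
    np.fill_diagonal(gram, 0.0)               # ignore each record's self-coherence
    return np.linalg.norm(gram, axis=1)       # 2-norm of each row of the Gram matrix

def clean_subspace(records, scores, n_keep, rank):
    """Basis (rank x n_samples) spanned by the n_keep most coherent records."""
    best = np.argsort(scores)[-n_keep:]
    data = np.asarray(records, dtype=float)[best]
    _, _, vt = np.linalg.svd(data, full_matrices=False)
    return vt[:rank]

# Synthetic demonstration: every shape and noise level here is invented.
rng = np.random.default_rng(0)
t = np.arange(100)
shape_a = np.exp(-t / 30.0) - np.exp(-t / 5.0)
shape_b = np.exp(-t / 40.0) - np.exp(-t / 5.0)
n_clean, n_outliers = 200, 20
amps = rng.uniform(0.5, 1.5, size=(n_clean, 1))
mix = rng.uniform(0.0, 1.0, size=(n_clean, 1))
clean = amps * ((1 - mix) * shape_a + mix * shape_b)
clean += 0.01 * rng.standard_normal(clean.shape)
outliers = rng.standard_normal((n_outliers, t.size))   # unstructured junk records
records = np.vstack([clean, outliers])

scores = coherence_scores(records)
flagged = np.argsort(scores)[:n_outliers]               # least coherent records
basis = clean_subspace(records, scores, n_keep=100, rank=2)
```

In this toy setting the outliers are unstructured noise, which is the easy case. Real outlier records such as pile-up share structure with clean pulses, so the number of records to keep and the rank of the basis would need to be chosen with the data in hand.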

Pulse Shape Discrimination in CUPID-Mo using Principal Component Analysis

104 - R. Huang, E. Armengaud, C. Augier 2020

CUPID-Mo is a cryogenic detector array designed to search for neutrinoless double-beta decay ($0\nu\beta\beta$) of $^{100}$Mo. It uses 20 scintillating $^{100}$Mo-enriched Li$_2$MoO$_4$ bolometers instrumented with Ge light detectors to perform active suppression of $\alpha$ backgrounds, drastically reducing the expected background in the $0\nu\beta\beta$ signal region. As a result, pileup events and small detector instabilities that mimic normal signals become non-negligible potential backgrounds. These types of events can in principle be eliminated based on their signal shapes, which are different from those of regular bolometric pulses. We show that a purely data-driven principal component analysis based approach is able to filter out these anomalous events, without the aid of detector response simulations.
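One common way to turn PCA into a pulse-shape cut is to measure how poorly a record is reconstructed from the leading principal components of a training set of normal pulses. The sketch below illustrates that general idea only and is not the CUPID-Mo analysis: the helper names `fit_pca`, `reconstruction_error`, and `pulse`, the two-exponential pulse shape, the noise level, and the choice of two components are all assumptions.

```python
import numpy as np

def fit_pca(train, n_components):
    """Mean record and leading principal components of the training records."""
    mean = train.mean(axis=0)
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:n_components]

def reconstruction_error(records, mean, components):
    """Norm of the part of each record that the PCA model cannot reproduce."""
    centered = records - mean
    projected = (centered @ components.T) @ components
    return np.linalg.norm(centered - projected, axis=1)

def pulse(t, t0, amp):
    """Hypothetical two-exponential pulse starting at time t0."""
    dt = np.clip(t - t0, 0.0, None)
    return amp * (np.exp(-dt / 30.0) - np.exp(-dt / 5.0))

rng = np.random.default_rng(1)
t = np.arange(128.0)
noise = 0.01

# Training set and a separate set of normal pulses, all arriving at t0 = 20.
train = np.array([pulse(t, 20.0, a) for a in rng.uniform(0.5, 1.5, 300)])
train += noise * rng.standard_normal(train.shape)
normal = np.array([pulse(t, 20.0, a) for a in rng.uniform(0.5, 1.5, 50)])
normal += noise * rng.standard_normal(normal.shape)

# A piled-up record: a second, smaller pulse arrives at t0 = 50.
pileup = pulse(t, 20.0, 1.0) + pulse(t, 50.0, 0.5)
pileup += noise * rng.standard_normal(t.size)

mean, components = fit_pca(train, n_components=2)
normal_err = reconstruction_error(normal, mean, components)
pileup_err = reconstruction_error(pileup[None, :], mean, components)[0]
is_anomalous = pileup_err > 3.0 * normal_err.max()   # illustrative cut, not tuned
```

The factor of 3 in the cut is arbitrary. In practice the error of normal pulses usually depends on pulse energy, so a cut would likely need to be normalized as a function of amplitude.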

Data Analysis Statistics and Probability Nuclear Experiment Instrumentation and Detectors

Algorithms for Identification of Nearly-Coincident Events in Calorimetric Sensors

114 - B. Alpert, E. Ferri, D. Bennett 2015

For experiments with high arrival rates, reliable identification of nearly-coincident events can be crucial. For calorimetric measurements to directly measure the neutrino mass such as HOLMES, unidentified pulse pile-ups are expected to be a leading source of experimental error. Although Wiener filtering can be used to recognize pile-up, it suffers errors due to pulse-shape variation from detector nonlinearity, readout dependence on sub-sample arrival times, and stability issues from the ill-posed deconvolution problem of recovering Dirac delta-functions from smooth data. Due to these factors, we have developed a processing method that exploits singular value decomposition to (1) separate single-pulse records from piled-up records in training data and (2) construct a model of single-pulse records that accounts for varying pulse shape with amplitude, arrival time, and baseline level, suitable for detecting nearly-coincident events. We show that the resulting processing advances can reduce the required performance specifications of the detectors and readout system or, equivalently, enable larger sensor arrays and better constraints on the neutrino mass.

Data Analysis Statistics and Probability Instrumentation and Detectors

Coordinated, Interactive Data Visualization for Neutron Scattering Data

93 - D. J. Mikkelson 2002

The overall design of the Integrated Spectral Analysis Workbench (ISAW), being developed at Argonne, provides for an extensible, highly interactive, collaborating set of viewers for neutron scattering data. Large arbitrary collections of spectra from multiple detectors can be viewed as an image, a scrolled list of individual graphs, or using a 3D representation of the instrument showing the detector positions. Data from an area detector can be displayed using a contour or intensity map as well as an interactive table. Selected spectra can be displayed in tables or on a conventional graph. A unique characteristic of these viewers is their interactivity and coordination. The position pointed at by the user in one viewer is sent to other viewers of the same DataSet so they can track the position and display relevant information. Specialized viewers for single crystal neutron diffractometers are being developed. A proof-of-concept viewer that directly displays the 3D reciprocal lattice from a complete series of runs on a single crystal diffractometer has been implemented.

Data Analysis Statistics and Probability Instrumentation and Detectors

Robust Functional Principal Component Analysis for Non-Gaussian Longitudinal Data

89 - Rou Zhong, Shishi Liu, Jingxiao Zhang 2021

Functional principal component analysis is essential in functional data analysis, but the inferences will become unconvincing when some non-Gaussian characteristics occur, such as heavy tail and skewness. The focus of this paper is to develop a robust functional principal component analysis methodology in dealing with non-Gaussian longitudinal data, for which sparsity and irregularity along with non-negligible measurement errors must be considered. We introduce a Kendall's $\tau$ function whose particular properties make it a nice proxy for the covariance function in the eigenequation when handling non-Gaussian cases. Moreover, the estimation procedure is presented and the asymptotic theory is also established. We further demonstrate the superiority and robustness of our method through simulation studies and apply the method to the longitudinal CD4 cell count data in an AIDS study.
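The reason a Kendall's $\tau$ quantity can stand in for the covariance is easiest to see in the finite-dimensional case. The sketch below computes the multivariate (spatial) Kendall's $\tau$ matrix, the average outer product of unit-normalized pairwise differences, and checks that its leading eigenvector recovers the dominant direction of heavy-tailed data. This is only a multivariate analogue for illustration, not the functional estimator for sparse longitudinal data developed in the paper. The function name `kendall_tau_matrix` and the simulated multivariate-$t$ data are assumptions.

```python
import numpy as np

def kendall_tau_matrix(x):
    """Average outer product of unit-normalized pairwise differences of the rows of x."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(x.shape[0], k=1)      # all unordered pairs of rows
    diff = x[i] - x[j]
    norms = np.linalg.norm(diff, axis=1, keepdims=True)
    keep = norms[:, 0] > 0                       # drop identical pairs
    unit = diff[keep] / norms[keep]
    return unit.T @ unit / unit.shape[0]

# Heavy-tailed test data: multivariate t with 1.5 degrees of freedom
# (infinite variance), with the largest scale along the first coordinate.
rng = np.random.default_rng(2)
n, d = 300, 5
scales = np.array([3.0, 1.0, 1.0, 1.0, 1.0])
gauss = rng.standard_normal((n, d)) * scales
heavy = gauss / np.sqrt(rng.chisquare(1.5, size=(n, 1)) / 1.5)

tau = kendall_tau_matrix(heavy)
eigvals, eigvecs = np.linalg.eigh(tau)           # eigenvalues in ascending order
leading = eigvecs[:, -1]                         # direction of largest eigenvalue
```

Because every pairwise difference is rescaled to unit length, a few extreme samples cannot dominate the estimate, which is the property that makes this kind of matrix attractive when the sample covariance is unreliable.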

Methodology

Robust covariance estimation for distributed principal component analysis

106 - Kangqiang Li, Han Bao, Songqiao Tang 2020

Fan et al. [$\mathit{Annals}$ $\mathit{of}$ $\mathit{Statistics}$ $\textbf{47}$(6) (2019) 3009-3031] proposed a distributed principal component analysis (PCA) algorithm to significantly reduce the communication cost between multiple servers. In this paper, we robustify their distributed algorithm by using robust covariance matrix estimators respectively proposed by Minsker [$\mathit{Annals}$ $\mathit{of}$ $\mathit{Statistics}$ $\textbf{46}$(6A) (2018) 2871-2903] and Ke et al. [$\mathit{Statistical}$ $\mathit{Science}$ $\textbf{34}$(3) (2019) 454-471] instead of the sample covariance matrix. We extend the deviation bound of robust covariance estimators with bounded fourth moments to the case of the heavy-tailed distribution under only bounded $2+\epsilon$ moments assumption. The theoretical results show that after the shrinkage or truncation treatment for the sample covariance matrix, the statistical error rate of the final estimator produced by the robust algorithm is the same as that of sub-Gaussian tails, when $\epsilon \geq 2$ and the sampling distribution is symmetric innovation. While $2 > \epsilon > 0$, the rate with respect to the sample size of each server is slower than that of the bounded fourth moment assumption. Extensive numerical results support the theoretical analysis, and indicate that the algorithm performs better than the original distributed algorithm and is robust to heavy-tailed data and outliers.

Statistics Theory
