Functional principal components analysis via penalized rank one approximation

Published by: Jianhua Z. Huang

Publication date: 2008

Research field: Mathematical statistics

Language: English

Authors: Jianhua Z. Huang, Haipeng Shen, Andreas Buja

Statistics Theory

Two existing approaches to functional principal components analysis (FPCA) are due to Rice and Silverman (1991) and Silverman (1996), both based on maximizing variance but introducing penalization in different ways. In this article we propose an alternative approach to FPCA using penalized rank one approximation to the data matrix. Our contributions are four-fold: (1) by considering invariance under scale transformation of the measurements, the new formulation sheds light on how regularization should be performed for FPCA and suggests an efficient power algorithm for computation; (2) it naturally incorporates spline smoothing of discretized functional data; (3) the connection with smoothing splines also facilitates construction of cross-validation or generalized cross-validation criteria for smoothing parameter selection that allows efficient computation; (4) different smoothing parameters are permitted for different FPCs. The methodology is illustrated with a real data example and a simulation.
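The idea of a penalized rank one approximation computed by a power algorithm can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: it assumes the criterion $\|X - uv^T\|_F^2 + \lambda \|u\|^2 v^T \Omega v$, a second-difference roughness penalty $\Omega$ standing in for the spline penalty, and a row-centered data matrix `X` (curves in rows, grid points in columns). The function names and the parameter `lam` are hypothetical.

```python
import numpy as np

def second_difference_penalty(m):
    """Roughness penalty Omega = D^T D, where D is the (m-2) x m second-difference matrix.
    Assumption: a discrete stand-in for a spline roughness penalty."""
    D = np.diff(np.eye(m), n=2, axis=0)
    return D.T @ D

def penalized_rank_one(X, lam, n_iter=200, tol=1e-10):
    """Alternating (power-type) iterations for
        min_{u, v} ||X - u v^T||_F^2 + lam * ||u||^2 * v^T Omega v.
    X is assumed to be centered already. Returns (scores, unit-norm component)."""
    n, m = X.shape
    A = np.eye(m) + lam * second_difference_penalty(m)
    # Start from the leading right singular vector of X.
    v = np.linalg.svd(X, full_matrices=False)[2][0]
    for _ in range(n_iter):
        # Given v, the optimal u is proportional to X v; fix its norm to 1.
        u = X @ v
        u /= np.linalg.norm(u)
        # Given unit-norm u, the optimal v solves (I + lam * Omega) v = X^T u.
        v_new = np.linalg.solve(A, X.T @ u)
        converged = np.linalg.norm(v_new - v) <= tol * np.linalg.norm(v_new)
        v = v_new
        if converged:
            break
    scale = np.linalg.norm(v)
    # Move the scale onto the scores so the component has unit norm.
    return u * scale, v / scale
```

With `lam=0` the iteration reduces to the ordinary power method for the leading singular pair of `X`; larger values of `lam` trade fit for smoothness of the estimated component. Because the penalty is multiplied by $\|u\|^2$, rescaling $u$ and $v$ in opposite directions leaves the criterion unchanged, which is one way to read the scale-invariance remark in the abstract.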

354 - Xiongtao Dai, Hans-Georg Muller 2017

Functional data analysis on nonlinear manifolds has drawn recent interest. Sphere-valued functional data, which are encountered for example as movement trajectories on the surface of the earth, are an important special case. We consider an intrinsic principal component analysis for smooth Riemannian manifold-valued functional data and study its asymptotic properties. Riemannian functional principal component analysis (RFPCA) is carried out by first mapping the manifold-valued data through Riemannian logarithm maps to tangent spaces around the time-varying Fréchet mean function, and then performing a classical multivariate functional principal component analysis on the linear tangent spaces. Representations of the Riemannian manifold-valued functions and the eigenfunctions on the original manifold are then obtained with exponential maps. The tangent-space approximation through functional principal component analysis is shown to be well-behaved in terms of controlling the residual variation if the Riemannian manifold has nonnegative curvature. Specifically, we derive a central limit theorem for the mean function, as well as root-$n$ uniform convergence rates for other model components, including the covariance function, eigenfunctions, and functional principal component scores. Our applications include a novel framework for the analysis of longitudinal compositional data, achieved by mapping longitudinal compositional data to trajectories on the sphere, illustrated with longitudinal fruit fly behavior patterns. RFPCA is shown to be superior in terms of trajectory recovery in comparison to an unrestricted functional principal component analysis in applications and simulations and is also found to produce principal component scores that are better predictors for classification compared to traditional functional principal component scores.
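The logarithm and exponential maps that carry data between the manifold and a tangent space have closed forms on the unit sphere. The sketch below covers that special case only and is an illustration, not the authors' code; the function names are hypothetical, and points are assumed to be unit-norm float arrays.

```python
import numpy as np

def sphere_log(p, q):
    """Riemannian log map on the unit sphere: the tangent vector at p whose
    direction points toward q and whose length is the geodesic distance."""
    cos_theta = np.clip(np.dot(p, q), -1.0, 1.0)
    theta = np.arccos(cos_theta)
    w = q - cos_theta * p               # part of q orthogonal to p
    norm_w = np.linalg.norm(w)
    if norm_w < 1e-12:
        # q coincides with p; the antipodal case has no unique answer.
        return np.zeros_like(p)
    return theta * w / norm_w

def sphere_exp(p, v):
    """Riemannian exp map on the unit sphere: follow the geodesic that starts
    at p in tangent direction v for arc length ||v||."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return p.copy()
    return np.cos(norm_v) * p + np.sin(norm_v) * v / norm_v
```

In a tangent-space analysis of this kind, each observation would be mapped with `sphere_log` at the (time-varying) mean, an ordinary principal component analysis would be run on the resulting tangent vectors, and `sphere_exp` would map fitted values back to the sphere.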

Statistics Theory

Quantifying the Estimation Error of Principal Components

332 - Raphael Hauser, Raul Kangro, Juri Lember 2017

Principal component analysis is an important pattern recognition and dimensionality reduction tool in many applications. Principal components are computed as eigenvectors of a maximum likelihood covariance $\widehat{\Sigma}$ that approximates a population covariance $\Sigma$, and these eigenvectors are often used to extract structural information about the variables (or attributes) of the studied population. Since PCA is based on the eigendecomposition of the proxy covariance $\widehat{\Sigma}$ rather than the ground-truth $\Sigma$, it is important to understand the approximation error in each individual eigenvector as a function of the number of available samples. The recent results of Kolchinskii and Lounici yield such bounds. In the present paper we sharpen these bounds and show that eigenvectors can often be reconstructed to a required accuracy from a sample of strictly smaller size order.
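The dependence of eigenvector error on sample size can be explored numerically. The following is a small simulation sketch, not a reproduction of the paper's bounds: it assumes Gaussian data, measures error as the sine of the angle between the leading eigenvectors of $\widehat{\Sigma}$ and $\Sigma$, and uses a hypothetical function name.

```python
import numpy as np

def leading_eigvec_error(n, cov, rng):
    """Draw n Gaussian samples with covariance `cov`, form the maximum
    likelihood covariance estimate, and return sin(angle) between the
    estimated and the true leading eigenvector."""
    d = cov.shape[0]
    X = rng.multivariate_normal(np.zeros(d), cov, size=n)
    sample_cov = np.cov(X, rowvar=False, bias=True)   # divides by n, not n - 1
    # eigh returns eigenvalues in ascending order, so the last column leads.
    v_hat = np.linalg.eigh(sample_cov)[1][:, -1]
    v = np.linalg.eigh(cov)[1][:, -1]
    cos = abs(v_hat @ v)                              # sign of eigenvectors is arbitrary
    return np.sqrt(max(0.0, 1.0 - cos ** 2))

rng = np.random.default_rng(0)
cov = np.diag([5.0, 1.0, 1.0, 1.0, 1.0])   # illustrative covariance with one dominant direction
for n in (20, 200, 2000, 20000):
    print(n, leading_eigvec_error(n, cov, rng))
```

The printed errors should shrink as `n` grows; how quickly they shrink, and how small a sample suffices for a given accuracy, is the question the bounds address.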

Statistics Theory

Bayesian Functional Principal Components Analysis via Variational Message Passing

115 - Tui H. Nolan, Jeff Goldsmith, David Ruppert 2021

Functional principal components analysis is a popular tool for inference on functional data. Standard approaches rely on an eigendecomposition of a smoothed covariance surface in order to extract the orthonormal functions representing the major modes of variation. This approach can be a computationally intensive procedure, especially in the presence of large datasets with irregular observations. In this article, we develop a Bayesian approach, which aims to determine the Karhunen-Loève decomposition directly without the need to smooth and estimate a covariance surface. More specifically, we develop a variational Bayesian algorithm via message passing over a factor graph, which is more commonly referred to as variational message passing. Message passing algorithms are a powerful tool for compartmentalizing the algebra and coding required for inference in hierarchical statistical models. Recently, there has been much focus on formulating variational inference algorithms in the message passing framework because it removes the need for rederiving approximate posterior density functions if there is a change to the model. Instead, model changes are handled by changing specific computational units, known as fragments, within the factor graph. We extend the notion of variational message passing to functional principal components analysis. Indeed, this is the first article to address a functional data model via variational message passing. Our approach introduces two new fragments that are necessary for Bayesian functional principal components analysis. We present the computational details, a set of simulations for assessing accuracy and speed, and an application to United States temperature data.

Methodology

Efficient Estimation of Linear Functionals of Principal Components

128 - Vladimir Koltchinskii, Matthias Loffler, Richard Nickl 2017

We study principal component analysis (PCA) for mean zero i.i.d. Gaussian observations $X_1,\dots, X_n$ in a separable Hilbert space $\mathbb{H}$ with unknown covariance operator $\Sigma.$ The complexity of the problem is characterized by its effective rank ${\bf r}(\Sigma):= \frac{{\rm tr}(\Sigma)}{\|\Sigma\|},$ where ${\rm tr}(\Sigma)$ denotes the trace of $\Sigma$ and $\|\Sigma\|$ denotes its operator norm. We develop a method of bias reduction in the problem of estimation of linear functionals of eigenvectors of $\Sigma.$ Under the assumption that ${\bf r}(\Sigma)=o(n),$ we establish the asymptotic normality and asymptotic properties of the risk of the resulting estimators and prove matching minimax lower bounds, showing their semi-parametric optimality.
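For a finite-dimensional covariance matrix, the effective rank is the trace divided by the largest eigenvalue. The snippet below is a minimal sketch for that matrix case only (the abstract works with operators on a Hilbert space); the function name is hypothetical.

```python
import numpy as np

def effective_rank(cov):
    """Effective rank tr(cov) / ||cov||, where ||.|| is the operator (spectral)
    norm, assuming cov is a symmetric positive semi-definite matrix."""
    return np.trace(cov) / np.linalg.norm(cov, ord=2)

# Identity in 4 dimensions: trace 4, largest eigenvalue 1.
print(effective_rank(np.eye(4)))
# One dominant direction: trace 6, largest eigenvalue 4, so 6 / 4 = 1.5.
print(effective_rank(np.diag([4.0, 1.0, 1.0])))
```

The effective rank never exceeds the ordinary rank and drops toward 1 as a single eigenvalue dominates, which is why a condition such as ${\bf r}(\Sigma)=o(n)$ can hold even in very high dimension.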

Statistics Theory

Statistical inference for principal components of spiked covariance matrices

99 - Zhigang Bao, Xiucai Ding, Jingming Wang 2020

In this paper, we study the asymptotic behavior of the extreme eigenvalues and eigenvectors of the high dimensional spiked sample covariance matrices, in the supercritical case when a reliable detection of spikes is possible. In particular, we derive the joint distribution of the extreme eigenvalues and the generalized components of the associated eigenvectors, i.e., the projections of the eigenvectors onto arbitrary given direction, assuming that the dimension and sample size are comparably large. In general, the joint distribution is given in terms of linear combinations of finitely many Gaussian and Chi-square variables, with parameters depending on the projection direction and the spikes. Our assumption on the spikes is fully general. First, the strengths of spikes are only required to be slightly above the critical threshold and no upper bound on the strengths is needed. Second, multiple spikes, i.e., spikes with the same strength, are allowed. Third, no structural assumption is imposed on the spikes. Thanks to the general setting, we can then apply the results to various high dimensional statistical hypothesis testing problems involving both the eigenvalues and eigenvectors. Specifically, we propose accurate and powerful statistics to conduct hypothesis testing on the principal components. These statistics are data-dependent and adaptive to the underlying true spikes. Numerical simulations also confirm the accuracy and power of our proposed statistics and illustrate significantly better performance compared to the existing methods in the literature. In particular, our methods are accurate and powerful even when either the spikes are small or the dimension is large.

Statistics Theory
