ﻻ يوجد ملخص باللغة العربية
Sparse principal component analysis (PCA) is an important technique for dimensionality reduction of high-dimensional data. However, most existing sparse PCA algorithms are based on non-convex optimization, which provide little guarantee on the global convergence. Sparse PCA algorithms based on a convex formulation, for example the Fantope projection and selection (FPS), overcome this difficulty, but are computationally expensive. In this work we study sparse PCA based on the convex FPS formulation, and propose a new algorithm that is computationally efficient and applicable to large and high-dimensional data sets. Nonasymptotic and explicit bounds are derived for both the optimization error and the statistical accuracy, which can be used for testing and inference problems. We also extend our algorithm to online learning problems, where data are obtained in a streaming fashion. The proposed algorithm is applied to high-dimensional gene expression data for the detection of functional gene groups.
Sparse Principal Component Analysis (SPCA) is widely used in data processing and dimension reduction; it uses the lasso to produce modified principal components with sparse loadings for better interpretability. However, sparse PCA never considers an
A principal component analysis based on the generalized Gini correlation index is proposed (Gini PCA). The Gini PCA generalizes the standard PCA based on the variance. It is shown, in the Gaussian case, that the standard PCA is equivalent to the Gini
We present a novel technique for sparse principal component analysis. This method, named Eigenvectors from Eigenvalues Sparse Principal Component Analysis (EESPCA), is based on the recently detailed formula for computing normed, squared eigenvector l
Sparse principal component analysis (PCA) is a popular tool for dimensional reduction of high-dimensional data. Despite its massive popularity, there is still a lack of theoretically justifiable Bayesian sparse PCA that is computationally scalable. A
Functional binary datasets occur frequently in real practice, whereas discrete characteristics of the data can bring challenges to model estimation. In this paper, we propose a sparse logistic functional principal component analysis (SLFPCA) method t