Feature Grouping and Sparse Principal Component Analysis


Abstract in English

Sparse Principal Component Analysis (SPCA) is widely used in data processing and dimension reduction; it uses the lasso to produce modified principal components with sparse loadings for better interpretability. However, sparse PCA never considers an additional grouping structure where the loadings share similar coefficients (i.e., feature grouping), besides a special group with all coefficients being zero (i.e., feature selection). In this paper, we propose a novel method called Feature Grouping and Sparse Principal Component Analysis (FGSPCA) which allows the loadings to belong to disjoint homogeneous groups, with sparsity as a special case. The proposed FGSPCA is a subspace learning method designed to simultaneously perform grouping pursuit and feature selection, by imposing a non-convex regularization with naturally adjustable sparsity and grouping effect. To solve the resulting non-convex optimization problem, we propose an alternating algorithm that incorporates the difference-of-convex programming, augmented Lagrange and coordinate descent methods. Additionally, the experimental results on real data sets show that the proposed FGSPCA benefits from the grouping effect compared with methods without grouping effect.

Download