Eigenvectors from Eigenvalues Sparse Principal Component Analysis (EESPCA)


Abstract in English

We present a novel technique for sparse principal component analysis. This method, named Eigenvectors from Eigenvalues Sparse Principal Component Analysis (EESPCA), is based on the recently detailed formula for computing normed, squared eigenvector loadings of a Hermitian matrix from the eigenvalues of the full matrix and associated sub-matrices. Relative to the state-of-the-art LASSO-based sparse PCA method of Witten, Tibshirani and Hastie, the EESPCA technique offers a two-orders-of-magnitude improvement in computational speed, does not require estimation of tuning parameters, and can more accurately identify true zero principal component loadings across a range of data matrix sizes and covariance structures. Importantly, EESPCA achieves these performance benefits while maintaining a reconstruction error close to that generated by the Witten et al. approach. EESPCA is a practical and effective technique for sparse PCA with particular relevance to computationally demanding problems such as the analysis of large data matrices or statistical techniques like resampling that involve the repeated application of sparse PCA.
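
The formula the abstract refers to can be illustrated numerically. For a Hermitian n × n matrix A with distinct eigenvalues, the eigenvector-eigenvalue identity gives the squared j-th component of the i-th unit eigenvector as the product of (λ_i(A) − λ_k(M_j)) over the n − 1 eigenvalues of M_j, divided by the product of (λ_i(A) − λ_k(A)) over k ≠ i, where M_j is A with row and column j removed. The sketch below only checks this identity against a direct eigendecomposition on a small sample covariance matrix. It is not the EESPCA algorithm itself, and the function name `squared_loadings_from_eigenvalues` and the simulated data are assumptions made for illustration.

```python
import numpy as np

def squared_loadings_from_eigenvalues(A, i):
    """Squared components of the i-th unit eigenvector of a Hermitian matrix A
    (eigenvalues in ascending order), computed from eigenvalues only.
    Assumes the eigenvalues of A are distinct, so the denominator is nonzero."""
    n = A.shape[0]
    lam = np.linalg.eigvalsh(A)                  # eigenvalues of the full matrix
    denom = np.prod(np.delete(lam[i] - lam, i))  # product over k != i
    out = np.empty(n)
    for j in range(n):
        # Sub-matrix M_j: remove row j and column j from A.
        minor = np.delete(np.delete(A, j, axis=0), j, axis=1)
        mu = np.linalg.eigvalsh(minor)           # eigenvalues of the sub-matrix
        out[j] = np.prod(lam[i] - mu) / denom
    return out

# Hypothetical example data: 50 observations of 5 variables.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
S = np.cov(X, rowvar=False)                      # symmetric sample covariance

i = S.shape[0] - 1                               # index of the largest eigenvalue
approx = squared_loadings_from_eigenvalues(S, i)

# Reference values: squared loadings of the first principal component.
_, vecs = np.linalg.eigh(S)
direct = vecs[:, i] ** 2

print(np.allclose(approx, direct))
```

Computing every loading this way needs one eigenvalue decomposition per sub-matrix, so the exact identity is expensive for large matrices; the sketch is meant to show what the formula computes, not how a fast method would evaluate it.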
