No Arabic abstract
Recently years, the attempts on distilling mobile data into useful knowledge has been led to the deployment of machine learning algorithms at the network edge. Principal component analysis (PCA) is a classic technique for extracting the linear structure of a dataset, which is useful for feature extraction and data compression. In this work, we propose the deployment of distributed PCA over a multi-access channel based on the algorithm of stochastic gradient descent to learn the dominant feature space of a distributed dataset at multiple devices. Over-the-air aggregation is adopted to reduce the multi-access latency, giving the name over-the-air PCA. The novelty of this design lies in exploiting channel noise to accelerate the descent in the region around each saddle point encountered by gradient descent, thereby increasing the convergence speed of over-the-air PCA. The idea is materialized by proposing a power-control scheme which detects the type of descent region and controlling the level of channel noise accordingly. The scheme is proved to achieve a faster convergence rate than in the case without power control.
Channel-state-information (CSI) feedback methods are considered, especially for massive or very large-scale multiple-input multiple-output (MIMO) systems. To extract essential information from the CSI without redundancy that arises from the highly correlated antennas, a receiver transforms (sparsifies) a correlated CSI vector to an uncorrelated sparse CSI vector by using a Karhunen-Loeve transform (KLT) matrix that consists of the eigen vectors of covariance matrix of CSI vector and feeds back the essential components of the sparse CSI, i.e., a principal component analysis method. A transmitter then recovers the original CSI through the inverse transformation of the feedback vector. Herein, to obtain the covariance matrix at transceiver, we derive analytically the covariance matrix of spatially correlated Rayleigh fading channels based on its statistics including transmit antennas and receive antennas correlation matrices, channel variance, and channel delay profile. With the knowledge of the channel statistics, the transceiver can readily obtain the covariance matrix and KLT matrix. Compression feedback error and bit-error-rate performance of the proposed method are analyzed. Numerical results verify that the proposed method is promising, which reduces significantly the feedback overhead of the massive-MIMO systems with marginal performance degradation from full-CSI feedback (e.g., feedback amount reduction by 80%, i.e., 1/5 of original CSI, with spectral efficiency reduction by only 2%). Furthermore, we show numerically that, for a given limited feedback amount, we can find the optimal number of transmit antennas to achieve the largest spectral efficiency, which is a new design framework.
Over-the-air computation (OAC) is a promising technique to realize fast model aggregation in the uplink of federated edge learning. OAC, however, hinges on accurate channel-gain precoding and strict synchronization among the edge devices, which are challenging in practice. As such, how to design the maximum likelihood (ML) estimator in the presence of residual channel-gain mismatch and asynchronies is an open problem. To fill this gap, this paper formulates the problem of misaligned OAC for federated edge learning and puts forth a whitened matched filtering and sampling scheme to obtain oversampled, but independent, samples from the misaligned and overlapped signals. Given the whitened samples, a sum-product ML estimator and an aligned-sample estimator are devised to estimate the arithmetic sum of the transmitted symbols. In particular, the computational complexity of our sum-product ML estimator is linear in the packet length and hence is significantly lower than the conventional ML estimator. Extensive simulations on the test accuracy versus the average received energy per symbol to noise power spectral density ratio (EsN0) yield two main results: 1) In the low EsN0 regime, the aligned-sample estimator can achieve superior test accuracy provided that the phase misalignment is non-severe. In contrast, the ML estimator does not work well due to the error propagation and noise enhancement in the estimation process. 2) In the high EsN0 regime, the ML estimator attains the optimal learning performance regardless of the severity of phase misalignment. On the other hand, the aligned-sample estimator suffers from a test-accuracy loss caused by phase misalignment.
Principal component analysis is an important dimension reduction technique in machine learning. In [S. Lloyd, M. Mohseni and P. Rebentrost, Nature Physics 10, 631-633, (2014)], a quantum algorithm to implement principal component analysis on quantum computer was obtained by computing the Hamiltonian simulation of unknown density operators. The complexity is $O((log d)t^2/epsilon)$, where $d$ is the dimension, $t$ is the evolution time and $epsilon$ is the precision. We improve this result into $O((log d)t^{1+frac{1}{k}}/epsilon^{frac{1}{k}})$ for arbitrary constant integer $kgeq 1$. As a result, we show that the Hamiltonian simulation of low-rank dense Hermitian matrices can be implemented in the same time.
We show how to efficiently project a vector onto the top principal components of a matrix, without explicitly computing these components. Specifically, we introduce an iterative algorithm that provably computes the projection using few calls to any black-box routine for ridge regression. By avoiding explicit principal component analysis (PCA), our algorithm is the first with no runtime dependence on the number of top principal components. We show that it can be used to give a fast iterative method for the popular principal component regression problem, giving the first major runtime improvement over the naive method of combining PCA with regression. To achieve our results, we first observe that ridge regression can be used to obtain a smooth projection onto the top principal components. We then sharpen this approximation to true projection using a low-degree polynomial approximation to the matrix step function. Step function approximation is a topic of long-term interest in scientific computing. We extend prior theory by constructing polynomials with simple iterative structure and rigorously analyzing their behavior under limited precision.
Motivation: Although principal component analysis is frequently applied to reduce the dimensionality of matrix data, the method is sensitive to noise and bias and has difficulty with comparability and interpretation. These issues are addressed by improving the fidelity to the study design. Principal axes and the components for variables are found through the arrangement of the training data set, and the centers of data are found according to the design. By using both the axes and the center, components for an observation that belong to various studies can be separately estimated. Both of the components for variables and observations are scaled to a unit length, which enables relationships to be seen between them. Results: Analyses in transcriptome studies showed an improvement in the separation of experimental groups and in robustness to bias and noise. Unknown samples were appropriately classified on predetermined axes. These axes well reflected the study design, and this facilitated the interpretation. Together, the introduced concepts resulted in improved generality and objectivity in the analytical results, with the ability to locate hidden structures in the data.