Exponential Error Convergence in Data Classification with Optimized Random Features: Acceleration by Quantum Machine Learning

141 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Hayata Yamasaki

تاريخ النشر 2021

مجال البحث فيزياء الهندسة المعلوماتية

والبحث باللغة English

تأليف Hayata Yamasaki - Sho Sonoda

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Random features are a central technique for scalable learning algorithms based on kernel methods. A recent work has shown that an algorithm for machine learning by quantum computer, quantum machine learning (QML), can exponentially speed up sampling of optimized random features, even without imposing restrictive assumptions on sparsity and low-rankness of matrices that had limited applicability of conventional QML algorithms; this QML algorithm makes it possible to significantly reduce and provably minimize the required number of features for regression tasks. However, a major interest in the field of QML is how widely the advantages of quantum computation can be exploited, not only in the regression tasks. We here construct a QML algorithm for a classification task accelerated by the optimized random features. We prove that the QML algorithm for sampling optimized random features, combined with stochastic gradient descent (SGD), can achieve state-of-the-art exponential convergence speed of reducing classification error in a classification task under a low-noise condition; at the same time, our algorithm with optimized random features can take advantage of the significant reduction of the required number of features so as to accelerate each iteration in the SGD and evaluation of the classifier obtained from our algorithm. These results discover a promising application of QML to significant acceleration of the leading classification algorithm based on kernel methods, without ruining its applicability to a practical class of data sets and the exponential error-convergence speed.

قيم البحث

102 - Hayata Yamasaki , Sathyawageeswar Subramanian , Sho Sonoda 2020

Kernel methods augmented with random features give scalable algorithms for learning from big data. But it has been computationally hard to sample random features according to a probability distribution that is optimized for the data, so as to minimiz e the required number of features for achieving the learning to a desired accuracy. Here, we develop a quantum algorithm for sampling from this optimized distribution over features, in runtime $O(D)$ that is linear in the dimension $D$ of the input data. Our algorithm achieves an exponential speedup in $D$ compared to any known classical algorithm for this sampling task. In contrast to existing quantum machine learning algorithms, our algorithm circumvents sparsity and low-rank assumptions and thus has wide applicability. We also show that the sampled features can be combined with regression by stochastic gradient descent to achieve the learning without canceling out our exponential speedup. Our algorithm based on sampling optimized random features leads to an accelerated framework for machine learning that takes advantage of quantum computers.

فيزياء الكم التعلم الآلي التعلم الالي

Exponential Convergence Rates of Classification Errors on Learning with SGD and Random Features

188 - Shingo Yashima , Atsushi Nitanda , Taiji Suzuki 2019

Although kernel methods are widely used in many learning problems, they have poor scalability to large datasets. To address this problem, sketching and stochastic gradient methods are the most commonly used techniques to derive efficient large-scale learning algorithms. In this study, we consider solving a binary classification problem using random features and stochastic gradient descent. In recent research, an exponential convergence rate of the expected classification error under the strong low-noise condition has been shown. We extend these analyses to a random features setting, analyzing the error induced by the approximation of random features in terms of the distance between the generated hypothesis including population risk minimizers and empirical risk minimizers when using general Lipschitz loss functions, to show that an exponential convergence of the expected classification error is achieved even if random features approximation is applied. Additionally, we demonstrate that the convergence rate does not depend on the number of features and there is a significant computational benefit in using random features in classification problems because of the strong low-noise condition.

التعلم الالي التعلم الآلي

Power of data in quantum machine learning

141 - Hsin-Yuan Huang , Michael Broughton , Masoud Mohseni 2020

The use of quantum computing for machine learning is among the most exciting prospective applications of quantum technologies. However, machine learning tasks where data is provided can be considerably different than commonly studied computational ta sks. In this work, we show that some problems that are classically hard to compute can be easily predicted by classical machines learning from data. Using rigorous prediction error bounds as a foundation, we develop a methodology for assessing potential quantum advantage in learning tasks. The bounds are tight asymptotically and empirically predictive for a wide range of learning models. These constructions explain numerical results showing that with the help of data, classical machine learning models can be competitive with quantum models even if they are tailored to quantum problems. We then propose a projected quantum model that provides a simple and rigorous quantum speed-up for a learning problem in the fault-tolerant regime. For near-term implementations, we demonstrate a significant prediction advantage over some classical models on engineered data sets designed to demonstrate a maximal quantum advantage in one of the largest numerical tests for gate-based quantum machine learning to date, up to 30 qubits.

فيزياء الكم التعلم الآلي

Large-scale quantum machine learning

174 - Tobias Haug , Chris N. Self , M. S. Kim 2021

Quantum computers promise to enhance machine learning for practical applications. Quantum machine learning for real-world data has to handle extensive amounts of high-dimensional data. However, conventional methods for measuring quantum kernels are i mpractical for large datasets as they scale with the square of the dataset size. Here, we measure quantum kernels using randomized measurements to gain a quadratic speedup in computation time and quickly process large datasets. Further, we efficiently encode high-dimensional data into quantum computers with the number of features scaling linearly with the circuit depth. The encoding is characterized by the quantum Fisher information metric and is related to the radial basis function kernel. We demonstrate the advantages of our methods by classifying images with the IBM quantum computer. To achieve further speedups we distribute the quantum computational tasks between different quantum computers. Our approach is exceptionally robust to noise via a complementary error mitigation scheme. Using currently available quantum computers, the MNIST database can be processed within 220 hours instead of 10 years which opens up industrial applications of quantum machine learning.

فيزياء الكم التعلم الآلي التعلم الالي

Generalisation error in learning with random features and the hidden manifold model

117 - Federica Gerace , Bruno Loureiro , Florent Krzakala 2020

We study generalised linear regression and classification for a synthetically generated dataset encompassing different problems of interest, such as learning with random features, neural networks in the lazy training regime, and the hidden manifold m odel. We consider the high-dimensional regime and using the replica method from statistical physics, we provide a closed-form expression for the asymptotic generalisation performance in these problems, valid in both the under- and over-parametrised regimes and for a broad choice of generalised linear model loss functions. In particular, we show how to obtain analytically the so-called double descent behaviour for logistic regression with a peak at the interpolation threshold, we illustrate the superiority of orthogonal against random Gaussian projections in learning with random features, and discuss the role played by correlations in the data generated by the hidden manifold model. Beyond the interest in these particular problems, the theoretical formalism introduced in this manuscript provides a path to further extensions to more complex tasks.

نظرية الإحصاء التعلم الآلي الاحتمالات