We provide the first mathematically complete derivation of the Nyström method for low-rank approximation of indefinite kernels and propose an efficient method for computing an approximate eigendecomposition of such kernel matrices. Building on this result, we devise highly scalable methods for learning in reproducing kernel Kreĭn spaces. The devised approaches provide a principled and theoretically well-founded means of tackling large-scale learning problems with indefinite kernels. The main motivation for our work comes from problems with structured representations (e.g., graphs, strings, time series), where it is relatively easy to devise a pairwise (dis)similarity function based on intuition and/or the knowledge of domain experts. Such functions are typically not positive definite, and verifying this condition is often well beyond the expertise of practitioners. The effectiveness of the devised approaches is evaluated empirically using indefinite kernels defined on structured and vectorial data representations.
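To make the approximation concrete, here is a minimal NumPy sketch of a Nyström eigendecomposition for a symmetric but possibly indefinite kernel. The function `kernel_fn` is a hypothetical user-supplied routine returning the kernel matrix between two sets of points, and uniform landmark sampling is assumed; the paper's precise derivation and sampling scheme may differ.

import numpy as np

def nystrom_eig(kernel_fn, X, m, seed=0):
    # Sample m landmark points uniformly at random.
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=m, replace=False)
    C = kernel_fn(X, X[idx])              # n x m cross-kernel block
    W = kernel_fn(X[idx], X[idx])         # m x m landmark block (symmetric)
    # Eigendecompose the landmark block; eigenvalues may be negative for
    # an indefinite kernel, which is harmless here because we divide by
    # the eigenvalues rather than taking square roots.
    lam, U = np.linalg.eigh(W)
    keep = np.abs(lam) > 1e-10 * np.abs(lam).max()
    lam, U = lam[keep], U[:, keep]
    V = C @ (U / lam)                     # Nystrom extension of eigenvectors
    return V, lam                         # K is approximated by V diag(lam) V^T

This reproduces the standard Nyström reconstruction $K \approx C W^{+} C^{\top}$, written so that negative eigenvalues of the indefinite landmark block are retained rather than clipped.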
In this paper we solve support vector machines in reproducing kernel Banach spaces with reproducing kernels defined on nonsymmetric domains, rather than by the traditional methods in reproducing kernel Hilbert spaces. Using the orthogonality induced by semi-inner products, we obtain explicit representations of the dual (normalized-duality-mapping) elements of support vector machine solutions. In addition, we introduce the reproducing property in a generalized native space by Fourier transform techniques, so that it becomes a reproducing kernel Banach space, which can even be embedded into Sobolev spaces, and its reproducing kernel is constructed from a related positive definite function. The optimal solutions of support vector machines (regularized empirical risks) in these reproducing kernel Banach spaces are represented explicitly in terms of positive definite functions, and their finitely many coefficients can be computed by fixed-point iteration. We also give some typical examples of reproducing kernel Banach spaces induced by Matérn functions (Sobolev splines), whose support vector machine solutions are as readily computable as those of classical algorithms. Moreover, each of their reproducing bases includes information from multiple training data points. The concept of reproducing kernel Banach spaces offers a new numerical tool for solving support vector machines.
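The abstract states that the finitely many coefficients can be computed by fixed-point iteration but does not specify the update map, so the sketch below keeps it abstract: a generic damped fixed-point solver in NumPy, where `T` is a hypothetical placeholder for the RKBS-specific update derived from the positive definite (e.g., Matérn) function and the loss.

import numpy as np

def fixed_point(T, c0, damping=0.5, tol=1e-8, max_iter=1000):
    # Damped iteration c_{k+1} = (1 - damping) c_k + damping T(c_k);
    # converges when the damped map is contractive.
    c = np.asarray(c0, dtype=float)
    for _ in range(max_iter):
        c_next = (1.0 - damping) * c + damping * T(c)
        if np.linalg.norm(c_next - c) <= tol * (1.0 + np.linalg.norm(c)):
            return c_next
        c = c_next
    return c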
We present necessary and sufficient conditions for a Kramer-type sampling theorem to hold over semi-inner-product reproducing kernel Banach spaces. Under certain sampling-type hypotheses on a sequence of functions in these Banach spaces, it is necessary that the sequence be an $X_d$-Riesz basis and a sampling basis for the space. These results generalize several known sampling theorems over reproducing kernel Hilbert spaces.
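Schematically, a Kramer-type sampling theorem asserts an expansion of the following form (the precise hypotheses and mode of convergence are as in the paper; this display is only an illustration):

\[
f(t) \;=\; \sum_{n} f(t_n)\, S_n(t), \qquad f \in \mathcal{B},
\]

where $\{t_n\}$ are the sampling points, $\{S_n\}$ is the sampling basis, and $\mathcal{B}$ denotes the semi-inner-product reproducing kernel Banach space; the result above says that, under the stated hypotheses, $\{S_n\}$ must in fact be an $X_d$-Riesz basis.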
This is a tutorial and survey paper on kernels, kernel methods, and related fields. We start by reviewing the history of kernels in functional analysis and machine learning. Then, the Mercer kernel, Hilbert and Banach spaces, the Reproducing Kernel Hilbert Space (RKHS), Mercer's theorem and its proof, frequently used kernels, kernel construction from distance metrics, important classes of kernels (including bounded, integrally positive definite, universal, stationary, and characteristic kernels), kernel centering and normalization, and eigenfunctions are explained in detail. We then introduce the uses of kernels in machine learning, including kernel methods (such as kernel support vector machines), kernel learning by semidefinite programming, the Hilbert-Schmidt independence criterion, maximum mean discrepancy, kernel mean embedding, and kernel dimensionality reduction. We also cover the rank and factorization of the kernel matrix, as well as the approximation of eigenfunctions and kernels using the Nyström method. This paper can be useful for various fields of science, including machine learning, dimensionality reduction, functional analysis in mathematics, and mathematical physics in quantum mechanics.
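As a small concrete companion to the survey's discussion of kernel centering and normalization, here is a NumPy sketch of both operations; the formulas are classical and not specific to this survey.

import numpy as np

def center_kernel(K):
    # Double centering K_c = H K H with H = I - (1/n) 1 1^T,
    # i.e., subtracting the mean in feature space.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def normalize_kernel(K):
    # Cosine normalization K_n[i, j] = K[i, j] / sqrt(K[i, i] K[j, j]),
    # which maps the feature vectors onto the unit sphere.
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)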
Gaussian processes (GPs) provide a gold standard for performance in online settings, such as sample-efficient control and black box optimization, where we need to update a posterior distribution as we acquire data in a sequential fashion. However, updating a GP posterior to accommodate even a single new observation after having observed $n$ points incurs at least $O(n)$ computations in the exact setting. We show how to use structured kernel interpolation to efficiently recycle computations for constant-time $O(1)$ online updates with respect to the number of points $n$, while retaining exact inference. We demonstrate the promise of our approach in a range of online regression and classification settings, Bayesian optimization, and active sampling to reduce error in malaria incidence forecasting. Code is available at https://github.com/wjmaddox/online_gp.
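For contrast with the constant-time updates described above, the sketch below implements the exact baseline: an online GP regressor that extends the Cholesky factor of $K + \sigma^2 I$ by one row per new observation, at $O(n^2)$ cost per update. This is not the paper's structured-kernel-interpolation method (see the linked repository for that); it is the naive exact scheme whose per-update cost the paper reduces to $O(1)$.

import numpy as np

class OnlineGPBaseline:
    def __init__(self, kernel, noise=0.1):
        self.kernel, self.s2 = kernel, noise ** 2
        self.X, self.y = [], []
        self.L = np.zeros((0, 0))         # Cholesky factor of K + s2 I

    def update(self, x, y):
        # Rank-one extension of the Cholesky factor: O(n^2) work.
        kx = np.array([self.kernel(x, xi) for xi in self.X])
        kxx = self.kernel(x, x) + self.s2
        n = self.L.shape[0]
        if n == 0:
            self.L = np.array([[np.sqrt(kxx)]])
        else:
            a = np.linalg.solve(self.L, kx)       # solve L a = k_x
            d = np.sqrt(kxx - a @ a)
            L_new = np.zeros((n + 1, n + 1))
            L_new[:n, :n] = self.L
            L_new[n, :n] = a
            L_new[n, n] = d
            self.L = L_new
        self.X.append(x)
        self.y.append(y)

    def predict(self, x):
        # Standard GP posterior mean and latent variance at a test point.
        kx = np.array([self.kernel(x, xi) for xi in self.X])
        alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, np.array(self.y)))
        v = np.linalg.solve(self.L, kx)
        return kx @ alpha, self.kernel(x, x) - v @ v

# Example usage with a squared-exponential kernel:
# rbf = lambda a, b: np.exp(-0.5 * np.sum((np.asarray(a) - np.asarray(b)) ** 2))
# gp = OnlineGPBaseline(rbf, noise=0.1)
# gp.update(np.array([0.0]), 1.0); gp.update(np.array([1.0]), 2.0)
# print(gp.predict(np.array([0.5])))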
Let $G$ be a locally compact abelian group with a Haar measure, and $Y$ be a measure space. Suppose that $H$ is a reproducing kernel Hilbert space of functions on $G\times Y$, such that $H$ is naturally embedded into $L^2(G\times Y)$ and is invariant under the translations associated with the elements of $G$. Under some additional technical assumptions, we study the W*-algebra $\mathcal{V}$ of translation-invariant bounded linear operators acting on $H$. First, we decompose $\mathcal{V}$ into the direct integral of the W*-algebras of bounded operators acting on the reproducing kernel Hilbert spaces $\widehat{H}_\xi$, $\xi\in\widehat{G}$, generated by the Fourier transform of the reproducing kernel. Second, we give a constructive criterion for the commutativity of $\mathcal{V}$. Third, in the commutative case, we construct a unitary operator that simultaneously diagonalizes all operators belonging to $\mathcal{V}$, i.e., converts them into some multiplication operators. Our scheme generalizes many examples previously studied by Nikolai Vasilevski and other authors.
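In symbols, the first result described above takes the schematic form (measurability assumptions and the exact identifications are as in the paper):

\[
\mathcal{V} \;\cong\; \int_{\widehat{G}}^{\oplus} \mathcal{B}\bigl(\widehat{H}_\xi\bigr)\, d\xi,
\]

where $\widehat{G}$ is the dual group, $\mathcal{B}(\widehat{H}_\xi)$ denotes the bounded operators on $\widehat{H}_\xi$, and the integral is taken with respect to the Haar measure on $\widehat{G}$.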