This paper studies the convergence of empirical risks in reproducing kernel Hilbert spaces (RKHS). A conventional assumption in the existing research is that empirical training data are free of noise, but this may not hold in practice. Consequently, the existing convergence results do not guarantee that empirical risks computed from such data are reliable when the data contain noise. In this paper, we fill this gap in three steps. First, we derive moderate sufficient conditions under which the expected risk changes stably (continuously) under small perturbations of the probability distribution of the underlying random variables, and we demonstrate how the cost function and the kernel affect this stability. Second, we examine the difference between the laws of the statistical estimators of the expected optimal loss based on pure data and on contaminated data, using the Prokhorov metric and the Kantorovich metric, and derive qualitative and quantitative statistical robustness results. Third, we identify appropriate metrics under which the statistical estimators are uniformly asymptotically consistent. These results provide theoretical grounding for analysing asymptotic convergence and for examining the reliability of the statistical estimators in a number of well-known machine learning models.
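For reference, the two metrics named above have standard definitions; the following is a sketch for probability measures P and Q on a metric space (X, d), writing A^ε for the ε-neighbourhood of a set A:

    \pi(P, Q) = \inf\bigl\{ \varepsilon > 0 : P(A) \le Q(A^{\varepsilon}) + \varepsilon
                \ \text{and}\ Q(A) \le P(A^{\varepsilon}) + \varepsilon
                \ \text{for all Borel sets } A \bigr\}  \quad \text{(Prokhorov)}

    d_K(P, Q) = \sup\Bigl\{ \Bigl| \int_X f \, dP - \int_X f \, dQ \Bigr|
                : f : X \to \mathbb{R},\ \mathrm{Lip}(f) \le 1 \Bigr\}  \quad \text{(Kantorovich--Rubinstein)}

The Prokhorov metric metrizes weak convergence on separable metric spaces, while the Kantorovich (Wasserstein-1) metric metrizes weak convergence together with convergence of first moments, which is one reason the two can give qualitatively different robustness statements.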
Spectral methods comprise a family of algorithms built around the eigenvectors of certain data-generated matrices. In this work, we study the geometric landscape of the eigendecomposition problem arising in various spectral methods. In part
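As a concrete (hypothetical) instance of such a data-generated matrix, the sketch below builds a Gaussian affinity matrix from data points and embeds them via the eigenvectors of its normalized Laplacian; the function name and parameters are illustrative, not the paper's setup.

    import numpy as np

    def spectral_embedding(X, k, sigma=1.0):
        # Pairwise squared distances and a Gaussian affinity matrix W.
        sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
        W = np.exp(-sq_dists / (2 * sigma ** 2))
        # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
        deg = W.sum(axis=1)
        L = np.eye(len(X)) - W / np.sqrt(np.outer(deg, deg))
        # Eigendecomposition: the k eigenvectors with smallest eigenvalues
        # give the spectral embedding (np.linalg.eigh sorts ascending).
        _, vecs = np.linalg.eigh(L)
        return vecs[:, :k]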
Empirical Centroid Fictitious Play (ECFP) is a generalization of the well-known Fictitious Play (FP) algorithm designed for implementation in large-scale games. In ECFP, the set of players is subdivided into equivalence classes with players in the sa
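The baseline FP dynamic that ECFP generalizes is easy to state in code. The sketch below is the classic two-player version, in which each player best-responds to the empirical frequencies of the opponent's past actions; it does not include ECFP's centroid aggregation, which falls outside this excerpt.

    import numpy as np

    def fictitious_play(A, B, T=1000):
        # A, B: payoff matrices for players 1 (rows) and 2 (columns).
        m, n = A.shape
        counts1, counts2 = np.zeros(m), np.zeros(n)
        counts1[0] = counts2[0] = 1.0  # arbitrary initial actions
        for _ in range(T):
            # Best response to the opponent's empirical action distribution.
            a = np.argmax(A @ (counts2 / counts2.sum()))
            b = np.argmax((counts1 / counts1.sum()) @ B)
            counts1[a] += 1
            counts2[b] += 1
        # Empirical mixed strategies after T rounds.
        return counts1 / counts1.sum(), counts2 / counts2.sum()

For matching pennies, fictitious_play(A, -A) with A = [[1, -1], [-1, 1]] drives both empirical strategies toward the (1/2, 1/2) mixed equilibrium.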
Many proposed methods for explaining machine learning predictions are in fact difficult for nontechnical consumers to understand. This paper builds upon an alternative, consumer-driven approach called TED that asks for explanations to be provided in
We introduce a logical approach to formalizing statistical properties of machine learning. Specifically, we propose a formal model for statistical classification based on a Kripke model, and formalize various notions of classification performance, ro
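While the paper's exact Kripke-model formalization is not shown in this excerpt, the basic ingredients of a Kripke model are easy to sketch: a set of worlds, an accessibility relation, a valuation, and the modal operators. The names below are illustrative only.

    # A toy Kripke model: worlds, an accessibility relation, and a valuation
    # mapping each world to the atomic propositions true there.
    worlds = {"w1", "w2"}
    access = {"w1": {"w1", "w2"}, "w2": {"w2"}}
    val = {"w1": {"p"}, "w2": set()}

    def box(prop, w):
        # "Necessarily prop" holds at w iff prop holds at every accessible world.
        return all(prop in val[u] for u in access[w])

    def diamond(prop, w):
        # "Possibly prop" holds at w iff prop holds at some accessible world.
        return any(prop in val[u] for u in access[w])

    # box("p", "w1") is False (p fails at w2); diamond("p", "w1") is True.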
In this paper, we develop an efficient sketchy empirical natural gradient method (SENG) for large-scale deep learning problems. The empirical Fisher information matrix is usually low-rank since sampling is practical only on a small amount of data
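A minimal sketch of how such low-rank structure can be exploited, assuming per-sample gradients stacked as columns of G (d × N with N ≪ d) and a damping term λ: the Woodbury identity reduces the d × d solve to an N × N one. The function and parameter names are illustrative, not the paper's SENG implementation.

    import numpy as np

    def natural_gradient_direction(G, v, damping=1e-3):
        # Solve (damping * I + (1/N) G G^T) x = v without forming the d x d matrix.
        # Woodbury: inverse = (1/lam) * (I - G (lam*N*I_N + G^T G)^{-1} G^T).
        d, N = G.shape
        lam = damping
        small = lam * N * np.eye(N) + G.T @ G   # N x N system, cheap when N << d
        correction = G @ np.linalg.solve(small, G.T @ v)
        return (v - correction) / lam

    # Example: 32 sampled gradients in a 10,000-dimensional parameter space.
    rng = np.random.default_rng(0)
    G = rng.standard_normal((10_000, 32))
    v = rng.standard_normal(10_000)
    step = natural_gradient_direction(G, v)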