A Consistent Extension of Discrete Optimal Transport Maps for Machine Learning Applications

Added by Alberto Gonzalez-Sanz

Publication date: 2021

Field: Mathematical Statistics

Language: English

Authors: Lucas de Lara

Statistics Theory

Optimal transport maps define a one-to-one correspondence between probability distributions, and as such have grown popular for machine learning applications. However, these maps are generally defined on empirical observations and cannot be generalized to new samples while preserving asymptotic properties. We extend a novel method to learn a consistent estimator of a continuous optimal transport map from two empirical distributions. The consequences of this work are twofold: first, it makes it possible to extend the transport plan to new observations without recomputing the discrete optimal transport map; second, it provides statistical guarantees for machine learning applications of optimal transport. We illustrate the strength of this approach by deriving a consistent framework for transport-based counterfactual explanations in fairness.
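To make "extending a discrete map to new observations" concrete, here is a minimal sketch for the one-dimensional special case only. It is not the estimator from the paper. It relies on a standard fact: for two real-valued samples of equal size and a convex cost such as the squared distance, the discrete optimal transport map matches sorted values to sorted values. The function name `fit_1d_transport_map` and the choice of linear interpolation between matched points are assumptions made for illustration.

```python
import numpy as np

def fit_1d_transport_map(source, target):
    """Extend the 1-D discrete OT matching to new points (illustrative sketch)."""
    xs = np.sort(np.asarray(source, dtype=float))
    ys = np.sort(np.asarray(target, dtype=float))
    if xs.shape != ys.shape:
        raise ValueError("this sketch assumes two samples of equal size")

    def transport(new_points):
        # The i-th smallest source point is matched to the i-th smallest
        # target point; new points are mapped by linear interpolation.
        # np.interp clamps inputs outside [xs[0], xs[-1]] to the end values.
        return np.interp(new_points, xs, ys)

    return transport

# Toy data: the target sample is an affine image of the source sample.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 11)
source = rng.permutation(grid)
target = rng.permutation(2.0 * grid + 1.0)

T = fit_1d_transport_map(source, target)
mapped = T(np.array([0.25, 0.5]))  # points not necessarily in the sample
```

The map is fitted once and can then be evaluated on unseen points without solving the matching problem again, which is the practical benefit the abstract describes. In higher dimensions no such sorting shortcut exists, which is where a dedicated consistent estimator becomes necessary.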

A Survey on Optimal Transport for Machine Learning: Theory and Applications

Luis Caicedo Torres, Luiz Manella Pereira, M. Hadi Amini (2021)

Optimal Transport (OT) theory has seen an increasing amount of attention from the computer science community due to its potency and relevance in modeling and machine learning. It provides powerful tools for comparing probability distributions with each other, as well as for producing optimal mappings that minimize cost functions. In this survey, we present a brief introduction and history, survey previous work, and propose directions for future study. We begin by looking at the history of optimal transport and introducing the founders of this field. We then give a brief glance at the algorithms related to OT. We follow up with a mathematical formulation and the prerequisites for understanding OT. These include Kantorovich duality, entropic regularization, KL divergence, and Wasserstein barycenters. Since OT is a computationally expensive problem, we then introduce the entropy-regularized version of computing optimal mappings, which has made OT applicable to a wide range of machine learning problems. In fact, the methods generated from OT theory are competitive with the current state-of-the-art methods. We then break down research papers that focus on image processing, graph learning, neural architecture search, document representation, and domain adaptation. We close the paper with a small section on future research. Of the recommendations presented, three main problems are fundamental to allowing OT to become widely applicable, but they rely strongly on its mathematical formulation and are thus the hardest to answer. Since OT is a novel method, there is plenty of space for new research, and with more and more competitive methods (in terms of either accuracy or computational speed) being created, the future of applied optimal transport is bright as it has become pervasive in machine learning.
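The entropy-regularized formulation mentioned above is usually solved with Sinkhorn's matrix-scaling iterations. The following is a minimal NumPy sketch of that standard algorithm, not code from the survey. The function name `sinkhorn`, the regularization strength `reg`, the fixed iteration count, and the toy data are all assumptions chosen for illustration.

```python
import numpy as np

def sinkhorn(a, b, cost, reg=0.5, n_iters=1000):
    """Entropy-regularized OT plan between histograms a and b (sketch)."""
    K = np.exp(-cost / reg)      # Gibbs kernel derived from the cost matrix
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)        # rescale to match the column marginal b
        u = a / (K @ v)          # rescale to match the row marginal a
    # The plan has the form diag(u) K diag(v).
    return u[:, None] * K * v[None, :]

# Toy problem: three source points, two target points, squared-distance cost.
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.5, 1.5])
cost = (x[:, None] - y[None, :]) ** 2
a = np.full(3, 1.0 / 3.0)        # uniform weights on the source points
b = np.full(2, 1.0 / 2.0)        # uniform weights on the target points

plan = sinkhorn(a, b, cost)
```

Each iteration costs only two matrix-vector products, which is why this variant scales better than solving the exact linear program. A smaller `reg` gives a plan closer to the unregularized optimum but can cause numerical underflow in `K`; log-domain implementations address that and are omitted here for brevity.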

Machine Learning

Consistent Maximum Likelihood Estimation Using Subsets with Applications to Multivariate Mixed Models

Karl Oskar Ekvall, Galin L. Jones (2018)

We present new results for consistency of maximum likelihood estimators with a focus on multivariate mixed models. Our theory builds on the idea of using subsets of the full data to establish consistency of estimators based on the full data. It requires neither that the data consist of independent observations, nor that the observations can be modeled as a stationary stochastic process. Compared to existing asymptotic theory using the idea of subsets we substantially weaken the assumptions, bringing them closer to what suffices in classical settings. We apply our theory in two multivariate mixed models for which it was unknown whether maximum likelihood estimators are consistent. The models we consider have non-stochastic predictors and multivariate responses which are possibly mixed-type (some discrete and some continuous).

Statistics Theory

Hoeffding's lemma for Markov Chains and its applications to statistical learning

Jianqing Fan, Bai Jiang, Qiang Sun (2018)

We extend Hoeffding's lemma to general-state-space and not necessarily reversible Markov chains. Let $\{X_i\}_{i \ge 1}$ be a stationary Markov chain with invariant measure $\pi$ and absolute spectral gap $1-\lambda$, where $\lambda$ is defined as the operator norm of the transition kernel acting on mean zero and square-integrable functions with respect to $\pi$. Then, for any bounded functions $f_i: x \mapsto [a_i,b_i]$, the sum of $f_i(X_i)$ is sub-Gaussian with variance proxy $\frac{1+\lambda}{1-\lambda} \cdot \sum_i \frac{(b_i-a_i)^2}{4}$. This result differs from the classical Hoeffding's lemma by a multiplicative coefficient of $(1+\lambda)/(1-\lambda)$, and simplifies to the latter when $\lambda = 0$. The counterpart of Hoeffding's inequality for Markov chains immediately follows. Our results assume none of countable state space, reversibility and time-homogeneity of Markov chains and cover time-dependent functions with various ranges. We illustrate the utility of these results by applying them to six problems in statistics and machine learning.
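For readers who want the tail bound spelled out, the step from the variance proxy to the inequality is the usual Chernoff argument for sub-Gaussian variables. The display below is a sketch of that step. The finite index range $1,\dots,n$, the symbol $\sigma^2$ for the variance proxy, the deviation level $\epsilon > 0$, and the shorthand $\pi(f_i)$ for the mean of $f_i$ under $\pi$ are notation introduced here, not taken from the abstract.

```latex
\sigma^2 = \frac{1+\lambda}{1-\lambda} \sum_{i=1}^{n} \frac{(b_i-a_i)^2}{4},
\qquad
\mathbb{P}\left( \sum_{i=1}^{n} f_i(X_i) - \sum_{i=1}^{n} \pi(f_i) \ge \epsilon \right)
\le \exp\left( -\frac{\epsilon^2}{2\sigma^2} \right)
= \exp\left( -\frac{1-\lambda}{1+\lambda} \cdot \frac{2\epsilon^2}{\sum_{i=1}^{n} (b_i-a_i)^2} \right).
```

Setting $\lambda = 0$ recovers the classical Hoeffding bound $\exp\left(-2\epsilon^2 / \sum_{i=1}^{n} (b_i-a_i)^2\right)$, consistent with the remark that the lemma reduces to the classical one in that case.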

Statistics Theory

An extension of Azzalini's method

Filippo Domma, Božidar V. Popović, Saralees Nadarajah (2018)

The aim of this paper is to extend Azzalini's method. This extension is done in two stages: consider two dependent and non-identically distributed random variables, say $X_1$ and $X_2$; model the dependence between $X_1$ and $X_2$ by a copula. To illustrate the new method, we assume $X_1$ and $X_2$ are exponential random variables. This assumption leads to a new distribution called the Generalized Weighted Exponential Distribution (GWED), a generalization of Gupta and Kundu's (2009) Weighted Exponential Distribution (WED). Some mathematical properties of the GWED are derived, and its parameters are estimated by maximum likelihood. The GWED is applied to biochemical data sets, showing its good performance compared to the WED.

Statistics Theory

Consistent Variable Selection for Functional Regression Models

Julian A. A. Collazos (2015)

The dual problem of testing the predictive significance of a particular covariate and identifying the set of relevant covariates is common in applied research and methodological investigations. To study this problem in the context of functional linear regression models with predictor variables observed over a grid and a scalar response, we consider basis expansions of the functional covariates and apply the likelihood ratio test. Based on p-values from testing each predictor, we propose a new variable selection method, which is consistent in selecting the relevant predictors from a set of available predictors that is allowed to grow with the sample size n. Numerical simulations suggest that the proposed variable selection procedure outperforms existing methods found in the literature. A real dataset from weather stations in Japan is analyzed.

Statistics Theory
