Estimating mutual information is an important problem in statistics and machine learning. A common practice for estimating mutual information from data is to prepare a set of paired samples $\{(\mathbf{x}_i, \mathbf{y}_i)\}_{i=1}^{n} \stackrel{\mathrm{i.i.d.}}{\sim} p(\mathbf{x}, \mathbf{y})$. However, in many situations it is difficult to obtain a large number of data pairs. To address this problem, we propose a semi-supervised Squared-loss Mutual Information (SMI) estimation method that uses a small number of paired samples together with the available unpaired ones. We first represent SMI through the density-ratio function, where the expectation is approximated by samples from the marginals and their assignment parameters. The objective is formulated as an optimal transport problem combined with quadratic programming. We then introduce the Least-Squares Mutual Information with Sinkhorn (LSMI-Sinkhorn) algorithm for efficient optimization. In experiments, we first demonstrate that the proposed method can estimate SMI without a large number of paired samples, and then show the effectiveness of LSMI-Sinkhorn on various machine learning problems such as image matching and photo album summarization. Code is available at https://github.com/csyanbin/LSMI-Sinkhorn.
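For readers unfamiliar with SMI: in the least-squares mutual information literature, SMI is defined as half the Pearson $\chi^2$ divergence between the joint density and the product of the marginals, expressed through the density ratio $r(\mathbf{x}, \mathbf{y}) = p(\mathbf{x}, \mathbf{y}) / \big(p(\mathbf{x})\, p(\mathbf{y})\big)$:
\[
\mathrm{SMI}(X, Y) = \frac{1}{2} \iint p(\mathbf{x})\, p(\mathbf{y}) \left( \frac{p(\mathbf{x}, \mathbf{y})}{p(\mathbf{x})\, p(\mathbf{y})} - 1 \right)^{2} \mathrm{d}\mathbf{x}\, \mathrm{d}\mathbf{y}
= \frac{1}{2}\, \mathbb{E}_{p(\mathbf{x}) p(\mathbf{y})}\!\left[ \big( r(\mathbf{x}, \mathbf{y}) - 1 \big)^{2} \right].
\]
Estimating $r$ directly and plugging it into this expectation is what the abstract means by representing SMI through the density-ratio function.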
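As a rough illustration of the Sinkhorn component the algorithm name alludes to, the following is a minimal, generic Sinkhorn-Knopp sketch for entropic-regularized optimal transport, not the authors' full LSMI-Sinkhorn routine (which, per the abstract, couples such transport updates with a quadratic-programming / least-squares fit of the density-ratio model). The cost matrix C, marginals a and b, and the parameters eps and n_iter are assumptions for the sake of the example.

```python
import numpy as np

def sinkhorn(C, a, b, eps=0.1, n_iter=100):
    """Generic entropic-regularized OT via Sinkhorn-Knopp iterations.

    C      : (n, m) cost matrix between unpaired x- and y-samples
    a, b   : nonnegative marginal weight vectors, each summing to 1
    eps    : entropic regularization strength
    Returns the (n, m) soft assignment (transport plan) pi.
    """
    K = np.exp(-C / eps)                  # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)                 # scale columns to match b
        u = a / (K @ v)                   # scale rows to match a
    return u[:, None] * K * v[None, :]    # pi = diag(u) K diag(v)
```

In the semi-supervised setting described above, a plan like pi would play the role of the assignment parameters that pair up samples drawn from the two marginals when approximating the SMI expectation.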