An Information-theoretic Approach to Distribution Shifts

422 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Marco Federici

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Marco Federici - Ryota Tomioka - Patrick Forre

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Safely deploying machine learning models to the real world is often a challenging process. Models trained with data obtained from a specific geographic location tend to fail when queried with data obtained elsewhere, agents trained in a simulation can struggle to adapt when deployed in the real world or novel environments, and neural networks that are fit to a subset of the population might carry some selection bias into their decision process. In this work, we describe the problem of data shift from a novel information-theoretic perspective by (i) identifying and describing the different sources of error, (ii) comparing some of the most promising objectives explored in the recent domain generalization, and fair classification literature. From our theoretical analysis and empirical evaluation, we conclude that the model selection procedure needs to be guided by careful considerations regarding the observed data, the factors used for correction, and the structure of the data-generating process.

قيم البحث

63 - James P. Bagrow , Erik M. Bollt 2018

As network research becomes more sophisticated, it is more common than ever for researchers to find themselves not studying a single network but needing to analyze sets of networks. An important task when working with sets of networks is network comp arison, developing a similarity or distance measure between networks so that meaningful comparisons can be drawn. The best means to accomplish this task remains an open area of research. Here we introduce a new measure to compare networks, the Network Portrait Divergence, that is mathematically principled, incorporates the topological characteristics of networks at all structural scales, and is general-purpose and applicable to all types of networks. An important feature of our measure that enables many of its useful properties is that it is based on a graph invariant, the network portrait. We test our measure on both synthetic graphs and real world networks taken from protein interaction data, neuroscience, and computational social science applications. The Network Portrait Divergence reveals important characteristics of multilayer and temporal networks extracted from data.

الشبكات الاجتماعية والمعلومات نظرية المعلومات نظرية المعلومات

Information Theoretic Optimal Learning of Gaussian Graphical Models

186 - Sidhant Misra , Marc Vuffray , Andrey Y. Lokhov 2017

What is the optimal number of independent observations from which a sparse Gaussian Graphical Model can be correctly recovered? Information-theoretic arguments provide a lower bound on the minimum number of samples necessary to perfectly identify the support of any multivariate normal distribution as a function of model parameters. For a model defined on a sparse graph with $p$ nodes, a maximum degree $d$ and minimum normalized edge strength $kappa$, this necessary number of samples scales at least as $d log p/kappa^2$. The sample complexity requirements of existing methods for perfect graph reconstruction exhibit dependency on additional parameters that do not enter in the lower bound. The question of whether the lower bound is tight and achievable by a polynomial time algorithm remains open. In this paper, we constructively answer this question and propose an algorithm, termed DICE, whose sample complexity matches the information-theoretic lower bound up to a universal constant factor. We also propose a related algorithm SLICE that has a slightly higher sample complexity, but can be implemented as a mixed integer quadratic program which makes it attractive in practice. Importantly, SLICE retains a critical advantage of DICE in that its sample complexity only depends on quantities present in the information theoretic lower bound. We anticipate that this result will stimulate future search of computationally efficient sample-optimal algorithms.

التعلم الآلي نظرية المعلومات نظرية المعلومات

An Information-Theoretic Approach to Network Modularity

96 - Etay Ziv 2004

Exploiting recent developments in information theory, we propose, illustrate, and validate a principled information-theoretic algorithm for module discovery and resulting measure of network modularity. This measure is an order parameter (a dimensionl ess number between 0 and 1). Comparison is made to other approaches to module-discovery and to quantifying network modularity using Monte Carlo generated Erdos-like modular networks. Finally, the Network Information Bottleneck (NIB) algorithm is applied to a number of real world networks, including the social network of coauthors at the APS March Meeting 2004.

الأساليب الكمية الجينوم الشبكات الجزيئية

An Information Theoretic Algorithm for Finding Periodicities in Stellar Light Curves

354 - Pablo Huijse , Pablo A. Estevez , Pavlos Protopapas 2012

We propose a new information theoretic metric for finding periodicities in stellar light curves. Light curves are astronomical time series of brightness over time, and are characterized as being noisy and unevenly sampled. The proposed metric combine s correntropy (generalized correlation) with a periodic kernel to measure similarity among samples separated by a given period. The new metric provides a periodogram, called Correntropy Kernelized Periodogram (CKP), whose peaks are associated with the fundamental frequencies present in the data. The CKP does not require any resampling, slotting or folding scheme as it is computed directly from the available samples. CKP is the main part of a fully-automated pipeline for periodic light curve discrimination to be used in astronomical survey databases. We show that the CKP method outperformed the slotted correntropy, and conventional methods used in astronomy for periodicity discrimination and period estimation tasks, using a set of light curves drawn from the MACHO survey. The proposed metric achieved 97.2% of true positives with 0% of false positives at the confidence level of 99% for the periodicity discrimination task; and 88% of hits with 11.6% of multiples and 0.4% of misses in the period estimation task.

الأجهزة والأساليب للزيئات الفيزياء الفلكية نظرية المعلومات نظرية المعلومات

A Generalized Information-Theoretic Approach for Bounding the Number of Independent Sets in Bipartite Graphs

64 - Igal Sason 2020

This paper studies the problem of upper bounding the number of independent sets in a graph, expressed in terms of its degree distribution. For bipartite regular graphs, Kahn (2001) established a tight upper bound using an information-theoretic approa ch, and he also conjectured an upper bound for general graphs. His conjectured bound was recently proved by Sah et al. (2019), using different techniques not involving information theory. The main contribution of this work is the extension of Kahns information-theoretic proof technique to handle irregular bipartite graphs. In particular, when the bipartite graph is regular on one side, but it may be irregular in the other, the extended entropy-based proof technique yields the same bound that was conjectured by Kahn (2001) and proved by Sah et al. (2019).

التوافقية نظرية المعلومات نظرية المعلومات