Measuring the dependence of data plays a central role in statistics and machine learning. In this work, we summarize and generalize the main idea of existing information-theoretic dependence measures into a higher-level perspective by Shearer's inequality. Based on our generalization, we then propose two measures, namely the matrix-based normalized total correlation ($T_\alpha^*$) and the matrix-based normalized dual total correlation ($D_\alpha^*$), to quantify the dependence of multiple variables in arbitrary dimensional space, without explicit estimation of the underlying data distributions. We show that our measures are differentiable and statistically more powerful than prevalent ones. We also show the impact of our measures in four different machine learning problems, namely gene regulatory network inference, robust machine learning under covariate shift and non-Gaussian noise, subspace outlier detection, and understanding the learning dynamics of convolutional neural networks (CNNs), to demonstrate their utility, advantages, and implications for those problems. Code for our dependence measures is available at: https://bit.ly/AAAI-dependence
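A minimal NumPy sketch of how such a matrix-based estimator might look, assuming a Gaussian kernel and omitting the paper's normalization step; the kernel width `sigma` and the order `alpha` are illustrative choices, not the authors' settings:

```python
import numpy as np

def gram(x, sigma=1.0):
    # Trace-normalized Gaussian Gram matrix for samples x of shape (n, d).
    sq = np.sum(x ** 2, axis=1, keepdims=True)
    d2 = sq + sq.T - 2.0 * x @ x.T
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    return K / np.trace(K)

def renyi_entropy(A, alpha=1.01):
    # Matrix-based Renyi alpha-order entropy: log2(sum_i lambda_i^alpha) / (1 - alpha).
    lam = np.clip(np.linalg.eigvalsh(A), 0.0, None)
    return np.log2(np.sum(lam ** alpha)) / (1.0 - alpha)

def total_correlation(views, alpha=1.01):
    # T(x_1, ..., x_k) = sum_i S(A_i) - S(joint), where the joint Gram matrix
    # is the trace-normalized Hadamard product of the per-variable Gram matrices.
    grams = [gram(v) for v in views]
    joint = grams[0].copy()
    for A in grams[1:]:
        joint *= A
    joint /= np.trace(joint)
    return sum(renyi_entropy(A, alpha) for A in grams) - renyi_entropy(joint, alpha)
```

Because every step reduces to an eigendecomposition of a Gram matrix, no density estimation is required, which is what makes such measures applicable in arbitrary dimensional space.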
We introduce the matrix-based Rényi's $\alpha$-order entropy functional to parameterize the information bottleneck (IB) principle of Tishby et al. with a neural network. We term our methodology Deep Deterministic Information Bottleneck (DIB), as it avoids variational inference and distribution assumptions.
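A minimal PyTorch-style sketch of a deterministic IB objective built from the same matrix-based entropy, again assuming a Gaussian kernel; `beta`, `sigma`, and the loss composition are illustrative, not the paper's exact training recipe:

```python
import torch
import torch.nn.functional as F

def gram(x, sigma=1.0):
    # Trace-normalized Gaussian Gram matrix for a batch x of shape (n, d).
    K = torch.exp(-torch.cdist(x, x) ** 2 / (2.0 * sigma ** 2))
    return K / K.trace()

def renyi_entropy(A, alpha=1.01):
    lam = torch.linalg.eigvalsh(A).clamp(min=1e-12)
    return torch.log2((lam ** alpha).sum()) / (1.0 - alpha)

def dib_loss(logits, targets, x, z, beta=0.01):
    # IB-style objective: task loss + beta * I(X; Z), with
    # I(X; Z) = S(A_x) + S(A_z) - S(A_x o A_z / tr(.)).
    Ax, Az = gram(x.flatten(1)), gram(z)
    joint = Ax * Az
    joint = joint / joint.trace()
    mi = renyi_entropy(Ax) + renyi_entropy(Az) - renyi_entropy(joint)
    return F.cross_entropy(logits, targets) + beta * mi
```

Since the entropy is a differentiable function of the Gram eigenvalues, the whole objective can be trained end to end with a standard optimizer, with no variational posterior involved.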
The degree-based entropy of a graph is defined as the Shannon entropy based on the information functional that associates the vertices of the graph with their corresponding degrees. In this paper, we study extremal problems of finding the graphs attaining …
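For concreteness, a short Python sketch of this entropy under its standard definition, where vertex $i$ receives probability mass $d_i / \sum_j d_j$:

```python
import numpy as np

def degree_entropy(degrees):
    # Degree-based graph entropy: Shannon entropy of p_i = d_i / sum_j d_j.
    d = np.asarray(degrees, dtype=float)
    d = d[d > 0]                      # isolated vertices contribute nothing
    p = d / d.sum()
    return -(p * np.log2(p)).sum()

# Two trees on 5 vertices: the star K_{1,4} and the path P_5.
print(degree_entropy([4, 1, 1, 1, 1]))  # star: 2.0
print(degree_entropy([1, 2, 2, 2, 1]))  # path: 2.25
```

The star and the path give different values, which is exactly the kind of gap that such extremal questions are about.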
Two intimately related new classes of games are introduced and studied: entropy games (EGs) and matrix multiplication games (MMGs). An EG is played on a finite arena by two-and-a-half players: Despot, Tribune, and the non-deterministic People. Despot wants to make the set of possible behaviors of People as small as possible, while Tribune wants to make it as large as possible.
Feature selection, in the context of machine learning, is the process of separating the highly predictive features from those that might be irrelevant or redundant. Information theory has been recognized as a useful concept for this task, as the predictive …
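A hedged baseline for this idea, using scikit-learn's univariate mutual-information filter (a generic MI ranking, not the specific criterion studied in the paper):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

# Toy data: 10 features, 3 of which carry signal about the label.
X, y = make_classification(n_samples=500, n_features=10, n_informative=3,
                           n_redundant=2, random_state=0)

# Estimate I(feature; label) per feature and rank highest first.
mi = mutual_info_classif(X, y, random_state=0)
print("features ranked by MI:", np.argsort(mi)[::-1])
```

A purely univariate ranking ignores redundancy between features, which is why information-theoretic selection criteria typically also penalize the information a candidate feature shares with those already selected.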
We study image inverse problems with a normalizing flow prior. Our formulation views the solution as the maximum a posteriori estimate of the image conditioned on the measurements. This formulation allows us to use noise models with arbitrary dependencies …
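A minimal PyTorch sketch of such a MAP formulation, assuming a hypothetical invertible `flow` exposing `log_prob()` and `inverse()`, a forward operator `A`, and (for simplicity) i.i.d. Gaussian measurement noise; in the paper's setting the squared-error data term would be replaced by a noise model with arbitrary dependencies:

```python
import torch

def map_estimate(y, A, flow, dim, n_steps=500, lr=0.05):
    # Minimize the negative log-posterior -log p(y | x) - log p(x),
    # optimizing over the flow's latent z so that x = flow.inverse(z).
    z = torch.zeros(1, dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(n_steps):
        opt.zero_grad()
        x = flow.inverse(z)                    # candidate image
        data_term = ((A(x) - y) ** 2).sum()    # Gaussian likelihood, up to scale
        prior_term = -flow.log_prob(x).sum()   # flow prior -log p(x)
        (data_term + prior_term).backward()
        opt.step()
    return flow.inverse(z).detach()
```

Optimizing over the flow's latent space rather than pixel space is a common design choice for flow and GAN priors, since the prior then regularizes the search through the learned generative map.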