Complexity measures in the context of the Integrated Information Theory of consciousness try to quantify the strength of the causal connections between different neurons. This is done by minimizing the KL-divergence between a full system and one without causal connections. Various measures have been proposed and compared in this setting. We discuss a class of information geometric measures that aim at assessing the intrinsic causal influences in a system. One promising candidate among these measures, denoted by $\Phi_{CIS}$, is based on conditional independence statements and satisfies all of the properties that have been postulated as desirable. Unfortunately, it does not have a graphical representation, which makes it less intuitive and difficult to analyze. We propose an alternative approach using a latent variable that models a common exterior influence. This leads to a measure $\Phi_{CII}$, Causal Information Integration, that satisfies all of the required conditions. Our measure can be calculated using an iterative information geometric algorithm, the em-algorithm. We are therefore able to compare its behavior to existing integrated information measures.
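The KL-divergence minimization mentioned in this abstract can be illustrated with a deliberately simplified sketch. It does not implement $\Phi_{CIS}$ or $\Phi_{CII}$. As an assumption made only for illustration, the "system without connections" is taken to be the family of fully independent (product) distributions over two binary variables. In that toy case the minimizer is the product of the marginals and the minimal divergence equals the mutual information. The joint distribution and all names below are hypothetical.

```python
import numpy as np

def kl_divergence(p, q):
    """KL divergence D(p || q) in bits for discrete distributions on the same support."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with p = 0 contribute nothing
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

# Hypothetical joint distribution of two correlated binary variables.
joint = np.array([[0.4, 0.1],
                  [0.1, 0.4]])

# Closest product distribution: the outer product of the two marginals.
px = joint.sum(axis=1)
py = joint.sum(axis=0)
kl_to_marginals = kl_divergence(joint, np.outer(px, py))

# Any other product distribution is at least as far away in KL divergence.
other_product = np.outer([0.6, 0.4], [0.5, 0.5])
kl_to_other = kl_divergence(joint, other_product)

print(kl_to_marginals, kl_to_other)
```

The measures discussed in the abstract restrict different sets of dependencies, so the minimization there runs over other model families. The only shared element is the pattern of projecting onto a restricted family and reading off the divergence.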
How many bits of information are revealed by a learning algorithm for a concept class of VC-dimension $d$? Previous works have shown that even for $d=1$ the amount of information may be unbounded (tend to $\infty$ with the universe size). Can it be that all concepts in the class require leaking a large amount of information? We show that typically concepts do not require leakage. There exists a proper learning algorithm that reveals $O(d)$ bits of information for most concepts in the class. This result is a special case of a more general phenomenon we explore. If there is a low information learner when the algorithm {\em knows} the underlying distribution on inputs, then there is a learner that reveals little information on an average concept {\em without knowing} the distribution on inputs.
We propose a new estimator to measure directed dependencies in time series. The dimensionality of the data is first reduced using a new non-uniform embedding technique, in which the variables are ranked according to a weighted sum of the amount of new information and the improvement in prediction accuracy that they provide. Then, using a greedy approach, the most informative subsets are selected iteratively. The algorithm terminates when the highest-ranked variable cannot significantly improve the prediction accuracy compared to that obtained with the already selected subsets. In a simulation study, we compare our estimator to existing state-of-the-art methods at different data lengths and directed dependency strengths. The proposed estimator is shown to have significantly higher accuracy than existing methods, especially in the difficult case where the data are highly correlated and coupled. Moreover, we show that its rate of false detection of directed dependencies caused by instantaneous coupling is lower than that of existing measures. We also demonstrate the applicability of the proposed estimator to real intracranial electroencephalography data.
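The greedy selection with a stopping rule described in this abstract can be sketched as follows. This is not the authors' estimator. As labeled assumptions, candidates are ranked only by the reduction in least-squares prediction error (the information term of the weighted sum is omitted), and the significance test is replaced by a fixed threshold `min_gain` on the explained fraction of variance. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def residual_variance(X, y):
    """Mean squared residual of a least-squares fit of y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.mean((y - A @ coef) ** 2))

def greedy_select(candidates, y, min_gain=0.01):
    """Forward selection: add the best candidate until its gain drops below min_gain.

    The gain is the error reduction as a fraction of var(y); a fixed threshold
    stands in for a proper significance test (an assumption of this sketch).
    """
    selected = []
    remaining = list(range(candidates.shape[1]))
    total = float(np.var(y))
    current = total
    while remaining:
        errors = {j: residual_variance(candidates[:, selected + [j]], y)
                  for j in remaining}
        best = min(errors, key=errors.get)
        if (current - errors[best]) / total < min_gain:
            break  # the top-ranked candidate no longer helps enough
        selected.append(best)
        remaining.remove(best)
        current = errors[best]
    return selected

# Synthetic example: only columns 0 and 3 drive the target.
rng = np.random.default_rng(0)
candidates = rng.standard_normal((500, 5))
y = 2.0 * candidates[:, 0] - 1.5 * candidates[:, 3] + 0.1 * rng.standard_normal(500)
selected = greedy_select(candidates, y)
print(selected)
```

In a time-series setting the candidate columns would be lagged copies of the source and target signals, which is what makes the resulting embedding non-uniform.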
Three decades of research in communication complexity have led to the invention of a number of techniques to lower bound randomized communication complexity. The majority of these techniques involve properties of large submatrices (rectangles) of the truth-table matrix defining a communication problem. The only technique that does not quite fit is information complexity, which has been investigated over the last decade. Here, we connect information complexity to one of the most powerful rectangular techniques: the recently-introduced smooth corruption (or smooth rectangle) bound. We show that the former subsumes the latter under rectangular input distributions. We conjecture that this subsumption holds more generally, under arbitrary distributions, which would resolve the long-standing direct sum question for randomized communication. As an application, we obtain an optimal $\Omega(n)$ lower bound on the information complexity---under the {\em uniform distribution}---of the so-called orthogonality problem (ORT), which is in turn closely related to the much-studied Gap-Hamming-Distance (GHD). The proof of this bound is along the lines of recent communication lower bounds for GHD, but we encounter a surprising amount of additional technical detail.
We provide a negative resolution to a conjecture of Steinke and Zakynthinou (2020a), by showing that their bound on the conditional mutual information (CMI) of proper learners of Vapnik--Chervonenkis (VC) classes cannot be improved from $d \log n + 2$ to $O(d)$, where $n$ is the number of i.i.d. training examples. In fact, we exhibit VC classes for which the CMI of any proper learner cannot be bounded by any real-valued function of the VC dimension only.
For a known weak signal in additive white noise, the asymptotic performance of a locally optimum processor (LOP) is shown to be given by the Fisher information (FI) of a standardized even probability density function (PDF) of the noise in three cases: (i) the maximum signal-to-noise ratio (SNR) gain for a periodic signal; (ii) the optimal asymptotic relative efficiency (ARE) for signal detection; (iii) the best cross-correlation gain (CG) for signal transmission. The minimal FI is unity, corresponding to a Gaussian PDF, whereas the FI is larger than unity for any non-Gaussian PDF. In the sense of a realizable LOP, it is found that the dichotomous noise PDF possesses an infinite FI for known weak signals perfectly processed by the corresponding LOP. The significance of the FI lies in the fact that it provides an upper bound for the performance of locally optimum processing.
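The claim that the Gaussian PDF attains the minimal FI of unity can be checked numerically with a short sketch. As assumptions not spelled out in the abstract, "FI" is taken to be the location Fisher information $\int f'(x)^2 / f(x)\,dx$ and "standardized" to mean zero mean and unit variance. The Laplace density is used only as a convenient non-Gaussian comparison; its FI at unit variance is 2. The function names and the integration grid are hypothetical choices.

```python
import numpy as np

def fisher_information(pdf, x):
    """Location Fisher information, integral of f'(x)^2 / f(x), on the grid x."""
    f = pdf(x)
    df = np.gradient(f, x)  # finite-difference derivative of the density
    integrand = np.zeros_like(f)
    mask = f > 1e-300  # skip points where the density underflows
    integrand[mask] = df[mask] ** 2 / f[mask]
    # Trapezoidal rule written out explicitly.
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x)))

def gaussian_pdf(x):
    """Standard normal density (zero mean, unit variance)."""
    return np.exp(-0.5 * x ** 2) / np.sqrt(2.0 * np.pi)

def laplace_pdf(x):
    """Laplace density with scale b = 1/sqrt(2), so that the variance 2*b^2 is 1."""
    b = 1.0 / np.sqrt(2.0)
    return np.exp(-np.abs(x) / b) / (2.0 * b)

x = np.linspace(-12.0, 12.0, 240001)
fi_gauss = fisher_information(gaussian_pdf, x)
fi_laplace = fisher_information(laplace_pdf, x)
print(fi_gauss, fi_laplace)
```

The numerical values are approximate: the finite-difference derivative is inaccurate at the kink of the Laplace density at zero, and the tails are truncated at the edge of the grid.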