Complex systems are increasingly being viewed as distributed information processing systems, particularly in the domains of computational neuroscience, bioinformatics and Artificial Life. This trend has resulted in a strong uptake in the use of (Shannon) information-theoretic measures to analyse the dynamics of complex systems in these fields. We introduce the Java Information Dynamics Toolkit (JIDT): a Google Code project which provides a standalone, open-source (GNU GPL v3 licensed) implementation for empirical estimation of information-theoretic measures from time-series data. While the toolkit provides classic information-theoretic measures (e.g. entropy, mutual information, conditional mutual information), it ultimately focusses on implementing higher-level measures for information dynamics. That is, JIDT focusses on quantifying information storage, transfer and modification, and the dynamics of these operations in space and time. For this purpose, it includes implementations of the transfer entropy and active information storage, their multivariate extensions and local or pointwise variants. JIDT provides implementations for both discrete and continuous-valued data for each measure, including various types of estimator for continuous data (e.g. Gaussian, box-kernel and Kraskov-Stoegbauer-Grassberger) which can be swapped at run-time thanks to Java's object-oriented polymorphism. Furthermore, while written in Java, the toolkit can be used directly in MATLAB, GNU Octave, Python and other environments. We present the principles behind the code design, and provide several examples to guide users.
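As a rough illustration of the kind of quantity the toolkit estimates, the following is a minimal plug-in sketch of transfer entropy for discrete data, assuming a history length of 1 for both source and target. It is written in plain Python and does not use JIDT's API; the function name `transfer_entropy` and its arguments are hypothetical and chosen only for this example.

```python
from collections import Counter
from math import log2


def transfer_entropy(source, target):
    """Plug-in estimate, in bits, of transfer entropy from source to target.

    Assumes discrete-valued sequences of equal length and a history
    length of 1 (an assumption of this sketch, not a toolkit default).
    """
    if len(source) != len(target):
        raise ValueError("source and target must have the same length")
    n = len(target) - 1  # number of (next, previous) observations
    if n < 1:
        raise ValueError("need at least two samples")

    triples = Counter()   # counts of (x_next, x_prev, y_prev)
    pairs_xy = Counter()  # counts of (x_prev, y_prev)
    pairs_xx = Counter()  # counts of (x_next, x_prev)
    singles = Counter()   # counts of x_prev
    for t in range(n):
        x_next, x_prev, y_prev = target[t + 1], target[t], source[t]
        triples[(x_next, x_prev, y_prev)] += 1
        pairs_xy[(x_prev, y_prev)] += 1
        pairs_xx[(x_next, x_prev)] += 1
        singles[x_prev] += 1

    te = 0.0
    for (x_next, x_prev, y_prev), count in triples.items():
        p_joint = count / n
        # p(x_next | x_prev, y_prev): prediction using the source's past
        p_with_source = count / pairs_xy[(x_prev, y_prev)]
        # p(x_next | x_prev): prediction from the target's own past only
        p_without_source = pairs_xx[(x_next, x_prev)] / singles[x_prev]
        te += p_joint * log2(p_with_source / p_without_source)
    return te
```

For a target that simply copies a random binary source with a one-step delay, this estimate should be close to 1 bit in the source-to-target direction and close to 0 in the reverse direction.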
The Information Dynamics Toolkit xl (IDTxl) is a comprehensive software package for efficient inference of networks and their node dynamics from multivariate time series data using information theory. IDTxl provides functionality to estimate the following measures: 1) for network inference: multivariate transfer entropy (TE)/Granger causality (GC), multivariate mutual information (MI), bivariate TE/GC and bivariate MI; 2) for analysis of node dynamics: active information storage (AIS) and partial information decomposition (PID). IDTxl implements estimators for discrete and continuous data with parallel computing engines for both GPU and CPU platforms. Written for Python 3.4.3+.
Given a probability measure $\mu$ over ${\mathbb R}^n$, it is often useful to approximate it by the convex combination of a small number of probability measures, such that each component is close to a product measure. Recently, Ronen Eldan used a stochastic localization argument to prove a general decomposition result of this type. In Eldan's theorem, the `number of components' is characterized by the entropy of the mixture, and `closeness to product' is characterized by the covariance matrix of each component. We present an elementary proof of Eldan's theorem which makes use of an information theory (or estimation theory) interpretation. The proof is analogous to that of an earlier decomposition result known as the `pinning lemma'.
A key practical constraint on the design of Hybrid Automatic Repeat reQuest (HARQ) schemes is the size of the on-chip buffer that is available at the receiver to store previously received packets. In fact, in modern wireless standards such as LTE and LTE-A, the HARQ buffer size is one of the main drivers of the modem area and power consumption. This has recently highlighted the importance of HARQ buffer management, that is, of the use of buffer-aware transmission schemes and of advanced compression policies for the storage of received data. This work investigates HARQ buffer management by leveraging information-theoretic achievability arguments based on random coding. Specifically, standard HARQ schemes, namely Type-I, Chase Combining and Incremental Redundancy, are first studied under the assumption of a finite-capacity HARQ buffer by considering both coded modulation, via Gaussian signaling, and Bit Interleaved Coded Modulation (BICM). The analysis sheds light on the impact on the throughput of different compression strategies, namely the conventional compression of log-likelihood ratios and the direct digitization of baseband signals. Then, coding strategies based on layered modulation and optimized coding blocklength are investigated, highlighting the benefits of HARQ buffer-aware transmission schemes. The optimization of baseband compression for multiple-antenna links is also studied, demonstrating the optimality of a transform coding approach.
A basic information-theoretic model for summarization is formulated. Here summarization is considered as the process of taking a report of $v$ binary objects, and producing from it a $j$-element subset that captures most of the important features of the original report, with importance being defined via an arbitrary set function endemic to the model. The loss of information is then measured by a weighted average of variational distances, which we term the semantic loss. Our results cover both the case where the probability distribution generating the $v$-length reports is known and the case where it is unknown. In the case where it is known, our results demonstrate how to construct summarizers which minimize the semantic loss. For the case where the probability distribution is unknown, we show how to construct summarizers whose semantic loss, when averaged uniformly over all possible distributions, converges to the minimum.
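To make the phrase "weighted average of variational distances" concrete, here is a minimal sketch of that computation for discrete distributions. It does not reproduce the paper's definition of the semantic loss: the weights, the pairing of distributions, the function names, and the normalization by 1/2 (one common convention for variational distance) are all assumptions of this example.

```python
def variational_distance(p, q):
    """Variational (total variation) distance between two discrete
    distributions given as dicts mapping outcomes to probabilities.

    Uses the 1/2 * sum |p - q| convention, which is an assumption here;
    some texts omit the factor of 1/2.
    """
    outcomes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in outcomes)


def weighted_average_distance(weights, pairs):
    """Weighted average of variational distances over (p, q) pairs.

    `weights` are hypothetical non-negative importance weights, one per
    pair; they are normalized by their sum.
    """
    if len(weights) != len(pairs):
        raise ValueError("need exactly one weight per pair")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return sum(
        w * variational_distance(p, q) for w, (p, q) in zip(weights, pairs)
    ) / total
```

Identical distributions contribute zero to the average, so under these assumptions a pair only matters to the extent that its two distributions differ and its weight is large.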
We consider a slotted wireless network in an infrastructure setup with a base station (or an access point) and N users. The wireless channel gains between the base station and the users are assumed to be i.i.d., and the base station seeks to schedule the user with the highest channel gain in every slot (opportunistic scheduling). We assume that the identity of the user with the highest channel gain is resolved using a series of contention slots and with feedback from the base station. In this setup, we formulate the contention resolution problem for opportunistic scheduling as identifying a random threshold (channel gain) that separates the best channel from the other samples. We show that the average delay to resolve contention is related to the entropy of the random threshold. We illustrate our formulation by studying the opportunistic splitting algorithm (OSA) for i.i.d. wireless channels [9]. We note that the thresholds of OSA correspond to a maximal probability allocation scheme. We conjecture that maximal probability allocation is an entropy-minimizing strategy and a delay-minimizing strategy for i.i.d. wireless channels. Finally, we discuss the applicability of this framework to a few other network scenarios.
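The threshold view of contention resolution can be sketched with a simple simulation. The code below is not the OSA of [9]: it is a generic bisection-style splitting scheme, assuming channel gains that are i.i.d. uniform on [0, 1), idle/success/collision feedback after every contention slot, and a heuristic initial threshold of 1 - 1/N. The function name `resolve_contention` and the `max_slots` guard are hypothetical.

```python
import random


def resolve_contention(gains, max_slots=64):
    """Find the user with the highest gain by threshold splitting.

    In each contention slot, every user whose gain exceeds the current
    threshold transmits. The base station feeds back idle, success or
    collision, and the threshold is moved by bisection.
    Returns (index of the best user, number of slots used).
    """
    n = len(gains)
    if n == 0:
        raise ValueError("need at least one user")
    lo, hi = 0.0, 1.0            # the best gain is known to lie in (lo, hi]
    threshold = 1.0 - 1.0 / n    # heuristic starting point (assumption)
    for slot in range(1, max_slots + 1):
        senders = [i for i, g in enumerate(gains) if g > threshold]
        if len(senders) == 1:    # success: the threshold isolates the best user
            return senders[0], slot
        if not senders:          # idle: the best gain is below the threshold
            hi = threshold
        else:                    # collision: the best gain is above the threshold
            lo = threshold
        threshold = (lo + hi) / 2.0
    raise RuntimeError("contention not resolved within max_slots")


if __name__ == "__main__":
    rng = random.Random(1)
    trials = 2000
    total = 0
    for _ in range(trials):
        gains = [rng.random() for _ in range(10)]
        _, slots = resolve_contention(gains)
        total += slots
    print("average contention slots:", total / trials)
```

Each run ends with a threshold that separates the best gain from all others, which is the random threshold whose entropy the abstract relates to the average resolution delay.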