In this paper we apply several information distortion techniques to a set of classical books written in English. We study the impact of these distortions on the Kolmogorov complexity of the books and on the clustering by compression technique, which is based on the Normalized Compression Distance (NCD). We show how to decrease the complexity of the books by introducing several kinds of modifications. We measure how well the information contained in each book is preserved using a clustering error measure. We find experimentally that the best way to keep the clustering error low is to modify the most frequent words. We describe these distortions in detail and compare them with other kinds of modifications, such as random word distortions and infrequent word distortions. Finally, we present some phenomenological explanations of the empirical results.
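As a rough illustration of the two ingredients named in the abstract, the following Python sketch computes an NCD estimate and applies a frequent-word distortion. It rests on several assumptions that are not taken from the paper: zlib stands in for the compressor, the distortion replaces each letter of a selected word with an asterisk, and the names `compressed_size`, `ncd`, `distort_frequent_words`, and `top_k` are hypothetical. The NCD formula used is the standard one, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length.

```python
import re
import zlib
from collections import Counter


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, used as a stand-in for C(x).

    zlib is an assumption of this sketch; any real compressor only
    approximates Kolmogorov complexity from above.
    """
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance between two byte strings."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def distort_frequent_words(text: str, top_k: int) -> str:
    """Mask the top_k most frequent words with asterisks.

    Each letter of a selected word becomes '*', so the text length and
    the positions of all other words are unchanged. This masking scheme
    is an illustrative assumption, not necessarily the paper's.
    """
    words = re.findall(r"[A-Za-z]+", text.lower())
    frequent = {word for word, _ in Counter(words).most_common(top_k)}

    def mask(match: re.Match) -> str:
        word = match.group(0)
        return "*" * len(word) if word.lower() in frequent else word

    return re.sub(r"[A-Za-z]+", mask, text)


# Small usage example on toy strings (not on the books from the paper).
sample = "the cat and the dog"
print(distort_frequent_words(sample, top_k=1))  # prints "*** cat and *** dog"

a = b"the quick brown fox jumps over the lazy dog " * 50
b = b"pack my box with five dozen liquor jugs " * 50
print(ncd(a, a), ncd(a, b))  # a text is closer to itself than to another text
```

In a clustering-by-compression experiment, the pairwise `ncd` values between all books (original or distorted) would form the distance matrix passed to a clustering algorithm; that step is omitted here.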
While Kolmogorov complexity is the accepted absolute measure of information content in an individual finite object, a similarly absolute notion is needed for the information distance between two individual objects, for example, two pictures. We give
The normalized information distance is a universal distance measure for objects of all kinds. It is based on Kolmogorov complexity and thus uncomputable, but there are ways to utilize it. First, compression algorithms can be used to approximate the K
The objective of this paper is to further investigate various applications of information Nonanticipative Rate Distortion Function (NRDF) by discussing two working examples, the Binary Symmetric Markov Source with parameter $p$ (BSMS($p$)) with Hammi
A rate-distortion problem motivated by the consideration of semantic information is formulated and solved. The starting point is to model an information source as a pair consisting of an intrinsic state which is not observable, corresponding to the s
According to Kolmogorov complexity, every finite binary string is compressible to a shortest code -- its information content -- from which it is effectively recoverable. We investigate the extent to which this holds for infinite binary sequences (str