Algorithmic Statistics

92 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Paul Vitanyi

تاريخ النشر 2000

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Peter Gacs - Paul Vitanyi (CWIn

نظرية الإحصاء نظرية المعلومات التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

While Kolmogorov complexity is the accepted absolute measure of information content of an individual finite object, a similarly absolute notion is needed for the relation between an individual data sample and an individual model summarizing the information in the data, for example, a finite set (or probability distribution) where the data sample typically came from. The statistical theory based on such relations between individual objects can be called algorithmic statistics, in contrast to classical statistical theory that deals with relations between probabilistic ensembles. We develop the algorithmic theory of statistic, sufficient statistic, and minimal sufficient statistic. This theory is based on two-part codes consisting of the code for the statistic (the model summarizing the regularity, the meaningful information, in the data) and the model-to-data code. In contrast to the situation in probabilistic statistical theory, the algorithmic relation of (minimal) sufficiency is an absolute relation between the individual model and the individual data sample. We distinguish implicit and explicit descriptions of the models. We give characterizations of algorithmic (Kolmogorov) minimal sufficient statistic for all data samples for both description modes--in the explicit mode under some constraints. We also strengthen and elaborate earlier results on the ``Kolmogorov structure function and ``absolutely non-stochastic objects--those rare objects for which the simplest models that summarize their relevant information (minimal sufficient statistics) are at least as complex as the objects themselves. We demonstrate a close relation between the probabilistic notions and the algorithmic ones.

قيم البحث

58 - Ashwin Pananjady , Richard J. Samworth 2020

Motivated by models for multiway comparison data, we consider the problem of estimating a coordinate-wise isotonic function on the domain $[0, 1]^d$ from noisy observations collected on a uniform lattice, but where the design points have been permute d along each dimension. While the univariate and bivariat

نظرية الإحصاء نظرية المعلومات التعلم الآلي

Ricci curvature for parametric statistics via optimal transport

72 - Wuchen Li , Guido Montufar 2018

We elaborate the notion of a Ricci curvature lower bound for parametrized statistical models. Following the seminal ideas of Lott-Strum-Villani, we define this notion based on the geodesic convexity of the Kullback-Leibler divergence in a Wasserstein statistical manifold, that is, a manifold of probability distributions endowed with a Wasserstein metric tensor structure. Within these definitions, the Ricci curvature is related to both, information geometry and Wasserstein geometry. These definitions allow us to formulate bounds on the convergence rate of Wasserstein gradient flows and information functional inequalities in parameter space. We discuss examples of Ricci curvature lower bounds and convergence rates in exponential family models.

نظرية الإحصاء نظرية المعلومات نظرية المعلومات

Estimation of Monge Matrices

270 - Jan-Christian Hutter , Cheng Mao , Philippe Rigollet 2019

Monge matrices and their permut

نظرية الإحصاء نظرية المعلومات التعلم الآلي

Some compact notations for concentration inequalities and user-friendly results

68 - Kaizheng Wang 2019

This paper presents compact notations for concentration inequalities and convenient results to streamline probabilistic analysis. The new expressions describe the typical sizes and tails of random variables, allowing for simple operations without hea vy use of inessential constants. They bridge classical asymptotic notations and modern non-asymptotic tail bounds together. Examples of different kinds demonstrate their efficacy.

نظرية الإحصاء نظرية المعلومات التعلم الآلي

Maximum sampled conditional likelihood for informative subsampling

93 - HaiYing Wang , Jae Kwang Kim 2020

Subsampling is a computationally effective approach to extract information from massive data sets when computing resources are limited. After a subsample is taken from the full data, most available methods use an inverse probability weighted objectiv e function to estimate the model parameters. This type of weighted estimator does not fully utilize information in the selected subsample. In this paper, we propose to use the maximum sampled conditional likelihood estimator (MSCLE) based on the sampled data. We established the asymptotic normality of the MSCLE and prove that its asymptotic variance covariance matrix is the smallest among a class of asymptotically unbiased estimators, including the inverse probability weighted estimator. We further discuss the asymptotic results with the L-optimal subsampling probabilities and illustrate the estimation procedure with generalized linear models. Numerical experiments are provided to evaluate the practical performance of the proposed method.

نظرية الإحصاء نظرية المعلومات التعلم الآلي