ﻻ يوجد ملخص باللغة العربية
This is an up-to-date introduction to and overview of the Minimum Description Length (MDL) Principle, a theory of inductive inference that can be applied to general problems in statistics, machine learning and pattern recognition. While MDL was originally based on data compression ideas, this introduction can be read without any knowledge thereof. It takes into account all major developments since 2007, the last time an extensive overview was written. These include new methods for model selection and averaging and hypothesis testing, as well as the first completely general definition of {em MDL estimators}. Incorporating these developments, MDL can be seen as a powerful extension of both penalized likelihood and Bayesian approaches, in which penalization functions and prior distributions are replaced by more general luckiness functions, average-case methodology is replaced by a more robust worst-case approach, and in which methods classically viewed as highly distinct, such as AIC vs BIC and cross-validation vs Bayes can, to a large extent, be viewed from a unified perspective.
We discuss work extraction from classical information engines (e.g., Szilard) with $N$-particles, $q$ partitions, and initial arbitrary non-equilibrium states. In particular, we focus on their {em optimal} behaviour, which includes the measurement of
Modern statistical modeling is an important complement to the more traditional approach of physics where Complex Systems are studied by means of extremely simple idealized models. The Minimum Description Length (MDL) is a principled approach to stati
The ability to identify interesting and repetitive substructures is an essential component to discovering knowledge in structural data. We describe a new version of our SUBDUE substructure discovery system based on the minimum description length prin
Fast and effective unsupervised anomaly detection algorithms have been proposed for categorical data based on the minimum description length (MDL) principle. However, they can be ineffective when detecting anomalies in heterogeneous datasets represen
A path or a polygonal domain is C-oriented if the orientations of its edges belong to a set of C given orientations; this is a generalization of the notable rectilinear case (C = 2). We study exact and approximation algorithms for minimum-link C-orie