Bayesian Entropy Estimation for Countable Discrete Distributions

Published by Il Memming Park
Publication date: 2013
Research field: Informatics Engineering
Language: English





We consider the problem of estimating Shannon's entropy $H$ from discrete data, in cases where the number of possible symbols is unknown or even countably infinite. The Pitman-Yor process, a generalization of the Dirichlet process, provides a tractable prior distribution over the space of countably infinite discrete distributions, and has found major applications in Bayesian non-parametric statistics and machine learning. Here we show that it also provides a natural family of priors for Bayesian entropy estimation, because moments of the induced posterior distribution over $H$ can be computed analytically. We derive formulas for the posterior mean (Bayes least squares estimate) and variance under Dirichlet and Pitman-Yor process priors. Moreover, we show that a fixed Dirichlet or Pitman-Yor process prior implies a narrow prior distribution over $H$, meaning the prior strongly determines the entropy estimate in the under-sampled regime. We derive a family of continuous mixing measures such that the resulting mixture of Pitman-Yor processes produces an approximately flat prior over $H$. We show that the resulting Pitman-Yor Mixture (PYM) entropy estimator is consistent for a large class of distributions. We explore the theoretical properties of the resulting estimator, and show that it performs well both in simulation and in application to real data.
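The claim that posterior moments of $H$ are available in closed form under a fixed Pitman-Yor prior can be illustrated with a small sketch. The abstract does not state the formulas, so the expressions below are an assumption: they are the standard closed forms obtained from the Dirichlet structure of the Pitman-Yor posterior, with concentration `alpha > 0` and discount `0 <= d < 1`, and should be checked against the paper before use. The function names (`digamma`, `py_prior_mean_entropy`, `py_posterior_mean_entropy`) are hypothetical, entropies are in nats, and the sketch covers only a fixed prior, not the PYM mixture estimator itself.

```python
import math


def digamma(x: float) -> float:
    """Digamma function psi(x) for x > 0 (recurrence plus asymptotic series)."""
    if x <= 0.0:
        raise ValueError("this sketch only handles x > 0")
    result = 0.0
    # Shift the argument upward using psi(x) = psi(x + 1) - 1/x.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # Asymptotic expansion: ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - ...
    series = inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 * (1 / 252
             - inv2 * (1 / 240 - inv2 / 132))))
    return result + math.log(x) - 0.5 / x - series


def _check_params(alpha: float, d: float) -> None:
    # Assumed parameter range for this sketch.
    if alpha <= 0.0 or not (0.0 <= d < 1.0):
        raise ValueError("need alpha > 0 and 0 <= d < 1")


def py_prior_mean_entropy(alpha: float, d: float) -> float:
    """Assumed prior mean of H under PY(d, alpha): psi(alpha + 1) - psi(1 - d)."""
    _check_params(alpha, d)
    return digamma(alpha + 1.0) - digamma(1.0 - d)


def py_posterior_mean_entropy(counts, alpha: float, d: float) -> float:
    """Assumed posterior mean of H given symbol counts under a fixed PY(d, alpha) prior."""
    _check_params(alpha, d)
    counts = [n for n in counts if n > 0]   # only observed symbols matter
    n_total = sum(counts)                   # N: number of samples
    k = len(counts)                         # K: number of distinct symbols seen
    total = alpha + n_total
    observed = sum((n - d) * digamma(n - d + 1.0) for n in counts)
    return (digamma(total + 1.0)
            - (alpha + k * d) / total * digamma(1.0 - d)
            - observed / total)


if __name__ == "__main__":
    # Under-sampled data: five samples, all distinct symbols.
    data = [1, 1, 1, 1, 1]
    for alpha in (0.1, 1.0, 10.0, 100.0):
        print(alpha, py_posterior_mean_entropy(data, alpha, d=0.0))
```

Running the loop with only five distinct observations shows the estimate moving substantially with `alpha`, which is the prior sensitivity in the under-sampled regime that motivates mixing over the prior parameters.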




Read also

67 - Jisheng Dai , Xu Bao , Weichao Xu 2016
The performance of existing sparse Bayesian learning (SBL) methods for off-grid DOA estimation depends on the trade-off between accuracy and computational workload. To speed up the off-grid SBL method while retaining reasonable accuracy, this letter describes a computationally efficient root SBL method for off-grid DOA estimation, where a coarse refinable grid, whose sampled locations are viewed as the adjustable parameters, is adopted. We utilize an expectation-maximization (EM) algorithm to iteratively refine this coarse grid, and illustrate that each updated grid point can be simply achieved by the root of a certain polynomial. Simulation results demonstrate that the computational complexity is significantly reduced and the modeling error can be almost eliminated.
82 - Michael B. Baer 2006
In prefix coding over an infinite alphabet, methods that consider specific distributions generally consider those that decline more quickly than a power law (e.g., Golomb coding). Particular power-law distributions, however, model many random variables encountered in practice. For such random variables, compression performance is judged via estimates of expected bits per input symbol. This correspondence introduces a family of prefix codes with an eye towards near-optimal coding of known distributions. Compression performance is precisely estimated for well-known probability distributions using these codes and using previously known prefix codes. One application of these near-optimal codes is an improved representation of rational numbers.
This paper proposes an off-grid channel estimation scheme for orthogonal time-frequency space (OTFS) systems adopting the sparse Bayesian learning (SBL) framework. To avoid channel spreading caused by the fractional delay and Doppler shifts and to fully exploit the channel sparsity in the delay-Doppler (DD) domain, we estimate the original DD domain channel response rather than the effective DD domain channel response as commonly adopted in the literature. OTFS channel estimation is first formulated as a one-dimensional (1D) off-grid sparse signal recovery (SSR) problem based on a virtual sampling grid defined in the DD space, where the on-grid and off-grid components of the delay and Doppler shifts are separated for estimation. In particular, the on-grid components of the delay and Doppler shifts are jointly determined by the entry indices with significant values in the recovered sparse vector. Then, the corresponding off-grid components are modeled as hyper-parameters in the proposed SBL framework, which can be estimated via the expectation-maximization method. To strike a balance between channel estimation performance and computational complexity, we further propose a two-dimensional (2D) off-grid SSR problem via decoupling the delay and Doppler shift estimations. In our developed 1D and 2D off-grid SBL-based channel estimation algorithms, the hyper-parameters are updated alternately for computing the conditional posterior distribution of channels, which can be exploited to reconstruct the effective DD domain channel. Compared with the 1D method, the proposed 2D method enjoys a much lower computational complexity while suffering only slight performance degradation. Simulation results verify the superior performance of the proposed channel estimation schemes over state-of-the-art schemes.
63 - Igal Sason , Sergio Verdu 2017
This paper gives upper and lower bounds on the minimum error probability of Bayesian $M$-ary hypothesis testing in terms of the Arimoto-Renyi conditional entropy of an arbitrary order $\alpha$. The improved tightness of these bounds over their specializ
This paper deals with the state estimation problem in discrete-event systems modeled with nondeterministic finite automata, partially observed via a sensor measuring unit whose measurements (reported observations) may be vitiated by a malicious attacker. The attacks considered in this paper include arbitrary deletions, insertions, or substitutions of observed symbols by taking into account a bounded number of attacks or, more generally, a total cost constraint (assuming that each deletion, insertion, or substitution bears a positive cost to the attacker). An efficient approach is proposed to describe possible sequences of observations that match the one received by the measuring unit, as well as their corresponding state estimates and associated total costs. We develop an algorithm to obtain the least-cost matching sequences by reconstructing only a finite number of possible sequences, which we subsequently use to efficiently perform state estimation. We also develop a technique for verifying tamper-tolerant diagnosability under attacks that involve a bounded number of deletions, insertions, and substitutions (or, more generally, under attacks of bounded total cost) by using a novel structure obtained by attaching attacks and costs to the original plant. The overall construction and verification procedure has complexity O(|X|^2 C^2), where |X| is the number of states of the given finite automaton and C is the maximum total cost that is allowed for all the deletions, insertions, and substitutions. We determine the minimum value of C such that the attacker can coordinate its tampering action to keep the observer indefinitely confused while utilizing a finite number of attacks. Several examples are presented to demonstrate the proposed methods.