Tim van Erven (2014)
When I first encountered PAC-Bayesian concentration inequalities they seemed to me rather disconnected from good old-fashioned results like Hoeffding's and Bernstein's inequalities. But, at least for one flavour of the PAC-Bayesian bounds, there is actually a very close relation, and the main innovation is a continuous version of the union bound, along with some ingenious applications. Here's the gist of what's going on, presented from a machine learning perspective.
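To make the connection concrete, here is the standard change-of-measure step, stated as a sketch (the details of the derivation in the post may differ): fix a "prior" distribution $\pi$ over hypotheses, chosen before seeing the data, and let $f(h)$ be any measurable function of a hypothesis and the sample. Then Markov's inequality combined with the Donsker-Varadhan variational formula gives, with probability at least $1-\delta$ over the sample, simultaneously for all "posterior" distributions $\rho$,

$$\mathbb{E}_{h \sim \rho}[f(h)] \;\le\; \mathrm{KL}(\rho \,\|\, \pi) + \log \mathbb{E}_{h \sim \pi}\,\mathbb{E}\big[e^{f(h)}\big] + \log\tfrac{1}{\delta}.$$

Taking $f(h) = \lambda\,(\mathrm{risk}(h) - \widehat{\mathrm{risk}}(h))$ and bounding the moment generating function with Hoeffding's or Bernstein's lemma yields a PAC-Bayesian analogue of the corresponding classical inequality, while choosing $\rho$ as a point mass on a single hypothesis in a finite class recovers the usual union bound, with $\log(1/\pi(h))$ playing the role of $\log|\mathcal{H}|$; this is the sense in which the bound acts as a continuous union bound.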
Rényi divergence is related to Rényi entropy much like Kullback-Leibler divergence is related to Shannon's entropy, and comes up in many settings. It was introduced by Rényi as a measure of information that satisfies almost the same axioms as Kullback-Leibler divergence, and depends on a parameter that is called its order. In particular, the Rényi divergence of order 1 equals the Kullback-Leibler divergence. We review and extend the most important properties of Rényi divergence and Kullback-Leibler divergence, including convexity, continuity, limits of $\sigma$-algebras and the relation of the special order 0 to the Gaussian dichotomy and contiguity. We also show how to generalize the Pythagorean inequality to orders different from 1, and we extend the known equivalence between channel capacity and minimax redundancy to continuous channel inputs (for all orders) and present several other minimax results.
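For concreteness, the standard definition in the discrete case (the paper treats general measures) is

$$D_\alpha(P\|Q) = \frac{1}{\alpha-1}\,\log \sum_x P(x)^{\alpha} Q(x)^{1-\alpha}, \qquad \alpha \in (0,1)\cup(1,\infty),$$

with the orders $0$, $1$ and $\infty$ defined by taking limits. In particular, $\lim_{\alpha \to 1} D_\alpha(P\|Q) = \sum_x P(x)\log\frac{P(x)}{Q(x)}$, the Kullback-Leibler divergence.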
Most methods for decision-theoretic online learning are based on the Hedge algorithm, which takes a parameter called the learning rate. In most previous analyses the learning rate was carefully tuned to obtain optimal worst-case performance, leading to suboptimal performance on easy instances, for example when there exists an action that is significantly better than all others. We propose a new way of setting the learning rate, which adapts to the difficulty of the learning problem: in the worst case our procedure still guarantees optimal performance, but on easy instances it achieves much smaller regret. In particular, our adaptive method achieves constant regret in a probabilistic setting, when there exists an action that on average obtains strictly smaller loss than all other actions. We also provide a simulation study comparing our approach to existing methods.
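As a point of reference for the setting, the following is a minimal sketch of the basic Hedge (exponential weights) update with a fixed learning rate; the function name, the synthetic loss data and the worst-case tuning of eta are illustrative choices, and the adaptive, data-dependent tuning proposed in the paper is not reproduced here.

import numpy as np

def hedge(losses, eta):
    """Basic Hedge / exponential weights over K actions.

    losses: array of shape (T, K) with per-round losses in [0, 1].
    eta:    fixed learning rate (> 0); the paper replaces this with
            an adaptive, data-dependent choice.
    Returns the algorithm's expected total loss and the best action's loss.
    """
    T, K = losses.shape
    cum_loss = np.zeros(K)          # cumulative loss of each action
    total = 0.0
    for t in range(T):
        # Exponential weights on the cumulative losses seen so far.
        w = np.exp(-eta * (cum_loss - cum_loss.min()))  # shift for numerical stability
        p = w / w.sum()
        total += p @ losses[t]      # expected loss of the randomized play
        cum_loss += losses[t]
    return total, cum_loss.min()

# Example with hypothetical data: K = 3 actions, T = 1000 rounds of random losses,
# using the standard worst-case tuning eta = sqrt(8 ln K / T).
rng = np.random.default_rng(0)
L = rng.uniform(size=(1000, 3))
alg_loss, best_loss = hedge(L, eta=np.sqrt(8 * np.log(3) / 1000))
print(alg_loss - best_loss)         # regret against the best fixed action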