Research papers and master's and doctoral theses published by Odalric-Ambrym Maillard

A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences

241 - Odalric-Ambrym Maillard, Gilles Stoltz (DMA), 2011

We consider a Kullback-Leibler-based algorithm for the stochastic multi-armed bandit problem in the case of distributions with finite supports (not necessarily known beforehand), whose asymptotic regret matches the lower bound of Burnetas and Katehakis (1996). Our contribution is to provide a finite-time analysis of this algorithm; we get bounds whose main terms are smaller than those of previously known algorithms with finite-time analyses (such as UCB-type algorithms).
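
To make the idea of a Kullback-Leibler-based index concrete, here is a minimal sketch. It is not the algorithm analyzed in the paper, which handles general finite supports; it swaps in the simpler, well-known KL-UCB-style index for Bernoulli rewards. The function names, the exploration budget `log(t)`, and the bisection tolerance are all assumptions chosen for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two distributions on the same finite support.

    Returns math.inf when q gives zero mass to a point that p charges.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf
            total += pi * math.log(pi / qi)
    return total


def kl_bernoulli(p, q, eps=1e-12):
    """KL(Ber(p) || Ber(q)), clipping both means away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return kl_divergence([p, 1 - p], [q, 1 - q])


def kl_ucb_index(mean, pulls, t, tol=1e-6):
    """Largest q in [mean, 1] with pulls * KL(mean, q) <= log(t).

    Illustrative assumption: exploration budget log(t), Bernoulli rewards.
    Bisection works because KL(mean, q) is increasing in q for q >= mean.
    """
    budget = math.log(t) / pulls
    lo, hi = mean, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if kl_bernoulli(mean, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo
```

At each round, a strategy of this kind would pull the arm with the largest index. The index shrinks toward the empirical mean as an arm is pulled more often, which is what drives exploration toward under-sampled arms.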

Statistics Theory
