
Extreme Bandits using Robust Statistics

Added by Ping Li
Publication date: 2021
Language: English

We consider a multi-armed bandit problem motivated by situations where only the extreme values, as opposed to the expected values of the classical bandit setting, are of interest. We propose distribution-free algorithms based on robust statistics and characterize their statistical properties. We show that the proposed algorithms achieve vanishing extremal regret under weaker conditions than existing algorithms. We demonstrate finite-sample performance through numerical experiments, which show that the proposed algorithms outperform well-known baselines.
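
As a hedged illustration of the extremal objective, the sketch below favors the arm whose high empirical quantile of observed rewards is largest; the quantile is one robust, distribution-free index. This is only an illustrative policy under an assumed `pull(arm)` reward oracle, not the paper's algorithm.

```python
import numpy as np

def extreme_bandit_sketch(pull, n_arms, horizon, q=0.9, warmup=10):
    """Illustrative sketch only, not the paper's algorithm: favor the arm
    whose empirical upper quantile of observed rewards is largest. The
    quantile is a robust, distribution-free index suited to the extremal
    objective (maximize the single best reward seen over the horizon)."""
    rewards = [[] for _ in range(n_arms)]
    best_seen = -np.inf
    for t in range(horizon):
        if t < n_arms * warmup:
            arm = t % n_arms                      # round-robin warm-up
        else:
            arm = int(np.argmax([np.quantile(r, q) for r in rewards]))
        x = pull(arm)                             # assumed reward oracle
        rewards[arm].append(x)
        best_seen = max(best_seen, x)
    return best_seen

# Example: two heavy-tailed arms; the smaller Pareto exponent has heavier
# tails and therefore larger extremes, so the policy should lock onto arm 1.
rng = np.random.default_rng(1)
arms = [lambda: rng.pareto(3.0), lambda: rng.pareto(1.5)]
print(extreme_bandit_sketch(lambda a: arms[a](), n_arms=2, horizon=2000))
```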

Related research

Stochastic linear contextual bandit algorithms have substantial practical applications, such as recommender systems, online advertising, and clinical trials. Recent work shows that optimal bandit algorithms are vulnerable to adversarial attacks and can fail completely in their presence. Existing robust bandit algorithms work only in the non-contextual setting under reward attacks and cannot improve robustness in the general and popular contextual bandit environment. In addition, none of the existing methods can defend against attacks on the context. In this work, we provide the first robust bandit algorithm for the stochastic linear contextual bandit setting under a fully adaptive and omniscient attack. Our algorithm works not only under reward attacks but also under attacks on the context. Moreover, it needs no information about the attack budget or the particular form of the attack. We provide theoretical guarantees for the proposed algorithm and show through extensive experiments that it significantly improves robustness against various popular attacks.

We propose a generalized formulation of the Huber loss. We show that with a suitable choice of function, specifically the log-exp transform, we obtain a loss function that combines the desirable properties of both the absolute and the quadratic loss. We provide an algorithm to find the minimizer of such loss functions and show that finding a centralizing metric is not much harder than computing the traditional mean or median.
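
One concrete instance of such a log-exp smoothing is the log-cosh loss, quadratic near zero and asymptotically linear in the tails. The hedged sketch below (my choice of instance, not necessarily the paper's exact transform) finds its minimizer, a "centralizing metric", by gradient descent on a convex objective.

```python
import numpy as np

def log_cosh_center(data, lr=0.5, tol=1e-10, max_iter=10_000):
    """Minimize sum_i log(cosh(theta - x_i)) over theta. log-cosh is one
    log-exp-style loss: ~ x^2/2 near zero (quadratic) and ~ |x| - log 2 in
    the tails (absolute). Illustrative instance, not necessarily the paper's
    exact transform. The objective is convex, so plain gradient descent
    converges; d/dtheta log cosh(theta - x) = tanh(theta - x)."""
    data = np.asarray(data, dtype=float)
    theta = np.median(data)                       # robust starting point
    for _ in range(max_iter):
        step = lr * np.tanh(theta - data).mean()  # averaged gradient
        theta -= step
        if abs(step) < tol:
            break
    return theta

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 200), [500.0]])  # one gross outlier
print(np.mean(x), log_cosh_center(x))  # mean is dragged; the center is not
```
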
The Multi-Armed Bandits (MAB) framework highlights the tension between acquiring new knowledge (exploration) and leveraging available knowledge (exploitation). In the classical MAB problem, a decision maker must choose an arm at each time step, upon which she receives a reward. The decision maker's objective is to maximize her cumulative expected reward over the time horizon. The MAB problem has been studied extensively, mostly under the assumption that the arms' reward distributions are stationary, or quasi-stationary, over time. We consider a variant of the MAB framework, which we term Rotting Bandits, where each arm's expected reward decays as a function of the number of times it has been pulled. We are motivated by many real-world scenarios such as online advertising, content recommendation, and crowdsourcing. We present algorithms, accompanied by simulations, and derive theoretical guarantees.
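
To make the decay concrete, the hedged sketch below implements a simple sliding-window heuristic in the spirit of the rotting setting (an illustration under an assumed `pull(arm)` oracle, not necessarily the paper's algorithm): since an arm's expected reward depends on how many times it has been pulled, only each arm's most recent own rewards are informative.

```python
import numpy as np

def rotting_sliding_window(pull, n_arms, horizon, window=10):
    """Hedged sketch, not necessarily the paper's algorithm: pull the arm
    whose average over its own last `window` rewards is highest. Recency
    matters because a rotting arm's mean decays with its pull count, so
    old samples of a much-pulled arm overestimate its current value."""
    history = [[] for _ in range(n_arms)]
    total = 0.0
    for t in range(horizon):
        if t < n_arms:
            arm = t                               # initialize each arm once
        else:
            arm = int(np.argmax([np.mean(h[-window:]) for h in history]))
        r = pull(arm)                             # assumed reward oracle
        history[arm].append(r)
        total += r
    return total

# Example: arm 0 starts better but rots; arm 1 is steady. The policy should
# switch to arm 1 once arm 0's recent rewards fall below 0.3.
rng = np.random.default_rng(0)
pulls = [0, 0]
def pull(arm):
    pulls[arm] += 1
    mean = 1.0 / np.sqrt(pulls[0]) if arm == 0 else 0.3
    return mean + 0.05 * rng.standard_normal()
print(rotting_sliding_window(pull, n_arms=2, horizon=1000))
```
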
Stochastic Lipschitz bandit algorithms balance exploration and exploitation, and have been used for a variety of important task domains. In this paper, we present a framework for Lipschitz bandit methods that adaptively learns partitions of context- and arm-space. Due to this flexibility, the algorithm is able to efficiently optimize rewards and minimize regret, by focusing on the portions of the space that are most relevant. In our analysis, we link tree-based methods to Gaussian processes. In light of our analysis, we design a novel hierarchical Bayesian model for Lipschitz bandit problems. Our experiments show that our algorithms can achieve state-of-the-art performance in challenging real-world tasks such as neural network hyperparameter tuning.
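
A minimal, hedged sketch of the adaptive-partitioning idea follows: a generic zooming-style scheme on [0, 1] with an assumed reward oracle `f`, not the paper's tree or Gaussian-process machinery. It pulls the cell with the largest optimistic index and splits a cell once its statistical uncertainty falls below its diameter, so sampling concentrates on the most relevant region.

```python
import numpy as np

def adaptive_partition_ucb(f, horizon, c=1.0):
    """Hedged zooming-style sketch, not the paper's algorithm: each cell of
    [0, 1] keeps [lo, hi, pulls, reward_sum]. The index adds a confidence
    bonus plus the cell diameter (the Lipschitz bias of playing the centre).
    A cell is split when its sampling error drops below its diameter."""
    cells = [[0.0, 1.0, 0, 0.0]]
    for t in range(1, horizon + 1):
        def index(cell):
            lo, hi, n, s = cell
            if n == 0:
                return float("inf")               # unexplored cells first
            return s / n + c * np.sqrt(np.log(t) / n) + (hi - lo)
        cell = max(cells, key=index)
        lo, hi, n, s = cell
        cell[2], cell[3] = n + 1, s + f(0.5 * (lo + hi))   # play the centre
        if c * np.sqrt(np.log(t) / cell[2]) < (hi - lo):   # refine this cell
            cells.remove(cell)
            mid = 0.5 * (lo + hi)
            cells += [[lo, mid, 0, 0.0], [mid, hi, 0, 0.0]]
    best = max((c_ for c_ in cells if c_[2] > 0), key=lambda c_: c_[3] / c_[2])
    return 0.5 * (best[0] + best[1])              # centre of the best cell

# Example: noisy 1-D reward peaked at x = 0.7; the output should be near 0.7.
rng = np.random.default_rng(0)
f = lambda x: np.exp(-50 * (x - 0.7) ** 2) + 0.1 * rng.standard_normal()
print(adaptive_partition_ucb(f, horizon=5000))
```
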
Multi-player Multi-Armed Bandits (MAB) have been extensively studied in the literature, motivated by applications to Cognitive Radio systems. Driven by such applications as well, we motivate the introduction of several levels of feedback for multi-player MAB algorithms. Most existing work assumes that sensing information is available to the algorithm. Under this assumption, we improve the state-of-the-art lower bound on the regret of any decentralized algorithm and introduce two algorithms, RandTopM and MCTopM, which are shown to empirically outperform existing algorithms. Moreover, we provide strong theoretical guarantees for these algorithms, including a notion of asymptotic optimality in terms of the number of selections of bad arms. We then introduce a promising heuristic, called Selfish, that can operate without sensing information, which is crucial for emerging applications to Internet of Things networks. We investigate the empirical performance of this algorithm and provide first theoretical elements toward understanding its behavior.
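
As a hedged illustration of the sensing-based setting, the sketch below gives a single player's decision step in the spirit of MCTopM, paraphrased from the abstract's description (see the original paper for the exact rule and its guarantees): track a UCB estimate of the top-M arms and re-draw uniformly inside that set after a collision.

```python
import numpy as np

def mctopm_style_step(mean_est, pull_count, t, m, current_arm, collided, rng):
    """Hedged, paraphrased decision step in the spirit of MCTopM, not the
    paper's exact rule: compute the UCB top-M set; chase it if the current
    arm fell out of it; on a collision, re-draw uniformly within the set;
    otherwise stay put so players settle on distinct good arms."""
    ucb = mean_est + np.sqrt(np.log(t + 1) / np.maximum(pull_count, 1))
    top_m = np.argsort(ucb)[-m:]                  # estimated M best arms
    if current_arm not in top_m or collided:
        return int(rng.choice(top_m))             # move/randomize inside top-M
    return current_arm                            # otherwise keep the chair

# Example step: 4 arms, M = 2 players, player sits on arm 3 after a collision.
rng = np.random.default_rng(0)
print(mctopm_style_step(np.array([0.2, 0.5, 0.4, 0.6]),
                        np.array([5, 5, 5, 5]), t=20, m=2,
                        current_arm=3, collided=True, rng=rng))
```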
