ﻻ يوجد ملخص باللغة العربية
We give an $(varepsilon,delta)$-differentially private algorithm for the multi-armed bandit (MAB) problem in the shuffle model with a distribution-dependent regret of $Oleft(left(sum_{ain [k]:Delta_a>0}frac{log T}{Delta_a}right)+frac{ksqrt{logfrac{1}{delta}}log T}{varepsilon}right)$, and a distribution-independent regret of $Oleft(sqrt{kTlog T}+frac{ksqrt{logfrac{1}{delta}}log T}{varepsilon}right)$, where $T$ is the number of rounds, $Delta_a$ is the suboptimality gap of the arm $a$, and $k$ is the total number of arms. Our upper bound almost matches the regret of the best known algorithms for the centralized model, and significantly outperforms the best known algorithm in the local model.
Federated Learning (FL) is a promising machine learning paradigm that enables the analyzer to train a model without collecting users raw data. To ensure users privacy, differentially private federated learning has been intensively studied. The existi
In this paper we study the problem of stochastic multi-armed bandits (MAB) in the (local) differential privacy (DP/LDP) model. Unlike the previous results which need to assume bounded reward distributions, here we mainly focus on the case the reward
We study the best-arm identification problem in multi-armed bandits with stochastic, potentially private rewards, when the goal is to identify the arm with the highest quantile at a fixed, prescribed level. First, we propose a (non-private) successiv
We study locally differentially private (LDP) bandits learning in this paper. First, we propose simple black-box reduction frameworks that can solve a large family of context-free bandits learning problems with LDP guarantee. Based on our frameworks,
In this paper, we study Combinatorial Semi-Bandits (CSB) that is an extension of classic Multi-Armed Bandits (MAB) under Differential Privacy (DP) and stronger Local Differential Privacy (LDP) setting. Since the server receives more information from