No Arabic abstract
Hypothesis Selection is a fundamental distribution learning problem where given a comparator-class $Q={q_1,ldots, q_n}$ of distributions, and a sampling access to an unknown target distribution $p$, the goal is to output a distribution $q$ such that $mathsf{TV}(p,q)$ is close to $opt$, where $opt = min_i{mathsf{TV}(p,q_i)}$ and $mathsf{TV}(cdot, cdot)$ denotes the total-variation distance. Despite the fact that this problem has been studied since the 19th century, its complexity in terms of basic resources, such as number of samples and approximation guarantees, remains unsettled (this is discussed, e.g., in the charming book by Devroye and Lugosi `00). This is in stark contrast with other (younger) learning settings, such as PAC learning, for which these complexities are well understood. We derive an optimal $2$-approximation learning strategy for the Hypothesis Selection problem, outputting $q$ such that $mathsf{TV}(p,q) leq2 cdot opt + eps$, with a (nearly) optimal sample complexity of~$tilde O(log n/epsilon^2)$. This is the first algorithm that simultaneously achieves the best approximation factor and sample complexity: previously, Bousquet, Kane, and Moran (COLT `19) gave a learner achieving the optimal $2$-approximation, but with an exponentially worse sample complexity of $tilde O(sqrt{n}/epsilon^{2.5})$, and Yatracos~(Annals of Statistics `85) gave a learner with optimal sample complexity of $O(log n /epsilon^2)$ but with a sub-optimal approximation factor of $3$.
We study a fundamental problem in Bayesian learning, where the goal is to select a set of data sources with minimum cost while achieving a certain learning performance based on the data streams provided by the selected data sources. First, we show that the data source selection problem for Bayesian learning is NP-hard. We then show that the data source selection problem can be transformed into an instance of the submodular set covering problem studied in the literature, and provide a standard greedy algorithm to solve the data source selection problem with provable performance guarantees. Next, we propose a fast greedy algorithm that improves the running times of the standard greedy algorithm, while achieving performance guarantees that are comparable to those of the standard greedy algorithm. The fast greedy algorithm can also be applied to solve the general submodular set covering problem with performance guarantees. Finally, we validate the theoretical results using numerical examples, and show that the greedy algorithms work well in practice.
Since its introduction a decade ago, emph{relative entropy policy search} (REPS) has demonstrated successful policy learning on a number of simulated and real-world robotic domains, not to mention providing algorithmic components used by many recently proposed reinforcement learning (RL) algorithms. While REPS is commonly known in the community, there exist no guarantees on its performance when using stochastic and gradient-based solvers. In this paper we aim to fill this gap by providing guarantees and convergence rates for the sub-optimality of a policy learned using first-order optimization methods applied to the REPS objective. We first consider the setting in which we are given access to exact gradients and demonstrate how near-optimality of the objective translates to near-optimality of the policy. We then consider the practical setting of stochastic gradients, and introduce a technique that uses emph{generative} access to the underlying Markov decision process to compute parameter updates that maintain favorable convergence to the optimal regularized policy.
We initiate the study of hypothesis selection under local differential privacy. Given samples from an unknown probability distribution $p$ and a set of $k$ probability distributions $mathcal{Q}$, we aim to output, under the constraints of $varepsilon$-local differential privacy, a distribution from $mathcal{Q}$ whose total variation distance to $p$ is comparable to the best such distribution. This is a generalization of the classic problem of $k$-wise simple hypothesis testing, which corresponds to when $p in mathcal{Q}$, and we wish to identify $p$. Absent privacy constraints, this problem requires $O(log k)$ samples from $p$, and it was recently shown that the same complexity is achievable under (central) differential privacy. However, the naive approach to this problem under local differential privacy would require $tilde O(k^2)$ samples. We first show that the constraint of local differential privacy incurs an exponential increase in cost: any algorithm for this problem requires at least $Omega(k)$ samples. Second, for the special case of $k$-wise simple hypothesis testing, we provide a non-interactive algorithm which nearly matches this bound, requiring $tilde O(k)$ samples. Finally, we provide sequentially interactive algorithms for the general case, requiring $tilde O(k)$ samples and only $O(log log k)$ rounds of interactivity. Our algorithms are achieved through a reduction to maximum selection with adversarial comparators, a problem of independent interest for which we initiate study in the parallel setting. For this problem, we provide a family of algorithms for each number of allowed rounds of interaction $t$, as well as lower bounds showing that they are near-optimal for every $t$. Notably, our algorithms result in exponential improvements on the round complexity of previous methods.
Persuasion studies how an informed principal may influence the behavior of agents by the strategic provision of payoff-relevant information. We focus on the fundamental multi-receiver model by Arieli and Babichenko (2019), in which there are no inter-agent externalities. Unlike prior works on this problem, we study the public persuasion problem in the general setting with: (i) arbitrary state spaces; (ii) arbitrary action spaces; (iii) arbitrary senders utility functions. We fully characterize the computational complexity of computing a bi-criteria approximation of an optimal public signaling scheme. In particular, we show, in a voting setting of independent interest, that solving this problem requires at least a quasi-polynomial number of steps even in settings with a binary action space, assuming the Exponential Time Hypothesis. In doing so, we prove that a relaxed version of the Maximum Feasible Subsystem of Linear Inequalities problem requires at least quasi-polynomial time to be solved. Finally, we close the gap by providing a quasi-polynomial time bi-criteria approximation algorithm for arbitrary public persuasion problems that, in specific settings, yields a QPTAS.
The performance of a machine learning system is usually evaluated by using i.i.d. observations with true labels. However, acquiring ground truth labels is expensive, while obtaining unlabeled samples may be cheaper. Stratified sampling can be beneficial in such settings and can reduce the number of true labels required without compromising the evaluation accuracy. Stratified sampling exploits statistical properties (e.g., variance) across strata of the unlabeled population, though usually under the unrealistic assumption that these properties are known. We propose two new algorithms that simultaneously estimate these properties and optimize the evaluation accuracy. We construct a lower bound to show the proposed algorithms (to log-factors) are rate optimal. Experiments on synthetic and real data show the reduction in label complexity that is enabled by our algorithms.