Do you want to publish a course? Click here

Sharp regret bounds for empirical Bayes and compound decision problems

164   0   0.0 ( 0 )
 Added by Yury Polyanskiy
 Publication date 2021
and research's language is English




Ask ChatGPT about the research

We consider the classical problems of estimating the mean of an $n$-dimensional normally (with identity covariance matrix) or Poisson distributed vector under the squared loss. In a Bayesian setting the optimal estimator is given by the prior-dependent conditional mean. In a frequentist setting various shrinkage methods were developed over the last century. The framework of empirical Bayes, put forth by Robbins (1956), combines Bayesian and frequentist mindsets by postulating that the parameters are independent but with an unknown prior and aims to use a fully data-driven estimator to compete with the Bayesian oracle that knows the true prior. The central figure of merit is the regret, namely, the total excess risk over the Bayes risk in the worst case (over the priors). Although this paradigm was introduced more than 60 years ago, little is known about the asymptotic scaling of the optimal regret in the nonparametric setting. We show that for the Poisson model with compactly supported and subexponential priors, the optimal regret scales as $Theta((frac{log n}{loglog n})^2)$ and $Theta(log^3 n)$, respectively, both attained by the original estimator of Robbins. For the normal mean model, the regret is shown to be at least $Omega((frac{log n}{loglog n})^2)$ and $Omega(log^2 n)$ for compactly supported and subgaussian priors, respectively, the former of which resolves the conjecture of Singh (1979) on the impossibility of achieving bounded regret; before this work, the best regret lower bound was $Omega(1)$. In addition to the empirical Bayes setting, these results are shown to hold in the compound setting where the parameters are deterministic. As a side application, the construction in this paper also leads to improved or new lower bounds for density estimation of Gaussian and Poisson mixtures.



rate research

Read More

Most bandit algorithm designs are purely theoretical. Therefore, they have strong regret guarantees, but also are often too conservative in practice. In this work, we pioneer the idea of algorithm design by minimizing the empirical Bayes regret, the average regret over problem instances sampled from a known distribution. We focus on a tractable instance of this problem, the confidence interval and posterior width tuning, and propose an efficient algorithm for solving it. The tuning algorithm is analyzed and evaluated in multi-armed, linear, and generalized linear bandits. We report several-fold reductions in Bayes regret for state-of-the-art bandit algorithms, simply by optimizing over a small sample from a distribution.
Record linkage involves merging records in large, noisy databases to remove duplicate entities. It has become an important area because of its widespread occurrence in bibliometrics, public health, official statistics production, political science, and beyond. Traditional linkage methods directly linking records to one another are computationally infeasible as the number of records grows. As a result, it is increasingly common for researchers to treat record linkage as a clustering task, in which each latent entity is associated with one or more noisy database records. We critically assess performance bounds using the Kullback-Leibler (KL) divergence under a Bayesian record linkage framework, making connections to Kolchin partition models. We provide an upper bound using the KL divergence and a lower bound on the minimum probability of misclassifying a latent entity. We give insights for when our bounds hold using simulated data and provide practical user guidance.
We establish a phase transition known as the all-or-nothing phenomenon for noiseless discrete channels. This class of models includes the Bernoulli group testing model and the planted Gaussian perceptron model. Previously, the existence of the all-or-nothing phenomenon for such models was only known in a limited range of parameters. Our work extends the results to all signals with arbitrary sublinear sparsity. Over the past several years, the all-or-nothing phenomenon has been established in various models as an outcome of two seemingly disjoint results: one positive result establishing the all half of all-or-nothing, and one impossibility result establishing the nothing half. Our main technique in the present work is to show that for noiseless discrete channels, the all half implies the nothing half, that is a proof of all can be turned into a proof of nothing. Since the all half can often be proven by straightforward means -- for instance, by the first-moment method -- our equivalence gives a powerful and general approach towards establishing the existence of this phenomenon in other contexts.
This paper studies the problem of recovering the hidden vertex correspondence between two edge-correlated random graphs. We focus on the Gaussian model where the two graphs are complete graphs with correlated Gaussian weights and the ErdH{o}s-Renyi model where the two graphs are subsampled from a common parent ErdH{o}s-Renyi graph $mathcal{G}(n,p)$. For dense graphs with $p=n^{-o(1)}$, we prove that there exists a sharp threshold, above which one can correctly match all but a vanishing fraction of vertices and below which correctly matching any positive fraction is impossible, a phenomenon known as the all-or-nothing phase transition. Even more strikingly, in the Gaussian setting, above the threshold all vertices can be exactly matched with high probability. In contrast, for sparse ErdH{o}s-Renyi graphs with $p=n^{-Theta(1)}$, we show that the all-or-nothing phenomenon no longer holds and we determine the thresholds up to a constant factor. Along the way, we also derive the sharp threshold for exact recovery, sharpening the existing results in ErdH{o}s-Renyi graphs. The proof of the negative results builds upon a tight characterization of the mutual information based on the truncated second-moment computation and an area theorem that relates the mutual information to the integral of the reconstruction error. The positive results follows from a tight analysis of the maximum likelihood estimator that takes into account the cycle structure of the induced permutation on the edges.
Consider a population consisting of n individuals, each of whom has one of d types (e.g. their blood type, in which case d=4). We are allowed to query this database by specifying a subset of the population, and in response we observe a noiseless histogram (a d-dimensional vector of counts) of types of the pooled individuals. This measurement model arises in practical situations such as pooling of genetic data and may also be motivated by privacy considerations. We are interested in the number of queries one needs to unambiguously determine the type of each individual. In this paper, we study this information-theoretic question under the random, dense setting where in each query, a random subset of individuals of size proportional to n is chosen. This makes the problem a particular example of a random constraint satisfaction problem (CSP) with a planted solution. We establish almost matching upper and lower bounds on the minimum number of queries m such that there is no solution other than the planted one with probability tending to 1 as n tends to infinity. Our proof relies on the computation of the exact annealed free energy of this model in the thermodynamic limit, which corresponds to the exponential rate of decay of the expected number of solution to this planted CSP. As a by-product of the analysis, we show an identity of independent interest relating the Gaussian integral over the space of Eulerian flows of a graph to its spanning tree polynomial.
comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا