Unbiased estimators for the variance of MMD estimators


Abstract in English

The maximum mean discrepancy (MMD) is a kernel-based distance between probability distributions useful in many applications (Gretton et al. 2012), bearing a simple estimator with pleasing computational and statistical properties. Being able to efficiently estimate the variance of this estimator is very helpful to various problems in two-sample testing. Towards this end, Bounliphone et al. (2016) used the theory of U-statistics to derive estimators for the variance of an MMD estimator, and differences between two such estimators. Their estimator, however, drops lower-order terms, and is unnecessarily biased. We show in this note - extending and correcting work of Sutherland et al. (2017) - that we can find a truly unbiased estimator for the actual variance of both the squared MMD estimator and the difference of two correlated squared MMD estimators, at essentially no additional computational cost.

Download