Recently, there has been a growing interest in the problem of learning rich implicit models - those from which we can sample, but can not evaluate their density. These models apply some parametric function, such as a deep network, to a base measure, and are learned end-to-end using stochastic optimization. One strategy of devising a loss function is through the statistics of two sample tests - if we can fool a statistical test, the learned distribution should be a good model of the true data. However, not all tests can easily fit into this framework, as they might not be differentiable with respect to the data points, and hence with respect to the parameters of the implicit model. Motivated by this problem, in this paper we show how two such classical tests, the Friedman-Rafsky and k-nearest neighbour tests, can be effectively smoothed using ideas from undirected graphical models - the matrix tree theorem and cardinality potentials. Moreover, as we show experimentally, smoothing can significantly increase the power of the test, which might of of independent interest. Finally, we apply our method to learn implicit models.