Histogram Transform Ensembles for Density Estimation


Abstract in English

We investigate a density estimator named histogram transform ensembles (HTE), whose effectiveness is supported by both rigorous theoretical analysis and strong experimental performance. On the theoretical side, we decompose the error into an approximation error and an estimation error, which allows the following analysis. First, we establish universal consistency under the $L_1(\mu)$-norm. Second, under the assumption that the underlying density function lies in the Hölder space $C^{0,\alpha}$, we prove almost optimal convergence rates for both single and ensemble density estimators under the $L_1(\mu)$-norm and the $L_{\infty}(\mu)$-norm for distributions with different tail behaviors. In contrast, for its subspace $C^{1,\alpha}$ of smoother functions, almost optimal convergence rates can be established only for the ensembles, and a lower bound for the single estimators illustrates the benefit of ensembles over single density estimators. In the experiments, we first carry out simulations showing that histogram transform ensembles outperform single histogram transforms, which provides empirical support for the theoretical results in the space $C^{1,\alpha}$. Moreover, to further improve empirical performance, we propose an adaptive version of HTE and study its parameters on several synthetic datasets that vary in dimension and distribution. Finally, real-data comparisons with other state-of-the-art density estimators demonstrate the accuracy of the adaptive HTE algorithm.
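The abstract does not spell out how a histogram transform is constructed, so the following is only a minimal sketch of the general idea under stated assumptions: each transform is taken to be a random rotation, a per-axis stretch, and a translation, the bins are unit cells in the transformed space, and the ensemble averages the resulting histogram density estimates. The function names and the `scale_range` parameter are hypothetical, and the adaptive variant mentioned above is not covered.

```python
import numpy as np

def random_transform(d, rng, scale_range):
    # Assumed form of a transform: rotation Q, per-axis stretch s, shift b.
    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random orthogonal matrix
    s = rng.uniform(scale_range[0], scale_range[1], size=d)
    b = rng.uniform(0.0, 1.0, size=d)
    return Q, s, b

def bin_index(X, Q, s, b):
    # Map each row x to floor(diag(s) @ Q @ x + b): the unit cell it falls in.
    return np.floor((X @ Q.T) * s + b).astype(np.int64)

def fit_hte(X, n_estimators=20, scale_range=(1.0, 3.0), seed=0):
    # Store, for each random transform, the number of samples per occupied cell.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    model = []
    for _ in range(n_estimators):
        Q, s, b = random_transform(d, rng, scale_range)
        counts = {}
        for key in map(tuple, bin_index(X, Q, s, b)):
            counts[key] = counts.get(key, 0) + 1
        model.append((Q, s, b, counts))
    return model, n

def predict_density(model, n, X_new):
    # Average the single histogram estimates count / (n * cell volume).
    est = np.zeros(len(X_new))
    for Q, s, b, counts in model:
        vol = 1.0 / np.prod(s)  # rotation preserves volume; stretch scales it
        keys = map(tuple, bin_index(X_new, Q, s, b))
        est += np.array([counts.get(k, 0) for k in keys]) / (n * vol)
    return est / len(model)
```

Setting `n_estimators=1` gives a single histogram transform estimator, which makes it easy to compare single and ensemble estimates on the same data. Larger values in `scale_range` correspond to smaller cells in the input space.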
