This article compares ten recently proposed neural networks and proposes two ensemble neural network-based models for blood glucose prediction. All of them are tested under the same dataset, preprocessing workflow, and tools using the OhioT1DM Dataset at three different prediction horizons: 30, 60, and 120 minutes. We compare their performance using the most common metrics in blood glucose prediction and rank the best-performing ones using three methods devised for the statistical comparison of the performance of multiple algorithms: scmamp, model confidence set, and superior predictive ability. Our analysis highlights those models with the highest probability of being the best predictors, estimates the increase in error of the models that perform more poorly with respect to the best ones, and provides a guide for their use in clinical practice.