Improper or erroneous labelling can pose a hindrance to reliable generalization for supervised learning. This can have negative consequences, especially for critical fields such as healthcare. We propose an effective new approach for learning under extreme label noise, based on under-trained deep ensembles. Each ensemble member is trained with a subset of the training data, to acquire a general overview of the decision boundary separation, without focusing on potentially erroneous details. The accumulated knowledge of the ensemble is combined to form new labels, that determine a better class separation than the original labels. A new model is trained with these labels to generalize reliably despite the label noise. We focus on a healthcare setting and extensively evaluate our approach on the task of sleep apnea detection. For comparison with related work, we additionally evaluate on the task of digit recognition. In our experiments, we observed performance improvement in accuracy from 6.7% up-to 49.3% for the task of digit classification and in kappa from 0.02 up-to 0.55 for the task of sleep apnea detection.