Data Augmentation for Histopathological Images Based on Gaussian-Laplacian Pyramid Blending


Abstract in English

Data imbalance is a major problem that affects many machine learning (ML) algorithms. It is troublesome because most ML algorithms optimize a loss function that does not account for the imbalance, so they tend to produce a trivial model biased toward predicting the most frequent class in the training data. In the case of histopathologic images (HIs), both low-level and high-level data augmentation (DA) techniques still present performance issues in the presence of inter-patient variability, because the model tends to learn color representations that are related to the staining process. In this paper, we propose a novel approach that not only augments an HI dataset but also distributes the inter-patient variability by means of image blending with the Gaussian-Laplacian pyramid. The approach computes the Gaussian pyramids of two images from different patients and derives their Laplacian pyramids. The left half of one HI and the right half of the other are then joined at each level of the Laplacian pyramid, and a composite image is reconstructed from the joined pyramid. This composition combines the stain variation of two patients, preventing color differences from misleading the learning process. Experimental results on the BreakHis dataset show promising gains over the majority of DA techniques presented in the literature.
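The blending procedure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the 5-tap binomial kernel, the nearest-neighbour upsampling followed by a blur, the default of four levels, and all function names (`blur`, `gaussian_pyramid`, `laplacian_pyramid`, `reconstruct`, `blend_halves`) are assumptions chosen for illustration, and the two inputs are assumed to be arrays of identical shape.

```python
import numpy as np

# 5-tap binomial kernel, a common Gaussian approximation (assumed here).
KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0

def blur(img):
    """Separable blur over the first two axes with edge replication."""
    pad = [(2, 2), (2, 2)] + [(0, 0)] * (img.ndim - 2)
    p = np.pad(img, pad, mode="edge")
    h, w = img.shape[:2]
    v = sum(KERNEL[k] * p[k:k + h, :] for k in range(5))   # vertical pass
    return sum(KERNEL[k] * v[:, k:k + w] for k in range(5))  # horizontal pass

def upsample(img, shape):
    """Double the resolution, crop to the target shape, then smooth."""
    up = img.repeat(2, axis=0).repeat(2, axis=1)[:shape[0], :shape[1]]
    return blur(up)

def gaussian_pyramid(img, levels):
    """Repeatedly blur and drop every second row and column."""
    pyr = [np.asarray(img, dtype=np.float64)]
    for _ in range(levels):
        pyr.append(blur(pyr[-1])[::2, ::2])
    return pyr

def laplacian_pyramid(img, levels):
    """Each level stores the detail lost between two Gaussian levels."""
    g = gaussian_pyramid(img, levels)
    lap = [g[i] - upsample(g[i + 1], g[i].shape) for i in range(levels)]
    lap.append(g[-1])  # the coarsest level keeps the low-pass residual
    return lap

def reconstruct(lap):
    """Collapse a Laplacian pyramid from coarsest to finest level."""
    img = lap[-1]
    for level in reversed(lap[:-1]):
        img = upsample(img, level.shape) + level
    return img

def blend_halves(img_a, img_b, levels=4):
    """Join the left half of img_a and the right half of img_b per level."""
    lap_a = laplacian_pyramid(img_a, levels)
    lap_b = laplacian_pyramid(img_b, levels)
    joined = []
    for la, lb in zip(lap_a, lap_b):
        mid = la.shape[1] // 2
        joined.append(np.concatenate([la[:, :mid], lb[:, mid:]], axis=1))
    return reconstruct(joined)

# Usage with random arrays standing in for two patients' RGB images.
rng = np.random.default_rng(0)
patient_a = rng.random((64, 64, 3))
patient_b = rng.random((64, 64, 3))
augmented = blend_halves(patient_a, patient_b)
```

Because the halves are joined at every pyramid level rather than in pixel space, the coarse levels mix the two images over a wide band around the seam while the fine levels keep the transition narrow, which is what makes the seam less visible than a direct cut-and-paste.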
