Do you want to publish a course? Click here

Attentive CutMix: An Enhanced Data Augmentation Approach for Deep Learning Based Image Classification

87   0   0.0 ( 0 )
 Added by Devesh Walawalkar
 Publication date 2020
and research's language is English




Ask ChatGPT about the research

Convolutional neural networks (CNN) are capable of learning robust representation with different regularization methods and activations as convolutional layers are spatially correlated. Based on this property, a large variety of regional dropout strategies have been proposed, such as Cutout, DropBlock, CutMix, etc. These methods aim to promote the network to generalize better by partially occluding the discriminative parts of objects. However, all of them perform this operation randomly, without capturing the most important region(s) within an object. In this paper, we propose Attentive CutMix, a naturally enhanced augmentation strategy based on CutMix. In each training iteration, we choose the most descriptive regions based on the intermediate attention maps from a feature extractor, which enables searching for the most discriminative parts in an image. Our proposed method is simple yet effective, easy to implement and can boost the baseline significantly. Extensive experiments on CIFAR-10/100, ImageNet datasets with various CNN architectures (in a unified setting) demonstrate the effectiveness of our proposed method, which consistently outperforms the baseline CutMix and other methods by a significant margin.



rate research

Read More

Data augmentation is a key practice in machine learning for improving generalization performance. However, finding the best data augmentation hyperparameters requires domain knowledge or a computationally demanding search. We address this issue by proposing an efficient approach to automatically train a network that learns an effective distribution of transformations to improve its generalization. Using bilevel optimization, we directly optimize the data augmentation parameters using a validation set. This framework can be used as a general solution to learn the optimal data augmentation jointly with an end task model like a classifier. Results show that our joint training method produces an image classification accuracy that is comparable to or better than carefully hand-crafted data augmentation. Yet, it does not need an expensive external validation loop on the data augmentation hyperparameters.
68 - Feng Cen 2020
Due to the difficulty in acquiring massive task-specific occluded images, the classification of occluded images with deep convolutional neural networks (CNNs) remains highly challenging. To alleviate the dependency on large-scale occluded image datasets, we propose a novel approach to improve the classification accuracy of occluded images by fine-tuning the pre-trained models with a set of augmented deep feature vectors (DFVs). The set of augmented DFVs is composed of original DFVs and pseudo-DFVs. The pseudo-DFVs are generated by randomly adding difference vectors (DVs), extracted from a small set of clean and occluded image pairs, to the real DFVs. In the fine-tuning, the back-propagation is conducted on the DFV data flow to update the network parameters. The experiments on various datasets and network structures show that the deep feature augmentation significantly improves the classification accuracy of occluded images without a noticeable influence on the performance of clean images. Specifically, on the ILSVRC2012 dataset with synthetic occluded images, the proposed approach achieves 11.21% and 9.14% average increases in classification accuracy for the ResNet50 networks fine-tuned on the occlusion-exclusive and occlusion-inclusive training sets, respectively.
Data augmentation is an essential part of the training process applied to deep learning models. The motivation is that a robust training process for deep learning models depends on large annotated datasets, which are expensive to be acquired, stored and processed. Therefore a reasonable alternative is to be able to automatically generate new annotated training samples using a process known as data augmentation. The dominant data augmentation approach in the field assumes that new training samples can be obtained via random geometric or appearance transformations applied to annotated training samples, but this is a strong assumption because it is unclear if this is a reliable generative model for producing new training samples. In this paper, we provide a novel Bayesian formulation to data augmentation, where new annotated training points are treated as missing variables and generated based on the distribution learned from the training set. For learning, we introduce a theoretically sound algorithm --- generalised Monte Carlo expectation maximisation, and demonstrate one possible implementation via an extension of the Generative Adversarial Network (GAN). Classification results on MNIST, CIFAR-10 and CIFAR-100 show the better performance of our proposed method compared to the current dominant data augmentation approach mentioned above --- the results also show that our approach produces better classification results than similar GAN models.
Deep learning has shown great promise for CT image reconstruction, in particular to enable low dose imaging and integrated diagnostics. These merits, however, stand at great odds with the low availability of diverse image data which are needed to train these neural networks. We propose to overcome this bottleneck via a deep reinforcement learning (DRL) approach that is integrated with a style-transfer (ST) methodology, where the DRL generates the anatomical shapes and the ST synthesizes the texture detail. We show that our method bears high promise for generating novel and anatomically accurate high resolution CT images at large and diverse quantities. Our approach is specifically designed to work with even small image datasets which is desirable given the often low amount of image data many researchers have available to them.
We propose a deep-learning-based classification of data pages used in holographic memory. We numerically investigated the classification performance of a conventional multi-layer perceptron (MLP) and a deep neural network, under the condition that reconstructed page data are contaminated by some noise and are randomly laterally shifted. The MLP was found to have a classification accuracy of 91.58%, whereas the deep neural network was able to classify data pages at an accuracy of 99.98%. The accuracy of the deep neural network is two orders of magnitude better than the MLP.
comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا