Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation

Publication date: 2020

Field: Informatics Engineering

Language: English

Authors: Golnaz Ghiasi, Yin Cui, Aravind Srinivas

Computer Vision and Pattern Recognition

Building instance segmentation models that are data-efficient and can handle rare object categories is an important challenge in computer vision. Leveraging data augmentations is a promising direction towards addressing this challenge. Here, we perform a systematic study of the Copy-Paste augmentation ([13, 12]) for instance segmentation where we randomly paste objects onto an image. Prior studies on Copy-Paste relied on modeling the surrounding visual context for pasting the objects. However, we find that the simple mechanism of pasting objects randomly is good enough and can provide solid gains on top of strong baselines. Furthermore, we show Copy-Paste is additive with semi-supervised methods that leverage extra data through pseudo labeling (e.g. self-training). On COCO instance segmentation, we achieve 49.1 mask AP and 57.3 box AP, an improvement of +0.6 mask AP and +1.5 box AP over the previous state-of-the-art. We further demonstrate that Copy-Paste can lead to significant improvements on the LVIS benchmark. Our baseline model outperforms the LVIS 2020 Challenge winning entry by +3.6 mask AP on rare categories.
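The random pasting described in the abstract can be illustrated with a minimal sketch. The abstract states only that objects are pasted at random; the function name `copy_paste`, the per-object `paste_prob`, the same-size images, and the handling of occluded masks are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def copy_paste(src_img, src_masks, dst_img, dst_masks, rng, paste_prob=0.5):
    """Paste a random subset of source instances onto the destination image.

    src_img, dst_img: (H, W, 3) arrays of identical shape (assumed).
    src_masks, dst_masks: (N, H, W) boolean arrays, one mask per instance.
    Returns the composited image and the updated instance masks.
    """
    # Choose each source instance independently (hypothetical selection rule).
    keep = rng.random(len(src_masks)) < paste_prob
    pasted = src_masks[keep]
    if len(pasted) == 0:
        return dst_img.copy(), dst_masks.copy()

    # Union of all pasted masks: pixels that come from the source image.
    alpha = pasted.any(axis=0)
    out_img = np.where(alpha[..., None], src_img, dst_img)

    # Destination instances lose the pixels that are now covered.
    dst_visible = dst_masks & ~alpha
    # Drop destination instances that are fully occluded.
    dst_visible = dst_visible[dst_visible.any(axis=(1, 2))]

    out_masks = np.concatenate([dst_visible, pasted], axis=0)
    return out_img, out_masks
```

No context model decides where an object goes: each selected object keeps its source position and simply overwrites whatever was underneath it, which is the "paste randomly" idea in its plainest form.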

Copy and Paste: A Simple But Effective Initialization Method for Black-Box Adversarial Attacks

Thomas Brunner, Frederik Diehl, Alois Knoll (2019)

Many optimization methods for generating black-box adversarial examples have been proposed, but the aspect of initializing said optimizers has not been considered in much detail. We show that the choice of starting points is indeed crucial, and that the performance of state-of-the-art attacks depends on it. First, we discuss desirable properties of starting points for attacking image classifiers, and how they can be chosen to increase query efficiency. Notably, we find that simply copying small patches from other images is a valid strategy. We then present an evaluation on ImageNet that clearly demonstrates the effectiveness of this method: Our initialization scheme reduces the number of queries required for a state-of-the-art Boundary Attack by 81%, significantly outperforming previous results reported for targeted black-box adversarial examples.

Computer Vision and Pattern Recognition Cryptography and Security Machine Learning

A Simple Baseline for Semi-supervised Semantic Segmentation with Strong Data Augmentation

Jianlong Yuan, Yifan Liu, Chunhua Shen (2021)

Recently, significant progress has been made on semantic segmentation. However, the success of supervised semantic segmentation typically relies on a large amount of labelled data, which is time-consuming and costly to obtain. Inspired by the success of semi-supervised learning methods in image classification, here we propose a simple yet effective semi-supervised learning framework for semantic segmentation. We demonstrate that the devil is in the details: a set of simple design and training techniques can collectively improve the performance of semi-supervised semantic segmentation significantly. Previous works [3, 27] fail to employ strong augmentation in pseudo label learning efficiently, as the large distribution change caused by strong augmentation harms the batch normalisation statistics. We design a new batch normalisation, namely distribution-specific batch normalisation (DSBN) to address this problem and demonstrate the importance of strong augmentation for semantic segmentation. Moreover, we design a self correction loss which is effective in noise resistance. We conduct a series of ablation studies to show the effectiveness of each component. Our method achieves state-of-the-art results in the semi-supervised settings on the Cityscapes and Pascal VOC datasets.
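One plausible reading of distribution-specific batch normalisation is that each input distribution (for example weakly and strongly augmented batches) keeps its own normalisation statistics. The sketch below illustrates that reading only; the class name, the `domain` index, the momentum update, and the omission of learnable scale and shift are assumptions, not details taken from the abstract.

```python
import numpy as np

class DistributionSpecificBN:
    """Batch normalisation with one set of running statistics per domain (sketch)."""

    def __init__(self, num_features, num_domains=2, momentum=0.1, eps=1e-5):
        self.mean = np.zeros((num_domains, num_features))
        self.var = np.ones((num_domains, num_features))
        self.momentum = momentum
        self.eps = eps

    def __call__(self, x, domain, training=True):
        # x: (batch, num_features); domain selects which statistics to use.
        if training:
            mu, var = x.mean(axis=0), x.var(axis=0)
            m = self.momentum
            # Only the selected domain's running statistics are updated.
            self.mean[domain] = (1 - m) * self.mean[domain] + m * mu
            self.var[domain] = (1 - m) * self.var[domain] + m * var
        else:
            mu, var = self.mean[domain], self.var[domain]
        return (x - mu) / np.sqrt(var + self.eps)
```

With this layout, a strongly augmented batch passed with `domain=1` leaves the statistics stored for `domain=0` untouched, so the shifted distribution cannot disturb the statistics used for weakly augmented inputs.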

Computer Vision and Pattern Recognition

CarveMix: A Simple Data Augmentation Method for Brain Lesion Segmentation

Xinru Zhang, Chenghao Liu, Ni Ou (2021)

Brain lesion segmentation provides a valuable tool for clinical diagnosis, and convolutional neural networks (CNNs) have achieved unprecedented success in the task. Data augmentation is a widely used strategy that improves the training of CNNs, and the design of the augmentation method for brain lesion segmentation is still an open problem. In this work, we propose a simple data augmentation approach, dubbed as CarveMix, for CNN-based brain lesion segmentation. Like other mix-based methods, such as Mixup and CutMix, CarveMix stochastically combines two existing labeled images to generate new labeled samples. Yet, unlike these augmentation strategies based on image combination, CarveMix is lesion-aware, where the combination is performed with an attention on the lesions and a proper annotation is created for the generated image. Specifically, from one labeled image we carve a region of interest (ROI) according to the lesion location and geometry, and the size of the ROI is sampled from a probability distribution. The carved ROI then replaces the corresponding voxels in a second labeled image, and the annotation of the second image is replaced accordingly as well. In this way, we generate new labeled images for network training and the lesion information is preserved. To evaluate the proposed method, experiments were performed on two brain lesion datasets. The results show that our method improves the segmentation accuracy compared with other simple data augmentation approaches.
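The carve-and-replace step can be sketched as follows. This is a deliberately simplified stand-in: the ROI here is the lesion's bounding box grown by a random margin, whereas the abstract says only that the ROI follows the lesion's location and geometry and that its size is sampled from a probability distribution. The function name `carve_and_mix`, the `max_margin` parameter, and the uniform margin are hypothetical.

```python
import numpy as np

def carve_and_mix(img_a, lab_a, img_b, lab_b, rng, max_margin=3):
    """Carve a box ROI around the lesion of (img_a, lab_a) and paste it into (img_b, lab_b).

    All arrays share one shape (assumed); labels are non-zero on lesion voxels.
    """
    coords = np.argwhere(lab_a > 0)
    if coords.size == 0:
        # No lesion to carve: return the second sample unchanged.
        return img_b.copy(), lab_b.copy()

    # ROI size is random: bounding box plus a uniformly sampled margin (assumption).
    margin = int(rng.integers(0, max_margin + 1))
    lo = np.maximum(coords.min(axis=0) - margin, 0)
    hi = np.minimum(coords.max(axis=0) + 1 + margin, lab_a.shape)
    roi = tuple(slice(int(l), int(h)) for l, h in zip(lo, hi))

    # Replace both intensities and annotation inside the ROI.
    out_img, out_lab = img_b.copy(), lab_b.copy()
    out_img[roi] = img_a[roi]
    out_lab[roi] = lab_a[roi]
    return out_img, out_lab
```

Because the annotation is replaced together with the voxels, the generated sample stays consistently labeled, which is the property the abstract emphasizes.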

Image and Video Processing Computer Vision and Pattern Recognition

Copy and Paste method based on Pose for Re-identification

Cheng Yang (2021)

The aim of re-identification (ReID) is to match objects across surveillance cameras with different viewpoints. Although ReID is developing rapidly, there is currently no processing method for the ReID task across multiple scenarios, even though such a method is required in real-life settings such as security. In the present study, a new ReID scenario is explored that differs in perspective, background, and pose (walking or cycling). Ordinary ReID processing methods cannot effectively handle such a scenario; collecting new image datasets would be the optimal solution, but it is considerably expensive. To solve this problem, a simple and effective method for generating images in several new scenarios is proposed, named the Copy and Paste method based on Pose (CPP). The CPP method is based on key point detection and uses copy and paste to composite a new semantic image dataset from two different semantic image datasets. As an example, pedestrians and bicycles can be used to generate several images that show the same person riding different bicycles. The CPP method is suitable for ReID tasks in new scenarios and outperforms traditional methods when applied to the original datasets in the original ReID tasks. The CPP method also generalizes better to third-party public datasets. The code and the datasets composited by the CPP method will be available in the future.

Computer Vision and Pattern Recognition Artificial Intelligence

SOLO: A Simple Framework for Instance Segmentation

Xinlong Wang, Rufeng Zhang, Chunhua Shen (2021)

Compared to many other dense prediction tasks, e.g., semantic segmentation, it is the arbitrary number of instances that has made instance segmentation much more challenging. In order to predict a mask for each instance, mainstream approaches either follow the detect-then-segment strategy (e.g., Mask R-CNN), or predict embedding vectors first and then cluster pixels into individual instances. In this paper, we view the task of instance segmentation from a completely new perspective by introducing the notion of instance categories, which assigns categories to each pixel within an instance according to the instance's location. With this notion, we propose segmenting objects by locations (SOLO), a simple, direct, and fast framework for instance segmentation with strong performance. We derive a few SOLO variants (e.g., Vanilla SOLO, Decoupled SOLO, Dynamic SOLO) following the basic principle. Our method directly maps a raw input image to the desired object categories and instance masks, eliminating the need for the grouping post-processing or the bounding box detection. Our approach achieves state-of-the-art results for instance segmentation in terms of both speed and accuracy, while being considerably simpler than the existing methods. Besides instance segmentation, our method yields state-of-the-art results in object detection (from our mask byproduct) and panoptic segmentation. We further demonstrate the flexibility and high-quality segmentation of SOLO by extending it to perform one-stage instance-level image matting. Code is available at: https://git.io/AdelaiDet
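The idea of an instance category determined by location can be illustrated with a small sketch. It assumes the image is divided into a `grid_size` x `grid_size` grid and that an instance is assigned to the cell containing its mask's center of mass; the function name and the row-major category index are illustrative choices, not taken from the abstract.

```python
import numpy as np

def instance_category(mask, grid_size):
    """Map a binary instance mask (H, W) to a location-based category index.

    The category is the row-major index of the grid cell that contains
    the mask's center of mass (an assumption of this sketch).
    """
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask contains no foreground pixels")
    h, w = mask.shape
    # Scale the center of mass to grid coordinates and clamp to the last cell.
    row = min(int(ys.mean() / h * grid_size), grid_size - 1)
    col = min(int(xs.mean() / w * grid_size), grid_size - 1)
    return row * grid_size + col
```

Two instances of the same semantic class that sit in different grid cells receive different instance categories, which is what lets a per-pixel classifier separate them without box detection or pixel grouping.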

Computer Vision and Pattern Recognition