Object detection in aerial images is a challenging task due to the following reasons: (1) objects are small and dense relative to images; (2) the object scale varies in a wide range; (3) the number of object in different classes is imbalanced. Many current methods adopt cropping idea: splitting high resolution images into serials subregions (chips) and detecting on them. However, some problems such as scale variation, object sparsity, and class imbalance exist in the process of training network with chips. In this work, three augmentation methods are introduced to relieve these problems. Specifically, we propose a scale adaptive module, which dynamically adjusts chip size to balance object scale, narrowing scale variation in training. In addtion, we introduce mosaic to augment datasets, relieving object sparity problem. To balance catgory, we present mask resampling to paste object in chips with panoramic segmentation. Our model achieves state-of-the-art perfomance on two popular aerial image datasets of VisDrone and UAVDT. Remarkably, three methods can be independently applied to detectiors, increasing performance steady without the sacrifice of inference efficiency.