
A Histogram Thresholding Improvement to Mask R-CNN for Scalable Segmentation of New and Old Rural Buildings

Published by Weipan Xu
Publication date: 2021
Language: English





Mapping new and old buildings is of great significance for understanding socio-economic development in rural areas. In recent years, deep neural networks have achieved remarkable building segmentation results in high-resolution remote sensing images. However, scarce training data and varying geographical environments pose challenges for scalable building segmentation. This study proposes a novel framework based on Mask R-CNN, named HTMask R-CNN, to extract new and old rural buildings even when labels are scarce. The framework adopts the result of single-object instance segmentation from the orthodox Mask R-CNN. It then classifies the rural buildings into new and old ones based on a dynamic grayscale threshold inferred from the result of a two-object instance segmentation task for which training data is scarce. We found that the framework can extract more buildings and achieve a much higher mean Average Precision (mAP) than the orthodox Mask R-CNN model. We tested the novel framework's performance with increasing training data and found that it converged even when the training samples were limited. This framework's main contribution is to allow scalable segmentation using significantly fewer training samples than traditional machine learning practices require. That makes mapping China's new and old rural buildings viable.
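The abstract does not say how the dynamic grayscale threshold is computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the threshold is the midpoint between the average brightness of the buildings that a two-class model labels new and those it labels old; the function names (`mean_gray`, `infer_threshold`, `classify`) and the synthetic image are hypothetical and chosen for illustration.

```python
import numpy as np


def mean_gray(image, mask):
    """Mean grayscale value of the pixels covered by a boolean mask."""
    return float(image[mask].mean())


def infer_threshold(image, new_masks, old_masks):
    """Assumed rule (not from the paper): midpoint of the two group means.

    new_masks / old_masks stand in for the few instances that a
    two-class segmentation model labels as new or old.
    """
    new_level = float(np.mean([mean_gray(image, m) for m in new_masks]))
    old_level = float(np.mean([mean_gray(image, m) for m in old_masks]))
    threshold = (new_level + old_level) / 2.0
    # Record which class is brighter instead of hard-coding it.
    return threshold, new_level > old_level


def classify(image, masks, threshold, new_is_brighter):
    """Label every single-class building mask as 'new' or 'old'."""
    labels = []
    for mask in masks:
        brighter = mean_gray(image, mask) > threshold
        labels.append("new" if brighter == new_is_brighter else "old")
    return labels


def box_mask(shape, rows, cols):
    """Boolean mask that is True inside a rectangular region."""
    mask = np.zeros(shape, dtype=bool)
    mask[rows[0]:rows[1], cols[0]:cols[1]] = True
    return mask


# Synthetic 8x8 grayscale image with four "buildings" (illustrative values).
image = np.zeros((8, 8))
regions = [((0, 2), (0, 2), 200), ((0, 2), (4, 6), 50),
           ((4, 6), (0, 2), 180), ((4, 6), (4, 6), 60)]
masks = []
for rows, cols, gray in regions:
    image[rows[0]:rows[1], cols[0]:cols[1]] = gray
    masks.append(box_mask(image.shape, rows, cols))

# Pretend the two-class model labelled only the first two buildings.
threshold, new_is_brighter = infer_threshold(image, [masks[0]], [masks[1]])
labels = classify(image, masks, threshold, new_is_brighter)
```

Because the threshold is re-estimated for each image from whatever two-class predictions are available, it can adapt to local lighting and roofing materials, which is the property the abstract attributes to a dynamic threshold.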




Read also

Letting a deep network be aware of the quality of its own predictions is an interesting yet important problem. In most instance segmentation frameworks, the confidence of instance classification is used as the mask quality score. However, the mask quality, quantified as the IoU between the instance mask and its ground truth, is usually not well correlated with the classification score. In this paper, we study this problem and propose Mask Scoring R-CNN, which contains a network block that learns the quality of the predicted instance masks. The proposed network block takes the instance feature and the corresponding predicted mask together to regress the mask IoU. The mask scoring strategy calibrates the misalignment between mask quality and mask score, and improves instance segmentation performance by prioritizing more accurate mask predictions during COCO AP evaluation. In extensive evaluations on the COCO dataset, Mask Scoring R-CNN brings consistent and noticeable gains with different models and outperforms the state-of-the-art Mask R-CNN. We hope our simple and effective approach will provide a new direction for improving instance segmentation. The source code of our method is available at https://github.com/zjhuang22/maskscoring_rcnn.
Tremendous efforts have been made to improve mask localization accuracy in instance segmentation. Modern instance segmentation methods relying on fully convolutional networks perform pixel-wise classification, which ignores object boundaries and shapes, leading to coarse and indistinct mask predictions and imprecise localization. To remedy these problems, we propose a conceptually simple yet effective Boundary-preserving Mask R-CNN (BMask R-CNN) that leverages object boundary information to improve mask localization accuracy. BMask R-CNN contains a boundary-preserving mask head in which object boundary and mask are mutually learned via feature fusion blocks. As a result, the predicted masks are better aligned with object boundaries. Without bells and whistles, BMask R-CNN outperforms Mask R-CNN by a considerable margin on the COCO dataset; on the Cityscapes dataset, where more accurate boundary ground truths are available, BMask R-CNN obtains remarkable improvements over Mask R-CNN. Besides, it is not surprising to observe that BMask R-CNN obtains a more obvious improvement when the evaluation criterion requires better localization (e.g., AP$_{75}$), as shown in Fig. 1. Code and models are available at https://github.com/hustvl/BMaskR-CNN.
In automated driving systems (ADS) and advanced driver-assistance systems (ADAS), efficient road segmentation is necessary to perceive the drivable region and build an occupancy map for path planning. Existing algorithms implement gigantic convolutional neural networks (CNNs) that are computationally expensive and time consuming. In this paper, we introduce distributed LSTM, a neural network widely used in audio and video processing, to process rows and columns in images and feature maps. We then propose a new network combining convolutional and distributed LSTM layers to solve the road segmentation problem. Finally, the network is trained and tested on the KITTI road benchmark. The results show that the combined structure enhances feature extraction and processing while taking less processing time than a pure CNN structure.
Hao Li, Aozhou Wu, Wen Fang (2020)
Resonant Beam Charging (RBC) is a wireless charging technology that supports multi-watt power transfer over meter-level distances. Its safety, mobility, and simultaneous charging capability enable RBC to charge multiple mobile devices safely at the same time. To detect the devices that need to be charged, a Mask R-CNN based detection model was proposed in previous work. However, considering the constraints of the RBC system, it is not easy to apply Mask R-CNN on lightweight hardware-embedded devices because of its heavy model and high computational cost. Thus, we propose a machine learning detection approach that provides a lighter and faster model based on the traditional Mask R-CNN. The proposed approach makes object detection much easier to port to mobile devices and reduces the burden of hardware computation. By adjusting the structure of the backbone and the head of Mask R-CNN, we reduce the average detection time from $1.02\,\mbox{s}$ per image to $0.6132\,\mbox{s}$, and reduce the model size from $245\,\mbox{MB}$ to $47.1\,\mbox{MB}$. The improved model is much more suitable for application in the RBC system.
Obtaining precise instance segmentation masks is of high importance in many modern applications such as robotic manipulation and autonomous driving. Currently, many state-of-the-art models are based on the Mask R-CNN framework, which, while very powerful, outputs masks at low resolutions that can result in imprecise boundaries. On the other hand, classic variational methods for segmentation impose desirable global and local data and geometry constraints on the masks by optimizing an energy functional. While mathematically elegant, their direct dependence on good initialization, non-robust image cues, and manual setting of hyperparameters renders them unsuitable for modern applications. We propose LevelSet R-CNN, which combines the best of both worlds by obtaining powerful feature representations that are combined in an end-to-end manner with a variational segmentation framework. We demonstrate the effectiveness of our approach on the COCO and Cityscapes datasets.