Hierarchical Paired Channel Fusion Network for Street Scene Change Detection


Published by Dr. Pingping Zhang

Publication date: 2020

Research field: Informatics Engineering

Paper language: English

Authors: Yinjie Lei, Duo Peng, Pingping Zhang


Street Scene Change Detection (SSCD) aims to locate the changed regions between a given street-view image pair captured at different times, which is an important yet challenging task in the computer vision community. The intuitive way to solve the SSCD task is to fuse the extracted image feature pairs and then directly measure the dissimilar parts to produce a change map. Therefore, the key to the SSCD task is to design an effective feature fusion method that can improve the accuracy of the corresponding change maps. To this end, we present a novel Hierarchical Paired Channel Fusion Network (HPCFNet), which utilizes the adaptive fusion of paired feature channels. Specifically, the features of a given image pair are jointly extracted by a Siamese Convolutional Neural Network (SCNN) and hierarchically combined by exploring the fusion of channel pairs at multiple feature levels. In addition, based on the observation that the distribution of scene changes is diverse, we further propose a Multi-Part Feature Learning (MPFL) strategy to detect diverse changes. With the MPFL strategy, our framework adapts to the scale and location diversity of the scene change regions. Extensive experiments on three public datasets (i.e., PCD, VL-CMU-CD and CDnet2014) demonstrate that the proposed framework outperforms other state-of-the-art methods by a considerable margin.
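The idea of fusing paired feature channels can be illustrated with a minimal sketch. This is not the HPCFNet implementation: the function names `interleave_channels` and `fuse_pairs`, the plain nested-list feature maps, and the fixed per-pair weights are all assumptions made for illustration. In a trained network the weights would be learned and the fusion would be repeated at several feature levels. The sketch places channel i of the first image's features next to channel i of the second image's features, then combines each pair with its own two weights.

```python
from typing import List, Tuple

Channel = List[List[float]]   # one H x W feature channel
Features = List[Channel]      # a stack of C channels


def interleave_channels(f0: Features, f1: Features) -> Features:
    """Return [f0[0], f1[0], f0[1], f1[1], ...] so paired channels are adjacent."""
    if len(f0) != len(f1):
        raise ValueError("both feature maps need the same number of channels")
    paired: Features = []
    for c0, c1 in zip(f0, f1):
        paired.extend([c0, c1])
    return paired


def fuse_pairs(paired: Features, weights: List[Tuple[float, float]]) -> Features:
    """Fuse each adjacent channel pair with its own (w0, w1) weights.

    Hypothetical stand-in for a learned per-pair fusion: output channel i is
    w0 * paired[2i] + w1 * paired[2i + 1], computed element-wise.
    """
    fused: Features = []
    for i, (w0, w1) in enumerate(weights):
        a, b = paired[2 * i], paired[2 * i + 1]
        fused.append(
            [[w0 * x + w1 * y for x, y in zip(row_a, row_b)]
             for row_a, row_b in zip(a, b)]
        )
    return fused


# Toy features for the two capture times: 2 channels of size 2 x 2 (made-up values).
t0 = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
t1 = [[[1.0, 2.0], [3.0, 0.0]], [[0.0, 1.0], [0.0, 0.0]]]

paired = interleave_channels(t0, t1)
# With weights (1, -1) every pair reduces to a plain channel difference.
diff = fuse_pairs(paired, [(1.0, -1.0), (1.0, -1.0)])
print(diff)
```

With the weights fixed to (1, -1) the fusion degenerates to a per-channel difference, so non-zero entries mark where the two toy feature maps disagree. Letting each pair have its own weights is what makes the fusion adaptive in this sketch.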


Kento Doi, Ryuhei Hamaguchi, Shun Iwase (2020)

This paper describes a viewpoint-robust object-based change detection network (OBJ-CDNet). Mobile cameras such as drive recorders capture images from different viewpoints each time due to differences in camera trajectory and shutter timing. However, previous methods for pixel-wise change detection are vulnerable to the viewpoint differences because they assume aligned image pairs as inputs. To cope with this difficulty, we introduce a deep graph matching network that establishes object correspondence between an image pair. This enables us to detect object-wise scene changes without precise image alignment. For more accurate object matching, we propose an epipolar-guided deep graph matching network (EGMNet), which incorporates the epipolar constraint into the deep graph matching layer used in OBJ-CDNet. To evaluate our networks' robustness against viewpoint differences, we created synthetic and real datasets for scene change detection from an image pair. The experimental results verified the effectiveness of our network.

Computer Vision and Pattern Recognition

Extreme Channel Prior Embedded Network for Dynamic Scene Deblurring

Jianrui Cai, Wangmeng Zuo, Lei Zhang (2019)

Recent years have witnessed significant progress in convolutional neural networks (CNNs) for dynamic scene deblurring. While CNN models are generally learned by the reconstruction loss defined on training data, incorporating suitable image priors as well as regularization terms into the network architecture could boost the deblurring performance. In this work, we propose an Extreme Channel Prior embedded Network (ECPeNet) to plug the extreme channel priors (i.e., priors on dark and bright channels) into a network architecture for effective dynamic scene deblurring. A novel trainable extreme channel prior embedded layer (ECPeL) is developed to aggregate both extreme channel and blurry image representations, and sparse regularization is introduced to regularize the ECPeNet model learning. Furthermore, we present an effective multi-scale network architecture that works in both coarse-to-fine and fine-to-coarse manners to better exploit information flow across scales. Experimental results on the GoPro and Kohler datasets show that our proposed ECPeNet performs favorably against state-of-the-art deep image deblurring methods in terms of both quantitative metrics and visual quality.
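For readers unfamiliar with the dark and bright channels mentioned above, the following is a minimal sketch of how they are commonly defined: the minimum (dark) or maximum (bright) intensity over the colour channels and over a small square patch around each pixel. It does not reproduce the trainable ECPeL layer or any part of ECPeNet; the function name `extreme_channel`, the `patch_radius` parameter, and the toy image are assumptions for illustration.

```python
from typing import Callable, List, Sequence

Pixel = Sequence[float]          # e.g. (r, g, b) intensities in [0, 1]
Image = List[List[Pixel]]        # H x W grid of pixels


def extreme_channel(image: Image, patch_radius: int,
                    reduce_fn: Callable) -> List[List[float]]:
    """Apply reduce_fn (min or max) over colour channels, then over a local patch.

    reduce_fn=min gives the dark channel, reduce_fn=max the bright channel.
    Patches are clipped at the image border.
    """
    h, w = len(image), len(image[0])
    # Step 1: reduce over the colour channels of every pixel.
    per_pixel = [[reduce_fn(px) for px in row] for row in image]
    # Step 2: reduce over the (2 * patch_radius + 1)^2 neighbourhood.
    out = []
    for y in range(h):
        ys = range(max(0, y - patch_radius), min(h, y + patch_radius + 1))
        row = []
        for x in range(w):
            xs = range(max(0, x - patch_radius), min(w, x + patch_radius + 1))
            row.append(reduce_fn(per_pixel[j][i] for j in ys for i in xs))
        out.append(row)
    return out


# Toy 2 x 2 RGB image (made-up values).
img = [
    [(0.9, 0.8, 0.7), (0.2, 0.5, 0.6)],
    [(0.4, 0.4, 0.4), (1.0, 0.3, 0.9)],
]

dark = extreme_channel(img, patch_radius=0, reduce_fn=min)
bright = extreme_channel(img, patch_radius=0, reduce_fn=max)
print(dark)    # [[0.7, 0.2], [0.4, 0.3]]
print(bright)  # [[0.9, 0.6], [0.4, 1.0]]
```

With `patch_radius=0` only the colour channels are reduced. A larger radius spreads each local extreme over its neighbourhood, so with `patch_radius=1` every dark-channel value of this toy image becomes 0.2.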

Computer Vision and Pattern Recognition

Cross-Task Transfer for Geotagged Audiovisual Aerial Scene Recognition

Di Hu, Xuhong Li, Lichao Mou (2020)

Aerial scene recognition is a fundamental task in remote sensing and has recently received increased interest. While the visual information from overhead images with powerful models and efficient algorithms yields considerable performance on scene recognition, it still suffers from the variation of ground objects, lighting conditions, etc. Inspired by the multi-channel perception theory in cognitive science, in this paper we explore a novel audiovisual aerial scene recognition task that uses both images and sounds as input to improve aerial scene recognition performance. Based on the observation that some specific sound events are more likely to be heard at a given geographic location, we propose to exploit the knowledge from the sound events to improve aerial scene recognition performance. For this purpose, we have constructed a new dataset named AuDio Visual Aerial sceNe reCognition datasEt (ADVANCE). With the help of this dataset, we evaluate three proposed approaches for transferring the sound event knowledge to the aerial scene recognition task in a multimodal learning framework, and show the benefit of exploiting the audio information for aerial scene recognition. The source code is publicly available for reproducibility purposes.

Computer Vision and Pattern Recognition, Machine Learning, Multimedia

Weakly Supervised Silhouette-based Semantic Scene Change Detection

Ken Sakurada, Mikiya Shibuya, Weimin Wang (2018)

This paper presents a novel semantic scene change detection scheme with only weak supervision. A straightforward approach for this task is to train a semantic change detection network directly from a large-scale dataset in an end-to-end manner. However, a specific dataset for this task, which is usually labor-intensive and time-consuming to create, becomes indispensable. To avoid this problem, we propose to train this kind of network from existing datasets by dividing the task into change detection and semantic extraction. On the other hand, the difference in camera viewpoints, for example between images of the same scene captured from a vehicle-mounted camera at different times, usually poses a challenge to the change detection task. To address this challenge, we propose a new Siamese network structure that introduces a correlation layer. In addition, we create a publicly available dataset for semantic change detection to evaluate the proposed method. The experimental results verified both the robustness to viewpoint differences in the change detection task and the effectiveness of the proposed networks for semantic change detection. Our code and dataset are available at https://github.com/xdspacelab/sscdnet.

Computer Vision and Pattern Recognition

Urban Change Detection by Fully Convolutional Siamese Concatenate Network with Attention

Farnoosh Heidary, Mehran Yazdi, Maryam Dehghani (2021)

Change detection (CD) is an important problem in remote sensing, especially for urban management during disasters. Most existing traditional methods for change detection are categorized as pixel-based or object-based. Object-based models are preferred to pixel-based methods for handling very high-resolution remote sensing (VHR RS) images. Such methods can benefit from the ongoing research on deep learning. In this paper, a fully automatic change-detection algorithm on VHR RS images is proposed that deploys Fully Convolutional Siamese Concatenate networks (FC-Siam-Conc). The proposed method uses preprocessing and an attention gate layer to improve accuracy. Gaussian attention (GA), a soft visual attention mechanism, is used for preprocessing. GA helps the network to handle feature maps like biological visual systems. Since the GA parameters cannot be adjusted during network training, an attention gate layer is introduced to play the role of GA with parameters that can be tuned along with the other network parameters. Experimental results obtained on the Onera Satellite Change Detection (OSCD) and RIVER-CD datasets confirm the superiority of the proposed architecture over state-of-the-art algorithms.

Computer Vision and Pattern Recognition, Machine Learning, Image and Video Processing