BiconNet: An Edge-preserved Connectivity-based Approach for Salient Object Detection

153 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Ziyun Yang

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Ziyun Yang - Somayyeh Soltanian-Zadeh - Sina Farsiu

الرؤية الحاسوبية وتمييز الأنماط الذكاء الاصطناعي معالجة الصور والفيديو

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Salient object detection (SOD) is viewed as a pixel-wise saliency modeling task by traditional deep learning-based methods. A limitation of current SOD models is insufficient utilization of inter-pixel information, which usually results in imperfect segmentation near edge regions and low spatial coherence. As we demonstrate, using a saliency mask as the only label is suboptimal. To address this limitation, we propose a connectivity-based approach called bilateral connectivity network (BiconNet), which uses connectivity masks together with saliency masks as labels for effective modeling of inter-pixel relationships and object saliency. Moreover, we propose a bilateral voting module to enhance the output connectivity map, and a novel edge feature enhancement method that efficiently utilizes edge-specific features. Through comprehensive experiments on five benchmark datasets, we demonstrate that our proposed method can be plugged into any existing state-of-the-art saliency-based SOD framework to improve its performance with negligible parameter increase.

قيم البحث

145 - Han Sun , Yetong Bian , Ningzhong Liu 2021

Deep-learning based salient object detection methods achieve great improvements. However, there are still problems existing in the predictions, such as blurry boundary and inaccurate location, which is mainly caused by inadequate feature extraction a nd integration. In this paper, we propose a Multi-scale Edge-based U-shape Network (MEUN) to integrate various features at different scales to achieve better performance. To extract more useful information for boundary prediction, U-shape Edge Network modules are embedded in each decoder units. Besides, the additional down-sampling module alleviates the location inaccuracy. Experimental results on four benchmark datasets demonstrate the validity and reliability of the proposed method. Multi-scale Edge based U-shape Network also shows its superiority when compared with 15 state-of-the-art salient object detection methods.

الرؤية الحاسوبية وتمييز الأنماط

Salient Object Ranking with Position-Preserved Attention

81 - Hao Fang , Daoxin Zhang , Yi Zhang 2021

Instance segmentation can detect where the objects are in an image, but hard to understand the relationship between them. We pay attention to a typical relationship, relative saliency. A closely related task, salient object detection, predicts a bina ry map highlighting a visually salient region while hard to distinguish multiple objects. Directly combining two tasks by post-processing also leads to poor performance. There is a lack of research on relative saliency at present, limiting the practical applications such as content-aware image cropping, video summary, and image labeling. In this paper, we study the Salient Object Ranking (SOR) task, which manages to assign a ranking order of each detected object according to its visual saliency. We propose the first end-to-end framework of the SOR task and solve it in a multi-task learning fashion. The framework handles instance segmentation and salient object ranking simultaneously. In this framework, the SOR branch is independent and flexible to cooperate with different detection methods, so that easy to use as a plugin. We also introduce a Position-Preserved Attention (PPA) module tailored for the SOR branch. It consists of the position embedding stage and feature interaction stage. Considering the importance of position in saliency comparison, we preserve absolute coordinates of objects in ROI pooling operation and then fuse positional information with semantic features in the first stage. In the feature interaction stage, we apply the attention mechanism to obtain proposals contextualized representations to predict their relative ranking orders. Extensive experiments have been conducted on the ASR dataset. Without bells and whistles, our proposed method outperforms the former state-of-the-art method significantly. The code will be released publicly available.

الرؤية الحاسوبية وتمييز الأنماط

Attention-based Assisted Excitation for Salient Object Detection

117 - Saeed Masoudnia , Melika Kheirieh , Abdol-Hossein Vahabie 2020

Visual attention brings significant progress for Convolution Neural Networks (CNNs) in various applications. In this paper, object-based attention in human visual cortex inspires us to introduce a mechanism for modification of activations in feature maps of CNNs. In this mechanism, the activations of object locations are excited in feature maps. This mechanism is specifically inspired by attention-based gain modulation in object-based attention in brain. It facilitates figure-ground segregation in the visual cortex. Similar to brain, we use the idea to address two challenges in salient object detection: gathering object interior parts while segregation from background with concise boundaries. We implement the object-based attention in the U-net model using different architectures in the encoder parts, including AlexNet, VGG, and ResNet. The proposed method was examined on three benchmark datasets: HKU-IS, MSRB, and PASCAL-S. Experimental results showed that our inspired method could significantly improve the results in terms of mean absolute error and F-measure. The results also showed that our proposed method better captured not only the boundary but also the object interior. Thus, it can tackle the mentioned challenges.

الرؤية الحاسوبية وتمييز الأنماط

Edge-guided Non-local Fully Convolutional Network for Salient Object Detection

87 - Zhengzheng Tu , Yan Ma , Chenglong Li 2019

Fully Convolutional Neural Network (FCN) has been widely applied to salient object detection recently by virtue of high-level semantic feature extraction, but existing FCN based methods still suffer from continuous striding and pooling operations lea ding to loss of spatial structure and blurred edges. To maintain the clear edge structure of salient objects, we propose a novel Edge-guided Non-local FCN (ENFNet) to perform edge guided feature learning for accurate salient object detection. In a specific, we extract hierarchical global and local information in FCN to incorporate non-local features for effective feature representations. To preserve good boundaries of salient objects, we propose a guidance block to embed edge prior knowledge into hierarchical feature maps. The guidance block not only performs feature-wise manipulation but also spatial-wise transformation for effective edge embeddings. Our model is trained on the MSRA-B dataset and tested on five popular benchmark datasets. Comparing with the state-of-the-art methods, the proposed method achieves the best performance on all datasets.

الرؤية الحاسوبية وتمييز الأنماط

Reverse Attention for Salient Object Detection

125 - Shuhan Chen , Xiuli Tan , Ben Wang 2018

Benefit from the quick development of deep learning techniques, salient object detection has achieved remarkable progresses recently. However, there still exists following two major challenges that hinder its application in embedded devices, low reso lution output and heavy model weight. To this end, this paper presents an accurate yet compact deep network for efficient salient object detection. More specifically, given a coarse saliency prediction in the deepest layer, we first employ residual learning to learn side-output residual features for saliency refinement, which can be achieved with very limited convolutional parameters while keep accuracy. Secondly, we further propose reverse attention to guide such side-output residual learning in a top-down manner. By erasing the current predicted salient regions from side-output features, the network can eventually explore the missing object parts and details which results in high resolution and accuracy. Experiments on six benchmark datasets demonstrate that the proposed approach compares favorably against state-of-the-art methods, and with advantages in terms of simplicity, efficiency (45 FPS) and model size (81 MB).

الرؤية الحاسوبية وتمييز الأنماط