Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Probabilistic Semantic Segmentation Refinement by Monte Carlo Region Growing

120 0 0.0 ( 0 )

Download Cite

Added by Philipe A. Dias

Publication date 2020

fields Informatics Engineering Electronic Engineering

and research's language is English

Authors Philipe A. Dias - Henry Medeiros

Computer Vision and Pattern Recognition Image and Video Processing

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Semantic segmentation with fine-grained pixel-level accuracy is a fundamental component of a variety of computer vision applications. However, despite the large improvements provided by recent advances in the architectures of convolutional neural networks, segmentations provided by modern state-of-the-art methods still show limited boundary adherence. We introduce a fully unsupervised post-processing algorithm that exploits Monte Carlo sampling and pixel similarities to propagate high-confidence pixel labels into regions of low-confidence classification. Our algorithm, which we call probabilistic Region Growing Refinement (pRGR), is based on a rigorous mathematical foundation in which clusters are modelled as multivariate normally distributed sets of pixels. Exploiting concepts of Bayesian estimation and variance reduction techniques, pRGR performs multiple refinement iterations at varied receptive fields sizes, while updating cluster statistics to adapt to local image features. Experiments using multiple modern semantic segmentation networks and benchmark datasets demonstrate the effectiveness of our approach for the refinement of segmentation predictions at different levels of coarseness, as well as the suitability of the variance estimates obtained in the Monte Carlo iterations as uncertainty measures that are highly correlated with segmentation accuracy.

rate research

Semantic Segmentation Refinement by Monte Carlo Region Growing of High Confidence Detections

59 - Philipe A. Dias , Henry Medeiros 2018

Despite recent improvements using fully convolutional networks, in general, the segmentation produced by most state-of-the-art semantic segmentation methods does not show satisfactory adherence to the object boundaries. We propose a method to refine the segmentation results generated by such deep learning models. Our method takes as input the confidence scores generated by a pixel-dense segmentation network and re-labels pixels with low confidence levels. The re-labeling approach employs a region growing mechanism that aggregates these pixels to neighboring areas with high confidence scores and similar appearance. In order to correct the labels of pixels that were incorrectly classified with high confidence level by the semantic segmentation algorithm, we generate multiple region growing steps through a Monte Carlo sampling of the seeds of the regions. Our method improves the accuracy of a state-of-the-art fully convolutional semantic segmentation approach on the publicly available COCO and PASCAL datasets, and it shows significantly better results on selected sequences of the finely-annotated DAVIS dataset.

Computer Vision and Pattern Recognition

Cross-view Semantic Segmentation for Sensing Surroundings

94 - Bowen Pan , Jiankai Sun , Ho Yin Tiga Leung 2019

Sensing surroundings plays a crucial role in human spatial perception, as it extracts the spatial configuration of objects as well as the free space from the observations. To facilitate the robot perception with such a surrounding sensing capability, we introduce a novel visual task called Cross-view Semantic Segmentation as well as a framework named View Parsing Network (VPN) to address it. In the cross-view semantic segmentation task, the agent is trained to parse the first-view observations into a top-down-view semantic map indicating the spatial location of all the objects at pixel-level. The main issue of this task is that we lack the real-world annotations of top-down-view data. To mitigate this, we train the VPN in 3D graphics environment and utilize the domain adaptation technique to transfer it to handle real-world data. We evaluate our VPN on both synthetic and real-world agents. The experimental results show that our model can effectively make use of the information from different views and multi-modalities to understanding spatial information. Our further experiment on a LoCoBot robot shows that our model enables the surrounding sensing capability from 2D image input. Code and demo videos can be found at url{https://view-parsing-network.github.io}.

Computer Vision and Pattern Recognition Image and Video Processing

MetaSketch: Wireless Semantic Segmentation by Metamaterial Surfaces

82 - Jingzhi Hu , Hongliang Zhang , Kaigui Bian 2021

Semantic segmentation is a process of partitioning an image into multiple segments for recognizing humans and objects, which can be widely applied in scenarios such as healthcare and safety monitoring. To avoid privacy violation, using RF signals instead of an image for human and object recognition has gained increasing attention. However, human and object recognition by using RF signals is usually a passive signal collection and analysis process without changing the radio environment, and the recognition accuracy is restricted significantly by unwanted multi-path fading, and/or the limited number of independent channels between RF transceivers in uncontrollable radio environments. This paper introduces MetaSketch, a novel RF-sensing system that performs semantic recognition and segmentation for humans and objects by making the radio environment reconfigurable. A metamaterial surface is incorporated into MetaSketch and diversifies the information carried by RF signals. Using compressive sensing techniques, MetaSketch reconstructs a point cloud consisting of the reflection coefficients of humans and objects at different spatial points, and recognizes the semantic meaning of the points by using symmetric multilayer perceptron groups. Our evaluation results show that MetaSketch is capable of generating favorable radio environments and extracting exact point clouds, and labeling the semantic meaning of the points with an average error rate of less than 1% in an indoor space.

Signal Processing Image and Video Processing

Virtual Multi-view Fusion for 3D Semantic Segmentation

99 - Abhijit Kundu , Xiaoqi Yin , Alireza Fathi 2020

Semantic segmentation of 3D meshes is an important problem for 3D scene understanding. In this paper we revisit the classic multiview representation of 3D meshes and study several techniques that make them effective for 3D semantic segmentation of meshes. Given a 3D mesh reconstructed from RGBD sensors, our method effectively chooses different virtual views of the 3D mesh and renders multiple 2D channels for training an effective 2D semantic segmentation model. Features from multiple per view predictions are finally fused on 3D mesh vertices to predict mesh semantic segmentation labels. Using the large scale indoor 3D semantic segmentation benchmark of ScanNet, we show that our virtual views enable more effective training of 2D semantic segmentation networks than previous multiview approaches. When the 2D per pixel predictions are aggregated on 3D surfaces, our virtual multiview fusion method is able to achieve significantly better 3D semantic segmentation results compared to all prior multiview approaches and competitive with recent 3D convolution approaches.

Computer Vision and Pattern Recognition Image and Video Processing

Task Decomposition and Synchronization for Semantic Biomedical Image Segmentation

108 - Xuhua Ren , Lichi Zhang , Sahar Ahmad 2019

Semantic segmentation is essentially important to biomedical image analysis. Many recent works mainly focus on integrating the Fully Convolutional Network (FCN) architecture with sophisticated convolution implementation and deep supervision. In this paper, we propose to decompose the single segmentation task into three subsequent sub-tasks, including (1) pixel-wise image segmentation, (2) prediction of the class labels of the objects within the image, and (3) classification of the scene the image belonging to. While these three sub-tasks are trained to optimize their individual loss functions of different perceptual levels, we propose to let them interact by the task-task context ensemble. Moreover, we propose a novel sync-regularization to penalize the deviation between the outputs of the pixel-wise segmentation and the class prediction tasks. These effective regularizations help FCN utilize context information comprehensively and attain accurate semantic segmentation, even though the number of the images for training may be limited in many biomedical applications. We have successfully applied our framework to three diverse 2D/3D medical image datasets, including Robotic Scene Segmentation Challenge 18 (ROBOT18), Brain Tumor Segmentation Challenge 18 (BRATS18), and Retinal Fundus Glaucoma Challenge (REFUGE18). We have achieved top-tier performance in all three challenges.

Computer Vision and Pattern Recognition Image and Video Processing

comments

Fetching comments

Higher Institute for Demographic Studies and Researches

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Probabilistic Semantic Segmentation Refinement by Monte Carlo Region Growing

Ask ChatGPT about the research

No Arabic abstract

Read More