مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

Probabilistic Semantic Segmentation Refinement by Monte Carlo Region Growing

120 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Philipe A. Dias

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية هندسة إلكترونية

والبحث باللغة English

تأليف Philipe A. Dias - Henry Medeiros

الرؤية الحاسوبية وتمييز الأنماط معالجة الصور والفيديو

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Semantic segmentation with fine-grained pixel-level accuracy is a fundamental component of a variety of computer vision applications. However, despite the large improvements provided by recent advances in the architectures of convolutional neural networks, segmentations provided by modern state-of-the-art methods still show limited boundary adherence. We introduce a fully unsupervised post-processing algorithm that exploits Monte Carlo sampling and pixel similarities to propagate high-confidence pixel labels into regions of low-confidence classification. Our algorithm, which we call probabilistic Region Growing Refinement (pRGR), is based on a rigorous mathematical foundation in which clusters are modelled as multivariate normally distributed sets of pixels. Exploiting concepts of Bayesian estimation and variance reduction techniques, pRGR performs multiple refinement iterations at varied receptive fields sizes, while updating cluster statistics to adapt to local image features. Experiments using multiple modern semantic segmentation networks and benchmark datasets demonstrate the effectiveness of our approach for the refinement of segmentation predictions at different levels of coarseness, as well as the suitability of the variance estimates obtained in the Monte Carlo iterations as uncertainty measures that are highly correlated with segmentation accuracy.

قيم البحث

59 - Philipe A. Dias , Henry Medeiros 2018

Despite recent improvements using fully convolutional networks, in general, the segmentation produced by most state-of-the-art semantic segmentation methods does not show satisfactory adherence to the object boundaries. We propose a method to refine the segmentation results generated by such deep learning models. Our method takes as input the confidence scores generated by a pixel-dense segmentation network and re-labels pixels with low confidence levels. The re-labeling approach employs a region growing mechanism that aggregates these pixels to neighboring areas with high confidence scores and similar appearance. In order to correct the labels of pixels that were incorrectly classified with high confidence level by the semantic segmentation algorithm, we generate multiple region growing steps through a Monte Carlo sampling of the seeds of the regions. Our method improves the accuracy of a state-of-the-art fully convolutional semantic segmentation approach on the publicly available COCO and PASCAL datasets, and it shows significantly better results on selected sequences of the finely-annotated DAVIS dataset.

الرؤية الحاسوبية وتمييز الأنماط

Cross-view Semantic Segmentation for Sensing Surroundings

94 - Bowen Pan , Jiankai Sun , Ho Yin Tiga Leung 2019

Sensing surroundings plays a crucial role in human spatial perception, as it extracts the spatial configuration of objects as well as the free space from the observations. To facilitate the robot perception with such a surrounding sensing capability, we introduce a novel visual task called Cross-view Semantic Segmentation as well as a framework named View Parsing Network (VPN) to address it. In the cross-view semantic segmentation task, the agent is trained to parse the first-view observations into a top-down-view semantic map indicating the spatial location of all the objects at pixel-level. The main issue of this task is that we lack the real-world annotations of top-down-view data. To mitigate this, we train the VPN in 3D graphics environment and utilize the domain adaptation technique to transfer it to handle real-world data. We evaluate our VPN on both synthetic and real-world agents. The experimental results show that our model can effectively make use of the information from different views and multi-modalities to understanding spatial information. Our further experiment on a LoCoBot robot shows that our model enables the surrounding sensing capability from 2D image input. Code and demo videos can be found at url{https://view-parsing-network.github.io}.

الرؤية الحاسوبية وتمييز الأنماط معالجة الصور والفيديو

MetaSketch: Wireless Semantic Segmentation by Metamaterial Surfaces

82 - Jingzhi Hu , Hongliang Zhang , Kaigui Bian 2021

Semantic segmentation is a process of partitioning an image into multiple segments for recognizing humans and objects, which can be widely applied in scenarios such as healthcare and safety monitoring. To avoid privacy violation, using RF signals ins tead of an image for human and object recognition has gained increasing attention. However, human and object recognition by using RF signals is usually a passive signal collection and analysis process without changing the radio environment, and the recognition accuracy is restricted significantly by unwanted multi-path fading, and/or the limited number of independent channels between RF transceivers in uncontrollable radio environments. This paper introduces MetaSketch, a novel RF-sensing system that performs semantic recognition and segmentation for humans and objects by making the radio environment reconfigurable. A metamaterial surface is incorporated into MetaSketch and diversifies the information carried by RF signals. Using compressive sensing techniques, MetaSketch reconstructs a point cloud consisting of the reflection coefficients of humans and objects at different spatial points, and recognizes the semantic meaning of the points by using symmetric multilayer perceptron groups. Our evaluation results show that MetaSketch is capable of generating favorable radio environments and extracting exact point clouds, and labeling the semantic meaning of the points with an average error rate of less than 1% in an indoor space.

معالجة الإشارات معالجة الصور والفيديو

Virtual Multi-view Fusion for 3D Semantic Segmentation

99 - Abhijit Kundu , Xiaoqi Yin , Alireza Fathi 2020

Semantic segmentation of 3D meshes is an important problem for 3D scene understanding. In this paper we revisit the classic multiview representation of 3D meshes and study several techniques that make them effective for 3D semantic segmentation of me shes. Given a 3D mesh reconstructed from RGBD sensors, our method effectively chooses different virtual views of the 3D mesh and renders multiple 2D channels for training an effective 2D semantic segmentation model. Features from multiple per view predictions are finally fused on 3D mesh vertices to predict mesh semantic segmentation labels. Using the large scale indoor 3D semantic segmentation benchmark of ScanNet, we show that our virtual views enable more effective training of 2D semantic segmentation networks than previous multiview approaches. When the 2D per pixel predictions are aggregated on 3D surfaces, our virtual multiview fusion method is able to achieve significantly better 3D semantic segmentation results compared to all prior multiview approaches and competitive with recent 3D convolution approaches.

الرؤية الحاسوبية وتمييز الأنماط معالجة الصور والفيديو

Task Decomposition and Synchronization for Semantic Biomedical Image Segmentation

108 - Xuhua Ren , Lichi Zhang , Sahar Ahmad 2019

Semantic segmentation is essentially important to biomedical image analysis. Many recent works mainly focus on integrating the Fully Convolutional Network (FCN) architecture with sophisticated convolution implementation and deep supervision. In this paper, we propose to decompose the single segmentation task into three subsequent sub-tasks, including (1) pixel-wise image segmentation, (2) prediction of the class labels of the objects within the image, and (3) classification of the scene the image belonging to. While these three sub-tasks are trained to optimize their individual loss functions of different perceptual levels, we propose to let them interact by the task-task context ensemble. Moreover, we propose a novel sync-regularization to penalize the deviation between the outputs of the pixel-wise segmentation and the class prediction tasks. These effective regularizations help FCN utilize context information comprehensively and attain accurate semantic segmentation, even though the number of the images for training may be limited in many biomedical applications. We have successfully applied our framework to three diverse 2D/3D medical image datasets, including Robotic Scene Segmentation Challenge 18 (ROBOT18), Brain Tumor Segmentation Challenge 18 (BRATS18), and Retinal Fundus Glaucoma Challenge (REFUGE18). We have achieved top-tier performance in all three challenges.

الرؤية الحاسوبية وتمييز الأنماط معالجة الصور والفيديو

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

معھد الشام العالي للعلوم الشرعية واللغة العربية والدراسات والبحوث الإسلامية

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Probabilistic Semantic Segmentation Refinement by Monte Carlo Region Growing

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً