ﻻ يوجد ملخص باللغة العربية
Scene parsing, or recognizing and segmenting objects and stuff in an image, is one of the key problems in computer vision. Despite the communitys efforts in data collection, there are still few image datasets covering a wide range of scenes and object categories with dense and detailed annotations for scene parsing. In this paper, we introduce and analyze the ADE20K dataset, spanning diverse annotations of scenes, objects, parts of objects, and in some cases even parts of parts. A generic network design called Cascade Segmentation Module is then proposed to enable the segmentation networks to parse a scene into stuff, objects, and object parts in a cascade. We evaluate the proposed module integrated within two existing semantic segmentation networks, yielding significant improvements for scene parsing. We further show that the scene parsing networks trained on ADE20K can be applied to a wide variety of scenes and objects.
This work addresses the problem of semantic scene understanding under foggy road conditions. Although marked progress has been made in semantic scene understanding over the recent years, it is mainly concentrated on clear weather outdoor scenes. Exte
Semantic segmentation is an important task in computer vision, from which some important usage scenarios are derived, such as autonomous driving, scene parsing, etc. Due to the emphasis on the task of video semantic segmentation, we participated in t
Memorability measures how easily an image is to be memorized after glancing, which may contribute to designing magazine covers, tourism publicity materials, and so forth. Recent works have shed light on the visual features that make generic images, o
Dashboard cameras capture a tremendous amount of driving scene video each day. These videos are purposefully coupled with vehicle sensing data, such as from the speedometer and inertial sensors, providing an additional sensing modality for free. In t
To better understand scene images in the field of remote sensing, multi-label annotation of scene images is necessary. Moreover, to enhance the performance of deep learning models for dealing with semantic scene understanding tasks, it is vital to tr