
Dynamic Spectral Residual Superpixels

Publication date: 2019
Language: English





We consider the problem of segmenting an image into superpixels in the context of $k$-means clustering, in which we wish to decompose an image into local, homogeneous regions corresponding to the underlying objects. Our novel approach builds upon the widely used Simple Linear Iterative Clustering (SLIC) and incorporates a measure of object structure based on the spectral residual of an image. Based on this combination, we propose a modified initialisation scheme and search metric, which help preserve fine details. This combination leads to better adherence to object boundaries and prevents unnecessary segmentation of large, uniform areas, while remaining computationally tractable in comparison to other methods. We demonstrate through numerical and visual experiments that our approach outperforms state-of-the-art techniques.
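As a rough illustration of the spectral residual mentioned in the abstract, the sketch below computes a saliency map in the commonly described way: subtract a locally averaged log-amplitude spectrum from the log-amplitude spectrum, then transform back using the original phase. It is not the authors' implementation. The function names, the filter sizes, and the use of a box filter for the final smoothing (in place of the Gaussian usually used) are assumptions made to keep the example NumPy-only.

```python
import numpy as np


def box_filter(a, size):
    """Mean filter with edge padding; `size` is assumed to be odd."""
    pad = size // 2
    padded = np.pad(a, pad, mode="edge")
    out = np.zeros_like(a, dtype=float)
    # Sum the shifted copies of the padded array, then divide by the window area.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out / (size * size)


def spectral_residual_saliency(gray, avg_size=3, smooth_size=5):
    """Hypothetical helper: spectral-residual saliency map scaled to [0, 1].

    `gray` is a 2D array of intensities. `avg_size` and `smooth_size`
    are illustrative defaults, not values taken from the paper.
    """
    spectrum = np.fft.fft2(gray.astype(float))
    log_amplitude = np.log(np.abs(spectrum) + 1e-8)  # epsilon avoids log(0)
    phase = np.angle(spectrum)

    # Spectral residual: log amplitude minus its local average.
    residual = log_amplitude - box_filter(log_amplitude, avg_size)

    # Back to the image domain, keeping the original phase.
    saliency = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    saliency = box_filter(saliency, smooth_size)  # assumption: box, not Gaussian

    # Rescale to [0, 1] for convenience.
    saliency -= saliency.min()
    peak = saliency.max()
    return saliency / peak if peak > 0 else saliency
```

The abstract does not specify how this map enters SLIC's initialisation scheme or search metric, so that step is deliberately left out of the sketch.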




Read More

We present a method for decomposing the 3D scene flow observed from a moving stereo rig into stationary scene elements and dynamic object motion. Our unsupervised learning framework jointly reasons about the camera motion, optical flow, and 3D motion of moving objects. Three cooperating networks predict stereo matching, camera motion, and residual flow, which represents the flow component due to object motion and not from camera motion. Based on rigid projective geometry, the estimated stereo depth is used to guide the camera motion estimation, and the depth and camera motion are used to guide the residual flow estimation. We also explicitly estimate the 3D scene flow of dynamic objects based on the residual flow and scene depth. Experiments on the KITTI dataset demonstrate the effectiveness of our approach and show that our method outperforms other state-of-the-art algorithms on the optical flow and visual odometry tasks.
Superpixel algorithms are a common pre-processing step for computer vision algorithms such as segmentation, object tracking and localization. Many superpixel methods rely only on color features for segmentation, limiting performance in low-contrast regions and applicability to infrared or medical images where object boundaries have wide appearance variability. We study the inclusion of deep image features in the SLIC superpixel algorithm to exploit higher-level image representations. In addition, we devise a trainable superpixel algorithm, yielding an intermediate domain-specific image representation that can be applied to different tasks. A clustering-based superpixel algorithm is transformed into a pixel-wise classification task and superpixel training data is derived from semantic segmentation datasets. Our results demonstrate that this approach is able to improve superpixel quality consistently.
Computer vision algorithms with pixel-wise labeling tasks, such as semantic segmentation and salient object detection, have gone through a significant accuracy increase with the incorporation of deep learning. Deep segmentation methods slightly modify and fine-tune pre-trained networks that have hundreds of millions of parameters. In this work, we question the need to have such memory-demanding networks for the specific task of salient object segmentation. To this end, we propose a way to learn a memory-efficient network from scratch by training it only on salient object detection datasets. Our method encodes images to gridized superpixels that preserve both the object boundaries and the connectivity rules of regular pixels. This representation allows us to use convolutional neural networks that operate on regular grids. By using these encoded images, we train a memory-efficient network using only 0.048% of the number of parameters that other deep salient object detection networks have. Our method shows comparable accuracy with the state-of-the-art deep salient object detection methods and provides a faster and much more memory-efficient alternative to them. Due to its easy deployment, such a network is preferable for applications on memory-limited devices such as mobile phones and IoT devices.
Examining locomotion has improved our basic understanding of motor control and aided in treating motor impairment. Mice and rats are the model system of choice for basic neuroscience studies of human disease. High frame rates are needed to quantify the kinematics of running rodents, due to their high stride frequency. Manual tracking, especially for multiple body landmarks, becomes extremely time-consuming. To overcome these limitations, we propose the use of superpixel-based image segmentation, as superpixels use both spatial and color information for segmentation. We segmented several body parts and tested the success of segmentation as a function of color space and SLIC segment size. We used a simple merging function to connect segmented regions that are neighbors and share the same intensity value range. In addition, 28 features were extracted, and t-SNE was used to demonstrate how well the methods can differentiate the regions. Finally, we compared the segmented regions to a manually outlined region. The results showed that, for segmentation, using the RGB image was slightly better than using the hue channel. For merging and classification, however, the hue representation was better, as it captures the relevant color information in a single channel.
Superpixel segmentation has recently seen important progress, benefiting from the advances in differentiable deep learning. However, very high-resolution superpixel segmentation remains challenging due to its expensive memory and computation cost, which current advanced superpixel networks cannot handle. In this paper, we devise Patch Calibration Networks (PCNet), aiming to efficiently and accurately implement high-resolution superpixel segmentation. PCNet follows the principle of producing high-resolution output from low-resolution input to save GPU memory and reduce computation cost. To recall the fine details destroyed by the down-sampling operation, we propose a novel Decoupled Patch Calibration (DPC) branch to collaboratively augment the main superpixel generation branch. In particular, DPC takes a local patch from the high-resolution images and dynamically generates a binary mask that forces the network to focus on region boundaries. By sharing the parameters of the DPC and main branches, the fine-detailed knowledge learned from high-resolution patches is transferred to help calibrate the destroyed information. To the best of our knowledge, we make the first attempt to consider deep-learning-based superpixel generation for high-resolution cases. To facilitate this research, we build evaluation benchmarks from two public datasets and one newly constructed one, covering a wide range of diversity from fine-grained human parts to cityscapes. Extensive experiments demonstrate that our PCNet not only performs favorably against the state of the art in quantitative results but also improves the resolution upper bound from 3K to 5K on 1080Ti GPUs.