Superpixelizing Binary MRF for Image Labeling Problems

337 0 0.0 ( 0 )

Download Cite

Added by Junyan Wang

Publication date 2015

fields Informatics Engineering

and research's language is English

Authors Junyan Wang - Sai-Kit Yeung

Computer Vision and Pattern Recognition

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Superpixels have become prevalent in computer vision. They have been used to achieve satisfactory performance at a significantly smaller computational cost for various tasks. People have also combined superpixels with Markov random field (MRF) models. However, it often takes additional effort to formulate MRF on superpixel-level, and to the best of our knowledge there exists no principled approach to obtain this formulation. In this paper, we show how generic pixel-level binary MRF model can be solved in the superpixel space. As the main contribution of this paper, we show that a superpixel-level MRF can be derived from the pixel-level MRF by substituting the superpixel representation of the pixelwise label into the original pixel-level MRF energy. The resultant superpixel-level MRF energy also remains submodular for a submodular pixel-level MRF. The derived formula hence gives us a handy way to formulate MRF energy in superpixel-level. In the experiments, we demonstrate the efficacy of our approach on several computer vision problems.

rate research

Classification of distributed binary labeling problems

151 - Alkida Balliu , Sebastian Brandt , Yuval Efron 2019

We present a complete classification of the deterministic distributed time complexity for a family of graph problems: binary labeling problems in trees. These are locally checkable problems that can be encoded with an alphabet of size two in the edge labeling formalism. Examples of binary labeling problems include sinkless orientation, sinkless and sourceless orientation, 2-vertex coloring, perfect matching, and the task of coloring edges red and blue such that all nodes are incident to at least one red and at least one blue edge. More generally, we can encode e.g. any cardinality constraints on indegrees and outdegrees. We study the deterministic time complexity of solving a given binary labeling problem in trees, in the usual LOCAL model of distributed computing. We show that the complexity of any such problem is in one of the following classes: $O(1)$, $Theta(log n)$, $Theta(n)$, or unsolvable. In particular, a problem that can be represented in the binary labeling formalism cannot have time complexity $Theta(log^* n)$, and hence we know that e.g. any encoding of maximal matchings has to use at least three labels (which is tight). Furthermore, given the description of any binary labeling problem, we can easily determine in which of the four classes it is and what is an asymptotically optimal algorithm for solving it. Hence the distributed time complexity of binary labeling problems is decidable, not only in principle, but also in practice: there is a simple and efficient algorithm that takes the description of a binary labeling problem and outputs its distributed time complexity.

Distributed Parallel and Cluster Computing Computational Complexity

A Compact Linear Programming Relaxation for Binary Sub-modular MRF

323 - Junyan Wang , Sai-Kit Yeung 2014

We propose a novel compact linear programming (LP) relaxation for binary sub-modular MRF in the context of object segmentation. Our model is obtained by linearizing an $l_1^+$-norm derived from the quadratic programming (QP) form of the MRF energy. The resultant LP model contains significantly fewer variables and constraints compared to the conventional LP relaxation of the MRF energy. In addition, unlike QP which can produce ambiguous labels, our model can be viewed as a quasi-total-variation minimization problem, and it can therefore preserve the discontinuities in the labels. We further establish a relaxation bound between our LP model and the conventional LP model. In the experiments, we demonstrate our method for the task of interactive object segmentation. Our LP model outperforms QP when converting the continuous labels to binary labels using different threshold values on the entire Oxford interactive segmentation dataset. The computational complexity of our LP is of the same order as that of the QP, and it is significantly lower than the conventional LP relaxation.

Computer Vision and Pattern Recognition

SPICE: Semantic Pseudo-labeling for Image Clustering

350 - Chuang Niu , Ge Wang 2021

This paper presents SPICE, a Semantic Pseudo-labeling framework for Image ClustEring. Instead of using indirect loss functions required by the recently proposed methods, SPICE generates pseudo-labels via self-learning and directly uses the pseudo-label-based classification loss to train a deep clustering network. The basic idea of SPICE is to synergize the discrepancy among semantic clusters, the similarity among instance samples, and the semantic consistency of local samples in an embedding space to optimize the clustering network in a semantically-driven paradigm. Specifically, a semantic-similarity-based pseudo-labeling algorithm is first proposed to train a clustering network through unsupervised representation learning. Given the initial clustering results, a local semantic consistency principle is used to select a set of reliably labeled samples, and a semi-pseudo-labeling algorithm is adapted for performance boosting. Extensive experiments demonstrate that SPICE clearly outperforms the state-of-the-art methods on six common benchmark datasets including STL10, Cifar10, Cifar100-20, ImageNet-10, ImageNet-Dog, and Tiny-ImageNet. On average, our SPICE method improves the current best results by about 10% in terms of adjusted rand index, normalized mutual information, and clustering accuracy.

Computer Vision and Pattern Recognition Artificial Intelligence

Image Labeling on a Network: Using Social-Network Metadata for Image Classification

363 - Julian McAuley , Jure Leskovec 2012

Large-scale image retrieval benchmarks invariably consist of images from the Web. Many of these benchmarks are derived from online photo sharing networks, like Flickr, which in addition to hosting images also provide a highly interactive social community. Such communities generate rich metadata that can naturally be harnessed for image classification and retrieval. Here we study four popular benchmark datasets, extending them with social-network metadata, such as the groups to which each image belongs, the comment thread associated with the image, who uploaded it, their location, and their network of friends. Since these types of data are inherently relational, we propose a model that explicitly accounts for the interdependencies between images sharing common properties. We model the task as a binary labeling problem on a network, and use structured learning techniques to learn model parameters. We find that social-network metadata are useful in a variety of classification tasks, in many cases outperforming methods based on image content.

Computer Vision and Pattern Recognition Social and Information Networks Physics and Society

A Neural Markovian Multiresolution Image Labeling Algorithm

50 - John Mashford , Brad Lane , Vic Ciesielski 2018

This paper describes the results of formally evaluating the MCV (Markov concurrent vision) image labeling algorithm which is a (semi-) hierarchical algorithm commencing with a partition made up of single pixel regions and merging regions or subsets of regions using a Markov random field (MRF) image model. It is an example of a general approach to computer vision called concurrent vision in which the operations of image segmentation and image classification are carried out concurrently. While many image labeling algorithms output a single partition, or segmentation, the MCV algorithm outputs a sequence of partitions and this more elaborate structure may provide information that is valuable for higher level vision systems. With certain types of MRF the component of the system for image evaluation can be implemented as a hardwired feed forward neural network. While being applicable to images (i.e. 2D signals), the algorithm is equally applicable to 1D signals (e.g. speech) or 3D signals (e.g. video sequences) (though its performance in such domains remains to be tested). The algorithm is assessed using subjective and objective criteria with very good results.

Computer Vision and Pattern Recognition