The incompatibility between image descriptors and ranking is often neglected in image retrieval. In this paper, manifold learning and Gestalt psychology theory are used to address this incompatibility problem. A new holistic descriptor based on Gestalt psychology, called the Perceptual Uniform Descriptor (PUD), is proposed; it combines color and gradient direction to imitate human visual uniformity. In most cases, the PUD features of images in the same class are distributed on one manifold, because PUD improves on the visual uniformity of traditional descriptors. Thus, we use manifold ranking and PUD to realize image retrieval. Experiments were carried out on five benchmark data sets, and the proposed method greatly improves the accuracy of image retrieval. Our experimental results on the Ukbench and Corel-1K datasets show that, using PUD with only 280 dimensions, the N-S score reached 3.58 (HSV: 3.4) and the mAP reached 81.77% (ODBTC: 77.9%), respectively. These results are higher than those of other holistic image descriptors (and even some local ones) and of state-of-the-art retrieval methods.
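As a rough illustration of the manifold ranking step mentioned above, the following sketch uses the standard iterative update f <- alpha * S * f + (1 - alpha) * y. The abstract does not say how the affinity graph is built, so the Gaussian affinity, the values of `alpha` and `sigma`, and the function name `manifold_ranking` are all assumptions made for this example, and the toy 2-D vectors stand in for real 280-dimensional PUD features.

```python
import math


def manifold_ranking(features, query_index, alpha=0.9, sigma=1.0, iterations=200):
    """Score every item against one query by iterating f <- alpha*S*f + (1-alpha)*y.

    Assumes at least two items. The affinity and parameters are illustrative choices.
    """
    n = len(features)
    # Gaussian affinity between every pair of descriptors, zero on the diagonal.
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
                W[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    # Symmetric normalisation S = D^{-1/2} W D^{-1/2}.
    deg = [sum(row) for row in W]
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    # y marks the query; f holds the ranking scores being propagated.
    y = [1.0 if i == query_index else 0.0 for i in range(n)]
    f = y[:]
    for _ in range(iterations):
        f = [alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return f


# Toy data: items 0-2 form one cluster, items 3-4 another; item 0 is the query.
toy = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.0], [5.0, 5.0], [5.1, 5.0]]
scores = manifold_ranking(toy, query_index=0)
```

In this toy setting, items lying on the query's cluster receive much larger scores than items in the distant cluster, which is the behaviour the abstract relies on when same-class features share a manifold.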
The re-ranking approach leverages high-confidence retrieved samples to refine retrieval results and has been widely adopted as a post-processing tool for image retrieval tasks. However, we notice one main flaw of re-ranking, i.e., high computational complexity, which leads to an unaffordable time cost for real-world applications. In this paper, we revisit re-ranking and demonstrate that it can be reformulated as a highly parallel Graph Neural Network (GNN) function. In particular, we divide the conventional re-ranking process into two phases, i.e., retrieving high-quality gallery samples and updating features. We argue that the first phase is equivalent to building the k-nearest neighbor graph, while the second phase can be viewed as spreading messages within the graph. In practice, the GNN only needs to consider vertices connected by edges. Since the graph is sparse, we can efficiently update the vertex features. On the Market-1501 dataset, we accelerate the re-ranking process from 89.2s to 9.4ms with one K40m GPU, facilitating real-time post-processing. Similarly, we observe that our method achieves comparable or even better retrieval results on four other image retrieval benchmarks, i.e., VeRi-776, Oxford-5k, Paris-6k and University-1652, with limited time cost. Our code is publicly available.
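The two phases described above can be sketched in a few lines. This is a simplified CPU illustration and not the authors' GPU implementation: the dot-product similarity, the mean aggregation, the `weight` parameter, and the function names `knn_graph` and `propagate` are assumptions chosen for this example.

```python
def knn_graph(features, k):
    """Phase 1: for each vertex, list the k most similar other vertices (dot product)."""
    n = len(features)
    neighbours = []
    for i in range(n):
        sims = [(sum(a * b for a, b in zip(features[i], features[j])), j)
                for j in range(n) if j != i]
        sims.sort(reverse=True)
        neighbours.append([j for _, j in sims[:k]])
    return neighbours


def propagate(features, neighbours, weight=0.5):
    """Phase 2: blend each feature with the mean of its neighbours, then L2-normalise."""
    updated = []
    for i, feat in enumerate(features):
        nbrs = neighbours[i]
        mean = [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(len(feat))]
        mixed = [(1 - weight) * x + weight * m for x, m in zip(feat, mean)]
        norm = sum(v * v for v in mixed) ** 0.5
        updated.append([v / norm for v in mixed])
    return updated


# Toy gallery: vertices 0 and 1 are similar, as are vertices 2 and 3.
gallery = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
graph = knn_graph(gallery, k=1)
refined = propagate(gallery, graph)
```

Because each vertex only reads the features of its k neighbours, the update touches a sparse set of edges, which is what makes the step easy to parallelise.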
With the rapid growth of web images, hashing has received increasing interest in large-scale image retrieval. Research efforts have been devoted to learning compact binary codes that preserve semantic similarity based on labels. However, most of these hashing methods are designed to handle simple binary similarity. The complex multilevel semantic structure of images associated with multiple labels has not yet been well explored. Here we propose a deep semantic ranking based method for learning hash functions that preserve multilevel semantic similarity between multi-label images. In our approach, a deep convolutional neural network is incorporated into the hash functions to jointly learn feature representations and the mappings from them to hash codes, which avoids the limited semantic representation power of hand-crafted features. Meanwhile, a ranking list that encodes the multilevel similarity information is employed to guide the learning of such deep hash functions. An effective scheme based on a surrogate loss is used to solve the intractable optimization problem posed by the nonsmooth and multivariate ranking measures involved in the learning procedure. Experimental results show the superiority of our proposed approach over several state-of-the-art hashing methods in terms of ranking evaluation metrics when tested on multi-label image datasets.
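To make the idea of a ranking list that encodes multilevel similarity more concrete, here is a minimal sketch. It assumes that the similarity level between two images is the number of labels they share; the abstract does not define the level this way, so that definition, the label names, and the function names are hypothetical.

```python
def similarity_level(query_labels, item_labels):
    """Assumed similarity level: the number of labels two images have in common."""
    return len(set(query_labels) & set(item_labels))


def ground_truth_ranking(query_labels, database_labels):
    """Indices of database items sorted by descending similarity level (ties keep order)."""
    levels = [similarity_level(query_labels, labels) for labels in database_labels]
    return sorted(range(len(levels)), key=lambda i: -levels[i])


# Toy multi-label database with a three-label query.
query = {"sky", "sea", "boat"}
database = [{"sky"}, {"sky", "sea", "boat"}, {"car"}, {"sea", "boat"}]
ranking = ground_truth_ranking(query, database)
```

A list of this kind distinguishes more than "similar" versus "dissimilar", which is the information a binary similarity label would discard.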
Image retrieval based on deep convolutional features has demonstrated state-of-the-art performance on popular benchmarks. In this paper, we present a unified solution to address deep convolutional feature aggregation and image re-ranking by simulating the dynamics of heat diffusion. A distinctive problem in image retrieval is that repetitive or bursty features tend to dominate final image representations, resulting in less distinguishable representations. We show that by considering each deep feature as a heat source, our unsupervised aggregation method is able to avoid over-representation of bursty features. We additionally provide a practical solution for the proposed aggregation method and further show the efficiency of our method in experimental evaluation. Inspired by the aforementioned deep feature aggregation method, we also propose a method to re-rank a number of top-ranked images for a given query image by considering the query as the heat source. Finally, we extensively evaluate the proposed approach with pre-trained and fine-tuned deep networks on common public benchmarks and show superior performance compared to previous work.
Image copy detection is a challenging and appealing topic in computer vision and signal processing. Recent advances in multimedia have made the distribution of images across the globe easy and fast, which leads to other issues such as forgery and image copy retrieval. Local keypoint descriptors such as SIFT are used to represent the images, and images are matched and retrieved by matching those descriptors. Features are quantized so that searching and matching become feasible for large databases, at the cost of some accuracy. In this paper, we propose a binary feature obtained by quantizing SIFT descriptors into binary form, and the rank list is re-examined to remove false positives. Experiments on a challenging dataset show gains in accuracy and time.
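A minimal sketch of turning a real-valued descriptor into a binary code and comparing codes by Hamming distance is given below. The abstract does not specify the quantization rule, so thresholding each component at the descriptor's median is an assumption made for illustration, as are the function names; a short 4-component vector stands in for a 128-dimensional SIFT descriptor.

```python
import statistics


def binarize(descriptor):
    """Pack a descriptor into an int: bit i is 1 when component i exceeds the median."""
    threshold = statistics.median(descriptor)
    code = 0
    for i, value in enumerate(descriptor):
        if value > threshold:
            code |= 1 << i
    return code


def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return bin(a ^ b).count("1")


code_a = binarize([1, 5, 2, 8])   # median 3.5, so components 1 and 3 set their bits
code_b = binarize([1, 5, 9, 2])   # median 3.5, so components 1 and 2 set their bits
distance = hamming(code_a, code_b)
```

Comparing codes with XOR and a bit count is far cheaper than computing Euclidean distances between full descriptors, which is where the time gain of binary features comes from.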
Single image dehazing, which aims to recover the clear image solely from an input hazy or foggy image, is a challenging ill-posed problem. An analysis of existing approaches shows that their common key step is to estimate the haze density of each pixel. To this end, many approaches rely on heuristically designed haze-relevant features. Several recent works instead learn the features automatically by directly exploiting Convolutional Neural Networks (CNNs). However, this may be insufficient to fully capture the intrinsic attributes of hazy images. To obtain effective features for single image dehazing, this paper presents a novel Ranking Convolutional Neural Network (Ranking-CNN). In Ranking-CNN, a novel ranking layer is proposed to extend the structure of the CNN so that the statistical and structural attributes of hazy images can be captured simultaneously. By training Ranking-CNN in a well-designed manner, powerful haze-relevant features can be learned automatically from massive numbers of hazy image patches. Based on these features, haze can be effectively removed by using a haze density prediction model trained with random forest regression. Experimental results show that our approach outperforms several previous dehazing approaches on synthetic and real-world benchmark images. Comprehensive analyses are also conducted to interpret the proposed Ranking-CNN from both theoretical and experimental perspectives.