Towards Optimal Discrete Online Hashing with Balanced Similarity

60 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Mingbao Lin

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Mingbao Lin - Rongrong Ji - Hong Liu

استرجاع المعلومات التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

When facing large-scale image datasets, online hashing serves as a promising solution for online retrieval and prediction tasks. It encodes the online streaming data into compact binary codes, and simultaneously updates the hash functions to renew codes of the existing dataset. To this end, the existing methods update hash functions solely based on the new data batch, without investigating the correlation between such new data and the existing dataset. In addition, existing works update the hash functions using a relaxation process in its corresponding approximated continuous space. And it remains as an open problem to directly apply discrete optimizations in online hashing. In this paper, we propose a novel supervised online hashing method, termed Balanced Similarity for Online Discrete Hashing (BSODH), to solve the above problems in a unified framework. BSODH employs a well-designed hashing algorithm to preserve the similarity between the streaming data and the existing dataset via an asymmetric graph regularization. We further identify the data-imbalance problem brought by the constructed asymmetric graph, which restricts the application of discrete optimization in our problem. Therefore, a novel balanced similarity is further proposed, which uses two equilibrium factors to balance the similar and dissimilar weights and eventually enables the usage of discrete optimizations. Extensive experiments conducted on three widely-used benchmarks demonstrate the advantages of the proposed method over the state-of-the-art methods.

قيم البحث

اقرأ أيضاً

Deep Discrete Supervised Hashing

72 - Qing-Yuan Jiang , Xue Cui , Wu-Jun Li 2017

Hashing has been widely used for large-scale search due to its low storage cost and fast query speed. By using supervised information, supervised hashing can significantly outperform unsupervised hashing. Recently, discrete supervised hashing and dee p hashing are two representative progresses in supervised hashing. On one hand, hashing is essentially a discrete optimization problem. Hence, utilizing supervised information to directly guide discrete (binary) coding procedure can avoid sub-optimal solution and improve the accuracy. On the other hand, deep hashing, which integrates deep feature learning and hash-code learning into an end-to-end architecture, can enhance the feedback between feature learning and hash-code learning. The key in discrete supervised hashing is to adopt supervised information to directly guide the discrete coding procedure in hashing. The key in deep hashing is to adopt the supervised information to directly guide the deep feature learning procedure. However, there have not existed works which can use the supervised information to directly guide both discrete coding procedure and deep feature learning procedure in the same framework. In this paper, we propose a novel deep hashing method, called deep discrete supervised hashing (DDSH), to address this problem. DDSH is the first deep hashing method which can utilize supervised information to directly guide both discrete coding procedure and deep feature learning procedure, and thus enhance the feedback between these two important procedures. Experiments on three real datasets show that DDSH can outperform other state-of-the-art baselines, including both discrete hashing and deep hashing baselines, for image retrieval.

استرجاع المعلومات

Discrete Latent Factor Model for Cross-Modal Hashing

123 - Qing-Yuan Jiang , Wu-Jun Li 2017

Due to its storage and retrieval efficiency, cross-modal hashing~(CMH) has been widely used for cross-modal similarity search in multimedia applications. According to the training strategy, existing CMH methods can be mainly divided into two categori es: relaxation-based continuous methods and discrete methods. In general, the training of relaxation-based continuous methods is faster than discrete methods, but the accuracy of relaxation-based continuous methods is not satisfactory. On the contrary, the accuracy of discrete methods is typically better than relaxation-based continuous methods, but the training of discrete methods is time-consuming. In this paper, we propose a novel CMH method, called discrete latent factor model based cross-modal hashing~(DLFH), for cross modal similarity search. DLFH is a discrete method which can directly learn the binary hash codes for CMH. At the same time, the training of DLFH is efficient. Experiments on real datasets show that DLFH can achieve significantly better accuracy than existing methods, and the training time of DLFH is comparable to that of relaxation-based continuous methods which are much faster than existing discrete methods.

استرجاع المعلومات

FRESH: Frechet Similarity with Hashing

70 - Matteo Ceccarello , Anne Driemel , Francesco Silvestri 2018

This paper studies the $r$-range search problem for curves under the continuous Frechet distance: given a dataset $S$ of $n$ polygonal curves and a threshold $r>0$, construct a data structure that, for any query curve $q$, efficiently returns all ent ries in $S$ with distance at most $r$ from $q$. We propose FRESH, an approximate and randomized approach for $r$-range search, that leverages on a locality sensitive hashing scheme for detecting candidate near neighbors of the query curve, and on a subsequent pruning step based on a cascade of curve simplifications. We experimentally compare fresh to exact and deterministic solutions, and we show that high performance can be reached by suitably relaxing precision and recall.

الهندسة الحسابية بنى وهياكل البيانات والخوارزميات

Unsupervised Multi-Index Semantic Hashing

55 - Christian Hansen , Casper Hansen , Jakob Grue Simonsen 2021

Semantic hashing represents documents as compact binary vectors (hash codes) and allows both efficient and effective similarity search in large-scale information retrieval. The state of the art has primarily focused on learning hash codes that improv e similarity search effectiveness, while assuming a brute-force linear scan strategy for searching over all the hash codes, even though much faster alternatives exist. One such alternative is multi-index hashing, an approach that constructs a smaller candidate set to search over, which depending on the distribution of the hash codes can lead to sub-linear search time. In this work, we propose Multi-Index Semantic Hashing (MISH), an unsupervised hashing model that learns hash codes that are both effective and highly efficient by being optimized for multi-index hashing. We derive novel training objectives, which enable to learn hash codes that reduce the candidate sets produced by multi-index hashing, while being end-to-end trainable. In fact, our proposed training objectives are model agnostic, i.e., not tied to how the hash codes are generated specifically in MISH, and are straight-forward to include in existing and future semantic hashing models. We experimentally compare MISH to state-of-the-art semantic hashing baselines in the task of document similarity search. We find that even though multi-index hashing also improves the efficiency of the baselines compared to a linear scan, they are still upwards of 33% slower than MISH, while MISH is still able to obtain state-of-the-art effectiveness.

استرجاع المعلومات التعلم الآلي

Exploring Auxiliary Context: Discrete Semantic Transfer Hashing for Scalable Image Retrieval

78 - Lei Zhu , Zi Huang , Zhihui Li 2019

Unsupervised hashing can desirably support scalable content-based image retrieval (SCBIR) for its appealing advantages of semantic label independence, memory and search efficiency. However, the learned hash codes are embedded with limited discriminat ive semantics due to the intrinsic limitation of image representation. To address the problem, in this paper, we propose a novel hashing approach, dubbed as emph{Discrete Semantic Transfer Hashing} (DSTH). The key idea is to emph{directly} augment the semantics of discrete image hash codes by exploring auxiliary contextual modalities. To this end, a unified hashing framework is formulated to simultaneously preserve visual similarities of images and perform semantic transfer from contextual modalities. Further, to guarantee direct semantic transfer and avoid information loss, we explicitly impose the discrete constraint, bit--uncorrelation constraint and bit-balance constraint on hash codes. A novel and effective discrete optimization method based on augmented Lagrangian multiplier is developed to iteratively solve the optimization problem. The whole learning process has linear computation complexity and desirable scalability. Experiments on three benchmark datasets demonstrate the superiority of DSTH compared with several state-of-the-art approaches.

استرجاع المعلومات