ترغب بنشر مسار تعليمي؟ اضغط هنا

PyRetri: A PyTorch-based Library for Unsupervised Image Retrieval by Deep Convolutional Neural Networks

87   0   0.0 ( 0 )
 نشر من قبل Xiu-Shen Wei
 تاريخ النشر 2020
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Despite significant progress of applying deep learning methods to the field of content-based image retrieval, there has not been a software library that covers these methods in a unified manner. In order to fill this gap, we introduce PyRetri, an open source library for deep learning based unsupervised image retrieval. The library encapsulates the retrieval process in several stages and provides functionality that covers various prominent methods for each stage. The idea underlying its design is to provide a unified platform for deep learning based image retrieval research, with high usability and extensibility. To the best of our knowledge, this is the first open-source library for unsupervised image retrieval by deep learning.

قيم البحث

اقرأ أيضاً

Deep neural networks have achieved state-of-the-art results in various vision and/or language tasks. Despite the use of large training datasets, most models are trained by iterating over single input-output pairs, discarding the remaining examples fo r the current prediction. In this work, we actively exploit the training data, using the information from nearest training examples to aid the prediction both during training and testing. Specifically, our approach uses the target of the most similar training example to initialize the memory state of an LSTM model, or to guide attention mechanisms. We apply this approach to image captioning and sentiment analysis, respectively through image and text retrieval. Results confirm the effectiveness of the proposed approach for the two tasks, on the widely used Flickr8 and IMDB datasets. Our code is publicly available at http://github.com/RitaRamo/retrieval-augmentation-nn.
254 - Xin Zhong , Frank Y. Shih 2019
Digital image watermarking is the process of embedding and extracting watermark covertly on a carrier image. Incorporating deep learning networks with image watermarking has attracted increasing attention during recent years. However, existing deep l earning-based watermarking systems cannot achieve robustness, blindness, and automated embedding and extraction simultaneously. In this paper, a fully automated image watermarking system based on deep neural networks is proposed to generalize the image watermarking processes. An unsupervised deep learning structure and a novel loss computation are proposed to achieve high capacity and high robustness without any prior knowledge of possible attacks. Furthermore, a challenging application of watermark extraction from camera-captured images is provided to validate the practicality as well as the robustness of the proposed system. Experimental results show the superiority performance of the proposed system as comparing against several currently available techniques.
Convolutional Neural Networks (CNNs) have recently become a favored technique for image denoising due to its adaptive learning ability, especially with a deep configuration. However, their efficacy is inherently limited owing to their homogenous netw ork formation with the unique use of linear convolution. In this study, we propose a heterogeneous network model which allows greater flexibility for embedding additional non-linearity at the core of the data transformation. To this end, we propose the idea of an operational neuron or Operational Neural Networks (ONN), which enables a flexible non-linear and heterogeneous configuration employing both inter and intra-layer neuronal diversity. Furthermore, we propose a robust operator search strategy inspired by the Hebbian theory, called the Synaptic Plasticity Monitoring (SPM) which can make data-driven choices for non-linearities in any architecture. An extensive set of comparative evaluations of ONNs and CNNs over two severe image denoising problems yield conclusive evidence that ONNs enriched by non-linear operators can achieve a superior denoising performance against CNNs with both equivalent and well-known deep configurations.
Several variants of Convolutional Neural Networks (CNN) have been developed for Magnetic Resonance (MR) image reconstruction. Among them, U-Net has shown to be the baseline architecture for MR image reconstruction. However, sub-sampling is performed by its pooling layers, causing information loss which in turn leads to blur and missing fine details in the reconstructed image. We propose a modification to the U-Net architecture to recover fine structures. The proposed network is a wavelet packet transform based encoder-decoder CNN with residual learning called CNN. The proposed WCNN has discrete wavelet transform instead of pooling and inverse wavelet transform instead of unpooling layers and residual connections. We also propose a deep cascaded framework (DC-WCNN) which consists of cascades of WCNN and k-space data fidelity units to achieve high quality MR reconstruction. Experimental results show that WCNN and DC-WCNN give promising results in terms of evaluation metrics and better recovery of fine details as compared to other methods.
Feature fusion is a commonly used strategy in image retrieval tasks, which aggregates the matching responses of multiple visual features. Feasible sets of features can be either descriptors (SIFT, HSV) for an entire image or the same descriptor for d ifferent local parts (face, body). Ideally, the to-be-fused heterogeneous features are pre-assumed to be discriminative and complementary to each other. However, the effectiveness of different features varies dramatically according to different queries. That is to say, for some queries, a feature may be neither discriminative nor complementary to existing ones, while for other queries, the feature suffices. As a result, it is important to estimate the effectiveness of features in a query-adaptive manner. To this end, this article proposes a new late fusion scheme at the score level. We base our method on the observation that the sorted score curves contain patterns that describe their effectiveness. For example, an L-shaped curve indicates that the feature is discriminative while a gradually descending curve suggests a bad feature. As such, this paper introduces a query-adaptive late fusion pipeline. In the hand-crafted version, it can be an unsupervised approach to tasks like particular object retrieval. In the learning version, it can also be applied to supervised tasks like person recognition and pedestrian retrieval, based on a trainable neural module. Extensive experiments are conducted on two object retrieval datasets and one person recognition dataset. We show that our method is able to highlight the good features and suppress the bad ones, is resilient to distractor features, and achieves very competitive retrieval accuracy compared with the state of the art. In an additional person re-identification dataset, the application scope and limitation of the proposed method are studied.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا