Hierarchical Image Classification using Entailment Cone Embeddings

Published by: Ankit Dhall

Publication date: 2020

Research field: Informatics Engineering

Language: English

Authors: Ankit Dhall, Anastasia Makarova, Octavian Ganea

Image classification has been studied extensively, but there has been limited work in using unconventional, external guidance other than traditional image-label pairs for training. We present a set of methods for leveraging information about the semantic hierarchy embedded in class labels. We first inject label-hierarchy knowledge into an arbitrary CNN-based classifier and empirically show that availability of such external semantic information in conjunction with the visual semantics from images boosts overall performance. Taking a step further in this direction, we model more explicitly the label-label and label-image interactions using order-preserving embeddings governed by both Euclidean and hyperbolic geometries, prevalent in natural language, and tailor them to hierarchical image classification and representation learning. We empirically validate all the models on the hierarchical ETHEC dataset.
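The abstract does not spell out the embedding objective. As a rough illustration only, the sketch below scores an ordered pair (x, y) with a Euclidean entailment-cone energy in the style of the entailment-cone literature the title refers to: each point x carries a cone whose half-aperture is arcsin(K / ||x||), and the energy is zero when y lies inside the cone rooted at x. The function names and the constant K = 0.1 are assumptions made for this example, not details taken from the paper.

```python
import math


def half_aperture(x, k=0.1):
    """Half-aperture of the cone rooted at x: arcsin(k / ||x||).

    Assumes ||x|| > 0; the ratio is clamped so that points with ||x|| < k
    do not raise a domain error (k is a hypothetical hyperparameter).
    """
    norm_x = math.sqrt(sum(v * v for v in x))
    return math.asin(min(1.0, k / norm_x))


def cone_angle(x, y):
    """Angle at x between the outward ray through x and the vector y - x."""
    diff = [b - a for a, b in zip(x, y)]
    norm_x = math.sqrt(sum(v * v for v in x))
    norm_d = math.sqrt(sum(v * v for v in diff))
    if norm_d == 0.0:
        return 0.0  # y coincides with x, so it sits at the cone apex
    cos = sum(a * d for a, d in zip(x, diff)) / (norm_x * norm_d)
    return math.acos(max(-1.0, min(1.0, cos)))


def cone_energy(x, y, k=0.1):
    """Zero if y is inside the cone at x, otherwise the angular violation."""
    return max(0.0, cone_angle(x, y) - half_aperture(x, k))


# A child placed further out along the parent's direction is inside the cone;
# a child placed perpendicular to it is penalized.
inside = cone_energy([1.0, 0.0], [2.0, 0.0])
outside = cone_energy([1.0, 0.0], [1.0, 1.0])
```

In a training loop, an energy of this kind would typically be pushed to zero for true parent-child pairs and above a margin for negative pairs; how the paper combines it with image embeddings is not stated in the abstract.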

144 - Mohammad Rostami, Soheil Kolouri, Zak Murez 2019

Zero-shot learning (ZSL) is a framework for classifying images that belong to unseen classes based solely on semantic information about those classes. In this paper, we propose a new ZSL algorithm using coupled dictionary learning. The core idea is that the visual features and the semantic attributes of an image can share the same sparse representation in an intermediate space. We use images from seen classes and semantic attributes from seen and unseen classes to learn two dictionaries that can sparsely represent the visual and semantic feature vectors of an image. In the ZSL testing stage, in the absence of labeled data, images from unseen classes can be mapped into the attribute space by finding the joint sparse representation using only the visual data. The image is then classified in the attribute space given semantic descriptions of the unseen classes. We also provide an attribute-aware formulation to tackle the domain shift and hubness problems in ZSL. Extensive experiments demonstrate the superior performance of our approach against state-of-the-art ZSL algorithms on benchmark ZSL datasets.

Computer Vision and Pattern Recognition, Machine Learning

Yelp Food Identification via Image Feature Extraction and Classification

699 - Fanbo Sun, Zhixiang Gu, Bo Feng 2019

Yelp has been one of the most popular local service search engines in the US since 2004. It is powered by crowd-sourced text and photo reviews. Restaurant customers and business owners upload photos to Yelp to review or advertise food, drinks, or interior and exterior decor. Relying on human editors to label food photos is inefficient, an issue that should be addressed by innovative machine learning approaches. In this paper, we present a simple but effective approach that can identify up to ten kinds of food from raw photos in the challenge dataset. We use 1) image pre-processing techniques, including filtering and image augmentation, 2) feature extraction via convolutional neural networks (CNNs), and 3) three classification algorithms. We then report how classification accuracy changes when tuning the parameters for augmentation, the CNN, and classification. Our experimental results show that this simple but effective approach can identify up to ten food types from images.

Computer Vision and Pattern Recognition, Machine Learning

Reconstruction Regularized Deep Metric Learning for Multi-label Image Classification

241 - Changsheng Li, Chong Liu, Lixin Duan 2020

In this paper, we present a novel deep metric learning method to tackle the multi-label image classification problem. To better learn the correlations among image features, as well as among labels, we explore a latent space in which images and labels are embedded via two separate deep neural networks. To capture the relationships between image features and labels, we aim to learn a two-way deep distance metric over the embedding space from two different views, i.e., the distance between an image and its labels is not only smaller than the distances between the image and its labels' nearest neighbors, but also smaller than the distances between the labels and other images corresponding to the labels' nearest neighbors. Moreover, a reconstruction module for recovering correct labels is incorporated into the framework as a regularization term, so that the label embedding space is more representative. Our model can be trained in an end-to-end manner. Experimental results on publicly available image datasets corroborate the efficacy of our method compared with the state of the art.

Computer Vision and Pattern Recognition, Machine Learning
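The two-way distance constraint in the abstract above can be read as a pair of margin conditions. The sketch below is a generic hinge formulation of that reading, not the loss from the paper: the function name `two_way_hinge`, the `margin` value, and the use of plain Euclidean distance on already-embedded vectors are all assumptions for illustration.

```python
import math


def two_way_hinge(img, label, neighbor_label, neighbor_img, margin=1.0):
    """Generic two-term margin hinge over embedded vectors (illustrative only).

    img            : embedding of an image
    label          : embedding of one of its true labels
    neighbor_label : embedding of a nearby label the image does not carry
    neighbor_img   : embedding of an image tagged with that nearby label
    """
    d_pos = math.dist(img, label)
    # View 1: the image should be closer to its label than to the neighbor label.
    view_image = max(0.0, margin + d_pos - math.dist(img, neighbor_label))
    # View 2: the label should be closer to its image than to the neighbor's image.
    view_label = max(0.0, margin + d_pos - math.dist(label, neighbor_img))
    return view_image + view_label


# Well-separated case: both margin conditions hold, so the loss is zero.
loss_ok = two_way_hinge([0.0, 0.0], [0.0, 1.0], [0.0, 5.0], [4.0, 1.0])
# The neighbor label sits too close to the image, so the first view is violated.
loss_bad = two_way_hinge([0.0, 0.0], [0.0, 1.0], [0.0, 1.5], [4.0, 1.0])
```

The reconstruction regularizer mentioned in the abstract is left out here, since the abstract gives no detail on its form.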

Interpretable Mammographic Image Classification using Case-Based Reasoning and Deep Learning

152 - Alina Jade Barnett, Fides Regina Schwartz, Chaofan Tao 2021

When we deploy machine learning models in high-stakes medical settings, we must ensure these models make accurate predictions that are consistent with known medical science. Inherently interpretable networks address this need by explaining the rationale behind each decision while maintaining equal or higher accuracy compared to black-box models. In this work, we present a novel interpretable neural network algorithm that uses case-based reasoning for mammography. Designed to aid a radiologist in their decisions, our network presents both a prediction of malignancy and an explanation of that prediction using known medical features. In order to yield helpful explanations, the network is designed to mimic the reasoning processes of a radiologist: our network first detects the clinically relevant semantic features of each image by comparing each new image with a learned set of prototypical image parts from the training images, then uses those clinical features to predict malignancy. Compared to other methods, our model detects clinical features (mass margins) with equal or higher accuracy, provides a more detailed explanation of its prediction, and is better able to differentiate the classification-relevant parts of the image.

Computer Vision and Pattern Recognition, Machine Learning

PK-GCN: Prior Knowledge Assisted Image Classification using Graph Convolution Networks

306 - Xueli Xiao, Chunyan Ji, Thosini Bamunu Mudiyanselage 2020

Deep learning has achieved great success in various classification tasks. Typically, deep learning models learn underlying features directly from data, and no underlying relationships between classes are included. Similarity between classes can influence classification performance. In this article, we propose a method that incorporates class similarity knowledge into convolutional neural network models using a graph convolution layer. We evaluate our method on two benchmark image datasets, MNIST and CIFAR10, and analyze the results for different data and model sizes. Experimental results show that our model can improve classification accuracy, especially when the amount of available data is small.

Computer Vision and Pattern Recognition, Machine Learning, Image and Video Processing
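The PK-GCN abstract above does not say how its graph convolution layer is defined. As a minimal sketch, assuming the widely used symmetric-normalization form relu(D^{-1/2} (A + I) D^{-1/2} H W), the code below propagates per-class feature rows over a class-similarity matrix. The helper names, the toy similarity matrix, and the identity weights are illustrative assumptions, not details from the paper.

```python
import math


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def normalize_adjacency(sim):
    """Add self-loops, then scale each entry by 1 / sqrt(deg_i * deg_j).

    Assumes sim is a symmetric, non-negative class-similarity matrix
    with a zero diagonal.
    """
    n = len(sim)
    a = [[sim[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def graph_conv(sim, features, weights):
    """One graph convolution step: relu(A_hat @ features @ weights)."""
    a_hat = normalize_adjacency(sim)
    out = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, v) for v in row] for row in out]


# Two classes marked as similar to each other; one-hot class features.
sim = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0, 0.0], [0.0, 1.0]]
weights = [[1.0, 0.0], [0.0, 1.0]]
mixed = graph_conv(sim, features, weights)
```

With this toy input each class row becomes an even mix of both classes, which is the kind of sharing between similar classes the abstract alludes to; in practice the weights would be learned jointly with the CNN.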