Quantifying error contributions of computational steps, algorithms and hyperparameter choices in image classification pipelines

45 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Aritra Chowdhury

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Aritra Chowdhury - Malik Magdin-Ismail - Bulent Yener

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Data science relies on pipelines that are organized in the form of interdependent computational steps. Each step consists of various candidate algorithms that maybe used for performing a particular function. Each algorithm consists of several hyperparameters. Algorithms and hyperparameters must be optimized as a whole to produce the best performance. Typical machine learning pipelines typically consist of complex algorithms in each of the steps. Not only is the selection process combinatorial, but it is also important to interpret and understand the pipelines. We propose a method to quantify the importance of different layers in the pipeline, by computing an error contribution relative to an agnostic choice of algorithms in that layer. We demonstrate our methodology on image classification pipelines. The agnostic methodology quantifies the error contributions from the computational steps, algorithms and hyperparameters in the image classification pipeline. We show that algorithm selection and hyper-parameter optimization methods can be used to quantify the error contribution and that random search is able to quantify the contribution more accurately than Bayesian optimization. This methodology can be used by domain experts to understand machine learning and data analysis pipelines in terms of their individual components, which can help in prioritizing different components of the pipeline.

قيم البحث

126 - Aritra Chowdhury , Malik Magdon-Ismail , Bulent Yener 2019

Data science relies on pipelines that are organized in the form of interdependent computational steps. Each step consists of various candidate algorithms that maybe used for performing a particular function. Each algorithm consists of several hyperpa rameters. Algorithms and hyperparameters must be optimized as a whole to produce the best performance. Typical machine learning pipelines consist of complex algorithms in each of the steps. Not only is the selection process combinatorial, but it is also important to interpret and understand the pipelines. We propose a method to quantify the importance of different components in the pipeline, by computing an error contribution relative to an agnostic choice of computational steps, algorithms and hyperparameters. We also propose a methodology to quantify the propagation of error from individual components of the pipeline with the help of a naive set of benchmark algorithms not involved in the pipeline. We demonstrate our methodology on image classification pipelines. The agnostic and naive methodologies quantify the error contribution and propagation respectively from the computational steps, algorithms and hyperparameters in the image classification pipeline. We show that algorithm selection and hyperparameter optimization methods like grid search, random search and Bayesian optimization can be used to quantify the error contribution and propagation, and that random search is able to quantify them more accurately than Bayesian optimization. This methodology can be used by domain experts to understand machine learning and data analysis pipelines in terms of their individual components, which can help in prioritizing different components of the pipeline.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي التعلم الالي

Yelp Food Identification via Image Feature Extraction and Classification

699 - Fanbo Sun , Zhixiang Gu , Bo Feng 2019

Yelp has been one of the most popular local service search engine in US since 2004. It is powered by crowd-sourced text reviews and photo reviews. Restaurant customers and business owners upload photo images to Yelp, including reviewing or advertisin g either food, drinks, or inside and outside decorations. It is obviously not so effective that labels for food photos rely on human editors, which is an issue should be addressed by innovative machine learning approaches. In this paper, we present a simple but effective approach which can identify up to ten kinds of food via raw photos from the challenge dataset. We use 1) image pre-processing techniques, including filtering and image augmentation, 2) feature extraction via convolutional neural networks (CNN), and 3) three ways of classification algorithms. Then, we illustrate the classification accuracy by tuning parameters for augmentations, CNN, and classification. Our experimental results show this simple but effective approach to identify up to 10 food types from images.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي التعلم الالي

Hierarchical Image Classification using Entailment Cone Embeddings

156 - Ankit Dhall , Anastasia Makarova , Octavian Ganea 2020

Image classification has been studied extensively, but there has been limited work in using unconventional, external guidance other than traditional image-label pairs for training. We present a set of methods for leveraging information about the sema ntic hierarchy embedded in class labels. We first inject label-hierarchy knowledge into an arbitrary CNN-based classifier and empirically show that availability of such external semantic information in conjunction with the visual semantics from images boosts overall performance. Taking a step further in this direction, we model more explicitly the label-label and label-image interactions using order-preserving embeddings governed by both Euclidean and hyperbolic geometries, prevalent in natural language, and tailor them to hierarchical image classification and representation learning. We empirically validate all the models on the hierarchical ETHEC dataset.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي التعلم الالي

Zero-Shot Image Classification Using Coupled Dictionary Embedding

144 - Mohammad Rostami , Soheil Kolouri , Zak Murez 2019

Zero-shot learning (ZSL) is a framework to classify images belonging to unseen classes based on solely semantic information about these unseen classes. In this paper, we propose a new ZSL algorithm using coupled dictionary learning. The core idea is that the visual features and the semantic attributes of an image can share the same sparse representation in an intermediate space. We use images from seen classes and semantic attributes from seen and unseen classes to learn two dictionaries that can represent sparsely the visual and semantic feature vectors of an image. In the ZSL testing stage and in the absence of labeled data, images from unseen classes can be mapped into the attribute space by finding the joint sparse representation using solely the visual data. The image is then classified in the attribute space given semantic descriptions of unseen classes. We also provide an attribute-aware formulation to tackle domain shift and hubness problems in ZSL. Extensive experiments are provided to demonstrate the superior performance of our approach against the state of the art ZSL algorithms on benchmark ZSL datasets.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي التعلم الالي

Semantic Redundancies in Image-Classification Datasets: The 10% You Dont Need

76 - Vighnesh Birodkar , Hossein Mobahi , Samy Bengio 2019

Large datasets have been crucial to the success of deep learning models in the recent years, which keep performing better as they are trained with more labelled data. While there have been sustained efforts to make these models more data-efficient, t he potential benefit of understanding the data itself, is largely untapped. Specifically, focusing on object recognition tasks, we wonder if for common benchmark datasets we can do better than random subsets of the data and find a subset that can generalize on par with the full dataset when trained on. To our knowledge, this is the first result that can find notable redundancies in CIFAR-10 and ImageNet datasets (at least 10%). Interestingly, we observe semantic correlations between required and redundant images. We hope that our findings can motivate further research into identifying additional redundancies and exploiting them for more efficient training or data-collection.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي التعلم الالي