Low-Shot Validation: Active Importance Sampling for Estimating Classifier Performance on Rare Categories


Published by: Fait Poms

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Fait Poms, Vishnu Sarukkai, Ravi Teja Mullapudi

Computer Vision and Pattern Recognition; Machine Learning


For machine learning models trained with limited labeled training data, validation stands to become the main bottleneck to reducing overall annotation costs. We propose a statistical validation algorithm that accurately estimates the F-score of binary classifiers for rare categories, where finding relevant examples to evaluate on is particularly challenging. Our key insight is that simultaneous calibration and importance sampling enables accurate estimates even in the low-sample regime (< 300 samples). Critically, we also derive an accurate single-trial estimator of the variance of our method and demonstrate that this estimator is empirically accurate at low sample counts, enabling a practitioner to know how well they can trust a given low-sample estimate. When validating state-of-the-art semi-supervised models on ImageNet and iNaturalist2017, our method achieves the same estimates of model performance with up to 10x fewer labels than competing approaches. In particular, we can estimate model F1 scores with a variance of 0.005 using as few as 100 labels.
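The role of importance sampling in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: it omits the calibration step, the active updating of the sampling distribution, and the single-trial variance estimator. It assumes a fixed proposal proportional to the classifier score, mixed with a uniform floor, and all names (`make_synthetic_pool`, `estimate_f1_importance`, `label_fn`, `eps`) and the synthetic Beta-distributed scores are hypothetical choices made for illustration only.

```python
import numpy as np


def make_synthetic_pool(rng, n_items=20000, prevalence=0.01, threshold=0.5):
    """Build a hypothetical unlabeled pool for a rare category (illustration only)."""
    labels = rng.random(n_items) < prevalence
    # Assumed score model: positives tend to score high, negatives low.
    scores = np.where(labels, rng.beta(5, 2, n_items), rng.beta(1, 12, n_items))
    preds = scores > threshold
    return scores, preds, labels


def true_f1(preds, labels):
    """Exact F1 when every label in the pool is known."""
    tp = np.sum(preds & labels)
    return 2.0 * tp / (preds.sum() + labels.sum())


def estimate_f1_importance(scores, preds, label_fn, n_labels, rng, eps=0.05):
    """Estimate F1 from n_labels queried labels using a static proposal.

    scores must be non-negative; label_fn(i) returns the true label of item i
    (this stands in for a human annotator).
    """
    scores = np.asarray(scores, dtype=float)
    preds = np.asarray(preds, dtype=float)
    n_items = len(scores)
    # Proposal: favor likely positives, keep every probability above zero.
    q = (1.0 - eps) * scores / scores.sum() + eps / n_items
    idx = rng.choice(n_items, size=n_labels, replace=True, p=q)
    y = np.array([label_fn(i) for i in idx], dtype=float)
    w = 1.0 / (n_labels * q[idx])        # importance weights
    tp_hat = np.sum(w * y * preds[idx])  # estimated number of true positives
    pos_hat = np.sum(w * y)              # estimated number of actual positives
    denom = preds.sum() + pos_hat        # predicted-positive count is known exactly
    return 2.0 * tp_hat / denom if denom > 0 else 0.0


rng = np.random.default_rng(0)
scores, preds, labels = make_synthetic_pool(rng)
estimate = estimate_f1_importance(scores, preds, lambda i: labels[i], 300, rng)
print(f"true F1 = {true_f1(preds, labels):.3f}, estimate from 300 labels = {estimate:.3f}")
```

Because the number of predicted positives is known without labeling, only the true-positive count and the total positive count need to be estimated. Sampling in proportion to the score concentrates the label budget on the rare category.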


Grant M. Rotskoff, Andrew R. Mitchell, Eric Vanden-Eijnden (2020)

Deep neural networks, when optimized with sufficient data, provide accurate representations of high-dimensional functions; in contrast, function approximation techniques that have predominated in scientific computing do not scale well with dimensionality. As a result, many high-dimensional sampling and approximation problems once thought intractable are being revisited through the lens of machine learning. While the promise of unparalleled accuracy may suggest a renaissance for applications that require parameterizing representations of complex systems, in many applications gathering sufficient data to develop such a representation remains a significant challenge. Here we introduce an approach that combines rare events sampling techniques with neural network optimization to optimize objective functions that are dominated by rare events. We show that importance sampling reduces the asymptotic variance of the solution to a learning problem, suggesting benefits for generalization. We study our algorithm in the context of learning dynamical transition pathways between two states of a system, a problem with applications in statistical physics and implications in machine learning theory. Our numerical experiments demonstrate that we can successfully learn even with the compounding difficulties of high-dimension and rare data.

Data Analysis, Statistics and Probability; Statistical Mechanics; Machine Learning

Instanton based importance sampling for rare events in stochastic PDEs

Lasse Ebener, Georgios Margazoglou, Jan Friedrich (2018)

We present a new method for sampling rare and large fluctuations in a non-equilibrium system governed by a stochastic partial differential equation (SPDE) with additive forcing. To this end, we deploy the so-called instanton formalism that corresponds to a saddle-point approximation of the action in the path integral formulation of the underlying SPDE. The crucial step in our approach is the formulation of an alternative SPDE that incorporates knowledge of the instanton solution such that we are able to constrain the dynamical evolutions around extreme flow configurations only. Finally, a reweighting procedure based on the Girsanov theorem is applied to recover the full distribution function of the original system. The entire procedure is demonstrated on the example of the one-dimensional Burgers equation. Furthermore, we compare our method to conventional direct numerical simulations as well as to Hybrid Monte Carlo methods. It will be shown that the instanton-based sampling method outperforms both approaches and allows for an accurate quantification of the whole probability density function of velocity gradients from the core to the very far tails.

Computational Physics; Chaotic Dynamics; Fluid Dynamics

Classifier and Exemplar Synthesis for Zero-Shot Learning

Soravit Changpinyo, Wei-Lun Chao, Boqing Gong (2018)

Zero-shot learning (ZSL) enables solving a task without the need to see its examples. In this paper, we propose two ZSL frameworks that learn to synthesize parameters for novel unseen classes. First, we propose to cast the problem of ZSL as learning manifold embeddings from graphs composed of object classes, leading to a flexible approach that synthesizes classifiers for the unseen classes. Then, we define an auxiliary task of synthesizing exemplars for the unseen classes to be used as an automatic denoising mechanism for any existing ZSL approaches or as an effective ZSL model by itself. On five visual recognition benchmark datasets, we demonstrate the superior performances of our proposed frameworks in various scenarios of both conventional and generalized ZSL. Finally, we provide valuable insights through a series of empirical analyses, among which are a comparison of semantic representations on the full ImageNet benchmark as well as a comparison of metrics used in generalized ZSL. Our code and data are publicly available at https://github.com/pujols/Zero-shot-learning-journal

Computer Vision and Pattern Recognition

Transductive Maximum Margin Classifier for Few-Shot Learning

Fei Pan, Chunlei Xu, Jie Guo (2021)

Few-shot learning aims to train a classifier that can generalize well when just a small number of labeled samples per class are given. We introduce Transductive Maximum Margin Classifier (TMMC) for few-shot learning. The basic idea of the classical maximum margin classifier is to solve for an optimal prediction function whose separating hyperplane correctly divides the training data while giving the resulting classifier the largest geometric margin. In few-shot learning scenarios, the training samples are scarce, not enough to find a separating hyperplane with good generalization ability on unseen data. TMMC is constructed using a mixture of the labeled support set and the unlabeled query set in a given task. The unlabeled samples in the query set can adjust the separating hyperplane so that the prediction function is optimal on both the labeled and unlabeled samples. Furthermore, we leverage an efficient and effective quasi-Newton algorithm, the L-BFGS method, to optimize TMMC. Experimental results on three standard few-shot learning benchmarks including miniImagenet, tieredImagenet and CUB suggest that our TMMC achieves state-of-the-art accuracies.

Computer Vision and Pattern Recognition

Estimating standard errors for importance sampling estimators with multiple Markov chains

Vivekananda Roy, Aixin Tan (2015)

The naive importance sampling estimator, based on samples from a single importance density, can be numerically unstable. Instead, we consider generalized importance sampling estimators where samples from more than one probability distribution are combined. We study this problem in the Markov chain Monte Carlo context, where independent samples are replaced with Markov chain samples. If the chains converge to their respective target distributions at a polynomial rate, then under two finite moment conditions, we show a central limit theorem holds for the generalized estimators. Further, we develop an easy to implement method to calculate valid asymptotic standard errors based on batch means. We also provide a batch means estimator for calculating asymptotically valid standard errors of the Geyer (1994) reverse logistic estimator. We illustrate the method using a Bayesian variable selection procedure in linear regression. In particular, the generalized importance sampling estimator is used to perform empirical Bayes variable selection and the batch means estimator is used to obtain standard errors in a high-dimensional setting where current methods are not applicable.

Statistics Theory; Computation; Methodology