Tag N Train: A Technique to Train Improved Classifiers on Unlabeled Data

Publication date: 2020

Authors: Oz Amram, Cristina Mantilla Suarez

Fields: High Energy Physics - Phenomenology; High Energy Physics - Experiment

Language: English

There has been substantial progress in applying machine learning techniques to classification problems in collider and jet physics. But as these techniques grow in sophistication, they are becoming more sensitive to subtle features of jets that may not be well modeled in simulation. Therefore, relying on simulations for training will lead to sub-optimal performance in data, but the lack of true class labels makes it difficult to train on real data. To address this challenge we introduce a new approach, called Tag N Train (TNT), that can be applied to unlabeled data that has two distinct sub-objects. The technique uses a weak classifier for one of the objects to tag signal-rich and background-rich samples. These samples are then used to train a stronger classifier for the other object. We demonstrate the power of this method by applying it to a dijet resonance search. By starting with autoencoders trained directly on data as the weak classifiers, we use TNT to train substantially improved classifiers. We show that Tag N Train can be a powerful tool in model-agnostic searches and discuss other potential applications.
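The tagging-then-training idea in the abstract can be illustrated with a minimal toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the synthetic Gaussian "jets", the single-feature score standing in for the autoencoder-based weak classifier, the quantile cuts, and the small logistic regression used as the stronger classifier. True labels are used only to evaluate the result, which is possible here because the data are simulated.

```python
import numpy as np

def make_events(rng, n=20000, signal_frac=0.1, shift=1.5):
    """Toy events with two sub-objects (jets), each described by 2 features.
    Hypothetical setup: signal shifts every feature of both jets by `shift`."""
    is_signal = rng.random(n) < signal_frac
    jet1 = rng.normal(size=(n, 2)) + shift * is_signal[:, None]
    jet2 = rng.normal(size=(n, 2)) + shift * is_signal[:, None]
    return jet1, jet2, is_signal

def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) statistic."""
    ranks = np.argsort(np.argsort(scores)) + 1
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def fit_logistic(x, y, lr=0.5, steps=500):
    """Plain gradient-descent logistic regression (the 'stronger' classifier here)."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
        grad = p - y
        w -= lr * (x.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

rng = np.random.default_rng(0)
jet1, jet2, _ = make_events(rng)                     # "data": labels are not used for training
jet1_test, jet2_test, truth_test = make_events(rng)  # labels used for evaluation only

# Tag: a weak classifier on jet 1 (assumed stand-in: its first feature alone).
weak_score_j1 = jet1[:, 0]
sig_rich = weak_score_j1 > np.quantile(weak_score_j1, 0.90)
bkg_rich = weak_score_j1 < np.quantile(weak_score_j1, 0.50)

# Train: fit a classifier on jet 2 to separate the two tagged samples.
x_train = np.vstack([jet2[sig_rich], jet2[bkg_rich]])
y_train = np.concatenate([np.ones(sig_rich.sum()), np.zeros(bkg_rich.sum())])
w, b = fit_logistic(x_train, y_train)

# Compare the weak single-feature score with the trained classifier on jet 2.
auc_weak = auc(jet2_test[:, 0], truth_test)
auc_tnt = auc(jet2_test @ w + b, truth_test)
print(f"weak AUC on jet 2: {auc_weak:.3f}, tagged-and-trained AUC on jet 2: {auc_tnt:.3f}")
```

The sketch relies on the two jets being independent given the event class, so that labels derived from jet 1 carry no direct information about the features of jet 2. It covers a single tagging direction only.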

CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features

Sangdoo Yun, Dongyoon Han, Seong Joon Oh (2019)

Regional dropout strategies have been proposed to enhance the performance of convolutional neural network classifiers. They have proved to be effective for guiding the model to attend to less discriminative parts of objects (e.g. leg as opposed to head of a person), thereby letting the network generalize better and have better object localization capabilities. On the other hand, current methods for regional dropout remove informative pixels on training images by overlaying a patch of either black pixels or random noise. Such removal is not desirable because it leads to information loss and inefficiency during training. We therefore propose the CutMix augmentation strategy: patches are cut and pasted among training images where the ground truth labels are also mixed proportionally to the area of the patches. By making efficient use of training pixels and retaining the regularization effect of regional dropout, CutMix consistently outperforms the state-of-the-art augmentation strategies on CIFAR and ImageNet classification tasks, as well as on the ImageNet weakly-supervised localization task. Moreover, unlike previous augmentation methods, our CutMix-trained ImageNet classifier, when used as a pretrained model, results in consistent performance gains in Pascal detection and MS-COCO image captioning benchmarks. We also show that CutMix improves the model robustness against input corruptions and its out-of-distribution detection performance. Source code and pretrained models are available at https://github.com/clovaai/CutMix-PyTorch.
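The cut-and-paste step with area-proportional label mixing can be sketched in a few lines of NumPy. This is an illustrative sketch and not the code from the linked repository: the function name `cutmix`, the Beta-distributed mixing ratio, and the box-sampling rule are assumptions modeled on the commonly described formulation, and labels are assumed to be one-hot vectors.

```python
import numpy as np

def cutmix(img_a, label_a, img_b, label_b, rng, alpha=1.0):
    """Paste a random rectangular patch of img_b into img_a and mix the
    one-hot labels in proportion to the pasted area (hypothetical helper)."""
    h, w = img_a.shape[:2]
    lam = rng.beta(alpha, alpha)              # assumed: target share kept from img_a
    cut_h = int(h * np.sqrt(1.0 - lam))       # patch size chosen so area is about (1 - lam)
    cut_w = int(w * np.sqrt(1.0 - lam))
    cy, cx = rng.integers(h), rng.integers(w)  # patch centre, uniform over the image
    y1, y2 = np.clip([cy - cut_h // 2, cy + cut_h // 2], 0, h)
    x1, x2 = np.clip([cx - cut_w // 2, cx + cut_w // 2], 0, w)

    mixed = img_a.copy()
    mixed[y1:y2, x1:x2] = img_b[y1:y2, x1:x2]

    # Recompute the ratio from the clipped box so labels match the pixels exactly.
    patch_frac = (y2 - y1) * (x2 - x1) / (h * w)
    mixed_label = (1.0 - patch_frac) * label_a + patch_frac * label_b
    return mixed, mixed_label, patch_frac

rng = np.random.default_rng(0)
img_a = np.zeros((32, 32, 3))                 # all-zero image, class 0
img_b = np.ones((32, 32, 3))                  # all-one image, class 1
label_a = np.array([1.0, 0.0])
label_b = np.array([0.0, 1.0])
mixed, mixed_label, patch_frac = cutmix(img_a, label_a, img_b, label_b, rng)
print(mixed_label, patch_frac)
```

Because the box is clipped at the image border, the pasted area can be smaller than the sampled ratio suggests, which is why the label weights are recomputed from the actual box.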

Computer Vision and Pattern Recognition Machine Learning

Train on Validation: Squeezing the Data Lemon

Guy Tennenholtz, Tom Zahavy, Shie Mannor (2018)

Model selection on validation data is an essential step in machine learning. While the mixing of data between training and validation is considered taboo, practitioners often violate it to increase performance. Here, we offer a simple, practical method for using the validation set for training, which allows for a continuous, controlled trade-off between performance and overfitting of model selection. We define an on-average-validation-stable algorithm as one in which using small portions of validation data for training does not overfit the model selection process. We then prove that stable algorithms are also validation stable. Finally, we demonstrate our method on the MNIST and CIFAR-10 datasets using stable algorithms as well as state-of-the-art neural networks. Our results show a significant increase in test performance with a minor trade-off in bias admitted to the model selection process.

Machine Learning

Optimizing generalization on the train set: a novel gradient-based framework to train parameters and hyperparameters simultaneously

Karim Lounici, Katia Meziani, Benjamin Riu (2020)

Generalization is a central problem in Machine Learning. Most prediction methods require careful calibration of hyperparameters carried out on a hold-out validation dataset to achieve generalization. The main goal of this paper is to present a novel approach based on a new measure of risk that allows us to develop novel fully automatic procedures for generalization. We illustrate the pertinence of this new framework in the regression problem. The main advantages of this new approach are: (i) it can simultaneously train the model and perform regularization in a single run of a gradient-based optimizer on all available data without any previous hyperparameter tuning; (ii) this framework can tackle several additional objectives simultaneously (correlation, sparsity, ...) via the introduction of regularization parameters. Notably, our approach transforms hyperparameter tuning as well as feature selection (a combinatorial discrete optimization problem) into a continuous optimization problem that is solvable via classical gradient-based methods; (iii) the computational complexity of our methods is $O(npK)$, where $n$, $p$, $K$ denote respectively the number of observations, features and iterations of the gradient descent algorithm. We observe in our experiments a significantly smaller runtime for our methods as compared to benchmark methods for equivalent prediction score. Our procedures are implemented in PyTorch (code is available for replication).

Machine Learning

Kriging in Tensor Train data format

Sergey Dolgov, Alexander Litvinenko, Dishi Liu (2019)

The combination of low-rank tensor techniques and Fast Fourier transform (FFT) based methods has turned out to be prominent in accelerating various statistical operations such as Kriging, computing conditional covariance, geostatistical optimal design, and others. However, the approximation of a full tensor by its low-rank format can be computationally formidable. In this work, we incorporate the robust Tensor Train (TT) approximation of covariance matrices and the efficient TT-Cross algorithm into the FFT-based Kriging. It is shown that here the computational complexity of Kriging is reduced to $\mathcal{O}(d r^3 n)$, where $n$ is the mode size of the estimation grid, $d$ is the number of variables (the dimension), and $r$ is the rank of the TT approximation of the covariance matrix. For many popular covariance functions the TT rank $r$ remains stable for increasing $n$ and $d$. The advantages of this approach against those using plain FFT are demonstrated in synthetic and real data examples.

Computation Numerical Analysis Methodology

Quantum Walk to Train a Classical Artificial Neural Network

Luciano S. de Souza, Jonathan H. A. de Carvalho, Tiago A. E. Ferreira (2021)

This work proposes a computational procedure that uses a quantum walk in a complete graph to train classical artificial neural networks. The idea is to apply the quantum walk to search the weight set values. However, it is necessary to simulate a quantum machine to execute the quantum walk. Therefore, to minimize the computational cost, the methodology employed to train the neural network adjusts the synaptic weights of the output layer without altering the weights of the hidden layer, inspired by the Extreme Learning Machine method. The quantum walk algorithm as a search algorithm is quadratically faster than its classical analog. The quantum walk variance is $O(t)$ while the variance of its classical analog is $O(\sqrt{t})$, where $t$ is the time or iteration. In addition to the computational gain, another advantage of the proposed procedure is that the number of iterations required to obtain the solutions can be known a priori, unlike with classical training algorithms based on gradient descent.

Quantum Physics
