Minimizing the expected value of the asymmetric loss and an inequality of the variance of the loss

Added by Naoya Yamaguchi

Publication date 2019

Research language: English

Authors: Naoya Yamaguchi, Yuka Yamaguchi

Some estimation and prediction tasks require solving minimization problems with asymmetric loss functions. The usual approach is to estimate regression coefficients for these problems. In this paper, we do not perform such an estimation; instead, we give a solution that corrects any given prediction, under the assumption that the prediction error follows a general normal distribution. Our method not only minimizes the expected value of the asymmetric loss but also lowers the variance of the loss.
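The abstract does not spell out the loss or the correction formula, so the following is only a minimal sketch of the general idea under simplifying assumptions that are ours, not the authors': the prediction error e = y - ŷ is taken to follow an ordinary normal distribution N(mu, sigma²), and the loss is piecewise linear, charging k1 per unit of under-prediction and k2 per unit of over-prediction. Under those assumptions, the constant c added to every prediction that minimizes the expected loss satisfies P(e <= c) = k1 / (k1 + k2). The function and parameter names are hypothetical.

```python
from statistics import NormalDist


def optimal_correction(mu, sigma, k1, k2):
    """Constant to add to each prediction (illustrative assumption:
    error ~ N(mu, sigma^2), piecewise-linear asymmetric loss).

    The minimizer is the k1 / (k1 + k2) quantile of the error distribution.
    """
    return NormalDist(mu, sigma).inv_cdf(k1 / (k1 + k2))


def expected_loss(c, mu, sigma, k1, k2):
    """Expected loss after shifting predictions by c.

    The residual is z = e - c; the loss is k1 * z when z >= 0
    (under-prediction) and k2 * (-z) when z < 0 (over-prediction).
    """
    std = NormalDist()  # standard normal
    d = (c - mu) / sigma
    # Closed forms for E[max(z, 0)] and E[max(-z, 0)] when z ~ N(mu - c, sigma^2).
    under = sigma * (std.pdf(d) - d * (1.0 - std.cdf(d)))
    over = sigma * (std.pdf(d) + d * std.cdf(d))
    return k1 * under + k2 * over


if __name__ == "__main__":
    # Hypothetical setting: under-prediction is three times as costly.
    mu, sigma, k1, k2 = 0.0, 1.0, 3.0, 1.0
    c_star = optimal_correction(mu, sigma, k1, k2)
    print("correction:", c_star)
    print("expected loss without correction:", expected_loss(0.0, mu, sigma, k1, k2))
    print("expected loss with correction:", expected_loss(c_star, mu, sigma, k1, k2))
```

When k1 equals k2 the correction reduces to mu, which simply removes the bias of the predictions. The sketch covers only the expected-loss part of the abstract's claim; it makes no statement about the variance of the loss.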

Reweighting Augmented Samples by Minimizing the Maximal Expected Loss

Mingyang Yi, Lu Hou, Lifeng Shang, 2021

Data augmentation is an effective technique to improve the generalization of deep neural networks. However, previous data augmentation methods usually treat the augmented samples equally without considering their individual impacts on the model. To address this, for the augmented samples from the same training example, we propose to assign different weights to them. We construct the maximal expected loss which is the supremum over any reweighted loss on augmented samples. Inspired by adversarial training, we minimize this maximal expected loss (MMEL) and obtain a simple and interpretable closed-form solution: more attention should be paid to augmented samples with large loss values (i.e., harder examples). Minimizing this maximal expected loss enables the model to perform well under any reweighting strategy. The proposed method can generally be applied on top of any data augmentation methods. Experiments are conducted on both natural language understanding tasks with token-level data augmentation, and image classification tasks with commonly-used image augmentation techniques like random crop and horizontal flip. Empirical results show that the proposed method improves the generalization performance of the model.

Machine Learning

Predictive density estimation under the Wasserstein loss

Takeru Matsuda, William E. Strawderman, 2019

We investigate predictive density estimation under the $L^2$ Wasserstein loss for location families and location-scale families. We show that plug-in densities form a complete class and that the Bayesian predictive density is given by the plug-in density with the posterior mean of the location and scale parameters. We provide Bayesian predictive densities that dominate the best equivariant one in normal models.

Statistics Theory

A sharp form of the discrete Hardy inequality and the Keller-Pinchover-Pogorzelski inequality

David Krejcirik, Frantisek Stampach, 2021

We give a short proof of a recently established Hardy-type inequality due to Keller, Pinchover, and Pogorzelski together with its optimality. Moreover, we identify the remainder term which makes it into an identity.

Spectral Theory, Classical Analysis and ODEs

Stability of Gibbs Posteriors from the Wasserstein Loss for Bayesian Full Waveform Inversion

Matthew M. Dunlop, Yunan Yang, 2020

Recently, the Wasserstein loss function has been proven to be effective when applied to deterministic full-waveform inversion (FWI) problems. We consider the application of this loss function in Bayesian FWI so that the uncertainty can be captured in the solution. Other loss functions that are commonly used in practice are also considered for comparison. Existence and stability of the resulting Gibbs posteriors are shown on function space under weak assumptions on the prior and model. In particular, the distribution arising from the Wasserstein loss is shown to be quite stable with respect to high-frequency noise in the data. We then illustrate the difference between the resulting distributions numerically, using Laplace approximations to estimate the unknown velocity field and uncertainty associated with the estimates.

Statistics Theory, Numerical Analysis

Partially Supervised Named Entity Recognition via the Expected Entity Ratio Loss

Thomas Effland, Michael Collins, 2021

We study learning named entity recognizers in the presence of missing entity annotations. We approach this setting as tagging with latent variables and propose a novel loss, the Expected Entity Ratio, to learn models in the presence of systematically missing tags. We show that our approach is both theoretically sound and empirically useful. Experimentally, we find that it meets or exceeds performance of strong and state-of-the-art baselines across a variety of languages, annotation scenarios, and amounts of labeled data. In particular, we find that it significantly outperforms the previous state-of-the-art methods from Mayhew et al. (2019) and Li et al. (2021) by +12.7 and +2.3 F1 score in a challenging setting with only 1,000 biased annotations, averaged across 7 datasets. We also show that, when combined with our approach, a novel sparse annotation scheme outperforms exhaustive annotation for modest annotation budgets.

Computation and Language
