k-Mixup Regularization for Deep Learning via Optimal Transport

134 0 0.0 ( 0 )

Download Cite

Added by Kristjan Greenewald

Publication date 2021

fields Informatics Engineering

and research's language is English

Authors Kristjan Greenewald - Anming Gu - Mikhail Yurochkin

Machine Learning

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Mixup is a popular regularization technique for training deep neural networks that can improve generalization and increase adversarial robustness. It perturbs input training data in the direction of other randomly-chosen instances in the training set. To better leverage the structure of the data, we extend mixup to emph{$k$-mixup} by perturbing $k$-batches of training points in the direction of other $k$-batches using displacement interpolation, interpolation under the Wasserstein metric. We demonstrate theoretically and in simulations that $k$-mixup preserves cluster and manifold structures, and we extend theory studying efficacy of standard mixup. Our empirical results show that training with $k$-mixup further improves generalization and robustness on benchmark datasets.

rate research

Integral Probability Metric based Regularization for Optimal Transport

261 - Piyushi Manupriya , J. Saketha Nath (IITn Hyderabad 2020

Regularization in Optimal Transport (OT) problems has been shown to critically affect the associated computational and sample complexities. It also has been observed that regularization effectively helps in handling noisy marginals as well as marginals with unequal masses. However, existing works on OT restrict themselves to $phi$-divergences based regularization. In this work, we propose and analyze Integral Probability Metric (IPM) based regularization in OT problems. While it is expected that the well-established advantages of IPMs are inherited by the IPM-regularized OT variants, we interestingly observe that some useful aspects of $phi$-regularization are preserved. For example, we show that the OT formulation, where the marginal constraints are relaxed using IPM-regularization, also lifts the ground metric to that over (perhaps un-normalized) measures. Infact, the lifted metric turns out to be another IPM whose generating set is the intersection of that of the IPM employed for regularization and the set of 1-Lipschitz functions under the ground metric. Also, in the special case where the regularization is squared maximum mean discrepancy based, the proposed OT variant, as well as the corresponding Barycenter formulation, turn out to be those of minimizing a convex quadratic subject to non-negativity/simplex constraints and hence can be solved efficiently. Simulations confirm that the optimal transport plans/maps obtained with IPM-regularization are intrinsically different from those obtained with $phi$-regularization. Empirical results illustrate the efficacy of the proposed IPM-regularized OT formulation. This draft contains the main paper and the Appendices.

Machine Learning Optimization and Control

It Takes Two to Tango: Mixup for Deep Metric Learning

126 - Shashanka Venkataramanan , Bill Psomas , Yannis Avrithis 2021

Metric learning involves learning a discriminative representation such that embeddings of similar classes are encouraged to be close, while embeddings of dissimilar classes are pushed far apart. State-of-the-art methods focus mostly on sophisticated loss functions or mining strategies. On the one hand, metric learning losses consider two or more examples at a time. On the other hand, modern data augmentation methods for classification consider two or more examples at a time. The combination of the two ideas is under-studied. In this work, we aim to bridge this gap and improve representations using mixup, which is a powerful data augmentation approach interpolating two or more examples and corresponding target labels at a time. This task is challenging because, unlike classification, the loss functions used in metric learning are not additive over examples, so the idea of interpolating target labels is not straightforward. To the best of our knowledge, we are the first to investigate mixing examples and target labels for deep metric learning. We develop a generalized formulation that encompasses existing metric learning loss functions and modify it to accommodate for mixup, introducing Metric Mix, or Metrix. We show that mixing inputs, intermediate representations or embeddings along with target labels significantly improves representations and outperforms state-of-the-art metric learning methods on four benchmark datasets.

Machine Learning Computer Vision and Pattern Recognition

Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport

198 - Kyle Swanson , Lili Yu , Tao Lei 2020

Selecting input features of top relevance has become a popular method for building self-explaining models. In this work, we extend this selective rationalization approach to text matching, where the goal is to jointly select and align text pieces, such as tokens or sentences, as a justification for the downstream prediction. Our approach employs optimal transport (OT) to find a minimal cost alignment between the inputs. However, directly applying OT often produces dense and therefore uninterpretable alignments. To overcome this limitation, we introduce novel constrained variants of the OT problem that result in highly sparse alignments with controllable sparsity. Our model is end-to-end differentiable using the Sinkhorn algorithm for OT and can be trained without any alignment annotations. We evaluate our model on the StackExchange, MultiNews, e-SNLI, and MultiRC datasets. Our model achieves very sparse rationale selections with high fidelity while preserving prediction accuracy compared to strong attention baseline models.

Machine Learning Computation and Language Machine Learning

Learning Cost Functions for Optimal Transport

287 - Shaojun Ma , Haodong Sun , Xiaojing Ye 2020

Inverse optimal transport (OT) refers to the problem of learning the cost function for OT from observed transport plan or its samples. In this paper, we derive an unconstrained convex optimization formulation of the inverse OT problem, which can be further augmented by any customizable regularization. We provide a comprehensive characterization of the properties of inverse OT, including uniqueness of solutions. We also develop two numerical algorithms, one is a fast matrix scaling method based on the Sinkhorn-Knopp algorithm for discrete OT, and the other one is a learning based algorithm that parameterizes the cost function as a deep neural network for continuous OT. The novel framework proposed in the work avoids repeatedly solving a forward OT in each iteration which has been a thorny computational bottleneck for the bi-level optimization in existing inverse OT approaches. Numerical results demonstrate promising efficiency and accuracy advantages of the proposed algorithms over existing state-of-the-art methods.

Machine Learning Machine Learning

Implicit Regularization in Deep Learning

211 - Behnam Neyshabur 2017

In an attempt to better understand generalization in deep learning, we study several possible explanations. We show that implicit regularization induced by the optimization method is playing a key role in generalization and success of deep learning models. Motivated by this view, we study how different complexity measures can ensure generalization and explain how optimization algorithms can implicitly regularize complexity measures. We empirically investigate the ability of these measures to explain different observed phenomena in deep learning. We further study the invariances in neural networks, suggest complexity measures and optimization algorithms that have similar invariances to those in neural networks and evaluate them on a number of learning tasks.

Machine Learning

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

k-Mixup Regularization for Deep Learning via Optimal Transport

Ask ChatGPT about the research

No Arabic abstract

Read More

suggested questions