أوراق بحثية, رسائل ماجستير ودكتوراه منشورة من قبل Joshua Robinson

Can contrastive learning avoid shortcut solutions?

109 - Joshua Robinson , Li Sun , Ke Yu 2021

The generalization of representations learned via contrastive learning depends crucially on what features of the data are extracted. However, we observe that the contrastive loss does not always sufficiently guide which features are extracted, a beha vior that can negatively impact the performance on downstream tasks via shortcuts, i.e., by inadvertently suppressing important predictive features. We find that feature extraction is influenced by the difficulty of the so-called instance discrimination task (i.e., the task of discriminating pairs of similar points from pairs of dissimilar ones). Although harder pairs improve the representation of some features, the improvement comes at the cost of suppressing previously well represented features. In response, we propose implicit feature modification (IFM), a method for altering positive and negative samples in order to guide contrastive models towards capturing a wider variety of predictive features. Empirically, we observe that IFM reduces feature suppression, and as a result improves performance on vision and medical imaging tasks. The code is available at: url{https://github.com/joshr17/IFM}.

التعلم الآلي

Contrastive Learning with Hard Negative Samples

121 - Joshua Robinson , Ching-Yao Chuang , Suvrit Sra 2020

How can you sample good negative examples for contrastive learning? We argue that, as with metric learning, contrastive learning of representations benefits from hard negative samples (i.e., points that are difficult to distinguish from an anchor poi nt). The key challenge toward using hard negatives is that contrastive methods must remain unsupervised, making it infeasible to adopt existing negative sampling strategies that use true similarity information. In response, we develop a new family of unsupervised sampling methods for selecting hard negative samples where the user can control the hardness. A limiting case of this sampling results in a representation that tightly clusters each class, and pushes different classes as far apart as possible. The proposed method improves downstream performance across multiple modalities, requires only few additional lines of code to implement, and introduces no computational overhead.

التعلم الآلي التعلم الالي

Debiased Contrastive Learning

113 - Ching-Yao Chuang , Joshua Robinson , Lin Yen-Chen 2020

A prominent technique for self-supervised representation learning has been to contrast semantically similar and dissimilar pairs of samples. Without access to labels, dissimilar (negative) points are typically taken to be randomly sampled datapoints, implicitly accepting that these points may, in reality, actually have the same label. Perhaps unsurprisingly, we observe that sampling negative examples from truly different labels improves performance, in a synthetic setting where labels are available. Motivated by this observation, we develop a debiased contrastive objective that corrects for the sampling of same-label datapoints, even without knowledge of the true labels. Empirically, the proposed objective consistently outperforms the state-of-the-art for representation learning in vision, language, and reinforcement learning benchmarks. Theoretically, we establish generalization bounds for the downstream classification task.

التعلم الآلي التعلم الالي

Flexible Modeling of Diversity with Strongly Log-Concave Distributions

30 - Joshua Robinson , Suvrit Sra , Stefanie Jegelka 2019

Strongly log-concave (SLC) distributions are a rich class of discrete probability distributions over subsets of some ground set. They are strictly more general than strongly Rayleigh (SR) distributions such as the well-known determinantal point proce ss. While SR distributions offer elegant models of diversity, they lack an easy control over how they express diversity. We propose SLC as the right extension of SR that enables easier, more intuitive control over diversity, illustrating this via examples of practical importance. We develop two fundamental tools needed to apply SLC distributions to learning and inference: sampling and mode finding. For sampling we develop an MCMC sampler and give theoretical mixing time bounds. For mode finding, we establish a weak log-submodularity property for SLC functions and derive optimization guarantees for a distorted greedy algorithm.

التعلم الآلي التعلم الالي

Nucleation of Graphene on SiC(0001)

70 - Joshua Robinson , Kathleen Trumbull , Randall Cavalero 2009

This paper has been withdrawn due to the adherance to the double submission policies of a refereed journal. Our apologies.

علم المواد

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد