ﻻ يوجد ملخص باللغة العربية
With the widespread use of machine learning for classification, it becomes increasingly important to be able to use weaker kinds of supervision for tasks in which it is hard to obtain standard labeled data. One such kind of supervision is provided pairwise---in the form of Similar (S) pairs (if two examples belong to the same class) and Dissimilar (D) pairs (if two examples belong to different classes). This kind of supervision is realistic in privacy-sensitive domains. Although this problem has been looked at recently, it is unclear how to learn from such supervision under label noise, which is very common when the supervision is crowd-sourced. In this paper, we close this gap and demonstrate how to learn a classifier from noisy S and D labeled data. We perform a detailed investigation of this problem under two realistic noise models and propose two algorithms to learn from noisy S-D data. We also show important connections between learning from such pairwise supervision data and learning from ordinary class-labeled data. Finally, we perform experiments on synthetic and real world datasets and show our noise-informed algorithms outperform noise-blind baselines in learning from noisy pairwise data.
Active Learning is essential for more label-efficient deep learning. Bayesian Active Learning has focused on BALD, which reduces model parameter uncertainty. However, we show that BALD gets stuck on out-of-distribution or junk data that is not releva
Interactive learning is a process in which a machine learning algorithm is provided with meaningful, well-chosen examples as opposed to randomly chosen examples typical in standard supervised learning. In this paper, we propose a new method for inter
Inherent risk scoring is an important function in anti-money laundering, used for determining the riskiness of an individual during onboarding $textit{before}$ fraudulent transactions occur. It is, however, often fraught with two challenges: (1) inco
Reinforcement learning has achieved great success in various applications. To learn an effective policy for the agent, it usually requires a huge amount of data by interacting with the environment, which could be computational costly and time consumi
A similarity label indicates whether two instances belong to the same class while a class label shows the class of the instance. Without class labels, a multi-class classifier could be learned from similarity-labeled pairwise data by meta classificat