Do you want to publish a course? Click here

Improved Algorithms for Efficient Active Learning Halfspaces with Massart and Tsybakov noise

123   0   0.0 ( 0 )
 Added by Chicheng Zhang
 Publication date 2021
and research's language is English




Ask ChatGPT about the research

We give a computationally-efficient PAC active learning algorithm for $d$-dimensional homogeneous halfspaces that can tolerate Massart noise (Massart and Nedelec, 2006) and Tsybakov noise (Tsybakov, 2004). Specialized to the $eta$-Massart noise setting, our algorithm achieves an information-theoretically near-optimal label complexity of $tilde{O}left( frac{d}{(1-2eta)^2} mathrm{polylog}(frac1epsilon) right)$ under a wide range of unlabeled data distributions (specifically, the family of structured distributions defined in Diakonikolas et al. (2020)). Under the more challenging Tsybakov noise condition, we identify two subfamilies of noise conditions, under which our efficient algorithm provides label complexity guarantees strictly lower than passive learning algorithms.



rate research

Read More

We analyze the properties of adversarial training for learning adversarially robust halfspaces in the presence of agnostic label noise. Denoting $mathsf{OPT}_{p,r}$ as the best robust classification error achieved by a halfspace that is robust to perturbations of $ell_{p}$ balls of radius $r$, we show that adversarial training on the standard binary cross-entropy loss yields adversarially robust halfspaces up to (robust) classification error $tilde O(sqrt{mathsf{OPT}_{2,r}})$ for $p=2$, and $tilde O(d^{1/4} sqrt{mathsf{OPT}_{infty, r}} + d^{1/2} mathsf{OPT}_{infty,r})$ when $p=infty$. Our results hold for distributions satisfying anti-concentration properties enjoyed by log-concave isotropic distributions among others. We additionally show that if one instead uses a nonconvex sigmoidal loss, adversarial training yields halfspaces with an improved robust classification error of $O(mathsf{OPT}_{2,r})$ for $p=2$, and $O(d^{1/4}mathsf{OPT}_{infty, r})$ when $p=infty$. To the best of our knowledge, this is the first work to show that adversarial training provably yields robust classifiers in the presence of noise.
83 - Jie Shen 2021
We study efficient PAC learning of homogeneous halfspaces in $mathbb{R}^d$ in the presence of malicious noise of Valiant~(1985). This is a challenging noise model and only until recently has near-optimal noise tolerance bound been established under the mild condition that the unlabeled data distribution is isotropic log-concave. However, it remains unsettled how to obtain the optimal sample complexity simultaneously. In this work, we present a new analysis for the algorithm of Awasthi~et~al.~(2017) and show that it essentially achieves the near-optimal sample complexity bound of $tilde{O}(d)$, improving the best known result of $tilde{O}(d^2)$. Our main ingredient is a novel incorporation of a matrix Chernoff-type inequality to bound the spectrum of an empirical covariance matrix for well-behaved distributions, in conjunction with a careful exploration of the localization schemes of Awasthi~et~al.~(2017). We further extend the algorithm and analysis to the more general and stronger nasty noise model of Bshouty~et~al.~(2002), showing that it is still possible to achieve near-optimal noise tolerance and sample complexity in polynomial time.
69 - Jie Shen , Chicheng Zhang 2020
This paper is concerned with computationally efficient learning of homogeneous sparse halfspaces in $mathbb{R}^d$ under noise. Though recent works have established attribute-efficient learning algorithms under various types of label noise (e.g. bounded noise), it remains an open question when and how $s$-sparse halfspaces can be efficiently learned under the challenging malicious noise model, where an adversary may corrupt both the unlabeled examples and the labels. We answer this question in the affirmative by designing a computationally efficient active learning algorithm with near-optimal label complexity of $tilde{O}big({s log^4 frac d epsilon} big)$ and noise tolerance $eta = Omega(epsilon)$, where $epsilon in (0, 1)$ is the target error rate, under the assumption that the distribution over (uncorrupted) unlabeled examples is isotropic log-concave. Our algorithm can be straightforwardly tailored to the passive learning setting, and we show that the sample complexity is $tilde{O}big({frac 1 epsilon s^2 log^5 d} big)$ which also enjoys the attribute efficiency. Our main techniques include attribute-efficient paradigms for instance reweighting and for empirical risk minimization, and a new analysis of uniform concentration for unbounded data -- all of them crucially take the structure of the underlying halfspace into account.
In real-world applications of reinforcement learning (RL), noise from inherent stochasticity of environments is inevitable. However, current policy evaluation algorithms, which plays a key role in many RL algorithms, are either prone to noise or inefficient. To solve this issue, we introduce a novel policy evaluation algorithm, which we call Gap-increasing RetrAce Policy Evaluation (GRAPE). It leverages two recent ideas: (1) gap-increasing value update operators in advantage learning for noise-tolerance and (2) off-policy eligibility trace in Retrace algorithm for efficient learning. We provide detailed theoretical analysis of the new algorithm that shows its efficiency and noise-tolerance inherited from Retrace and advantage learning. Furthermore, our analysis shows that GRAPEs learning is significantly efficient than that of a simple learning-rate-based approach while keeping the same level of noise-tolerance. We applied GRAPE to control problems and obtained experimental results supporting our theoretical analysis.
We study the computational complexity of adversarially robust proper learning of halfspaces in the distribution-independent agnostic PAC model, with a focus on $L_p$ perturbations. We give a computationally efficient learning algorithm and a nearly matching computational hardness result for this problem. An interesting implication of our findings is that the $L_{infty}$ perturbations case is provably computationally harder than the case $2 leq p < infty$.

suggested questions

comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا