Private Learning of Halfspaces: Simplifying the Construction and Reducing the Sample Complexity

Added by Eliad Tsfadia

Publication date 2020

Fields: Informatics Engineering

Research language: English

Authors Haim Kaplan - Yishay Mansour - Uri Stemmer

Machine Learning Cryptography and Security Data Structures and Algorithms

We present a differentially private learner for halfspaces over a finite grid $G$ in $\mathbb{R}^d$ with sample complexity $\approx d^{2.5}\cdot 2^{\log^*|G|}$, which improves the state-of-the-art result of [Beimel et al., COLT 2019] by a $d^2$ factor. The building block for our learner is a new differentially private algorithm for approximately solving the linear feasibility problem: Given a feasible collection of $m$ linear constraints of the form $Ax \geq b$, the task is to privately identify a solution $x$ that satisfies most of the constraints. Our algorithm is iterative, where each iteration determines the next coordinate of the constructed solution $x$.
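To make two quantities in the abstract concrete, here is a minimal, non-private sketch. It is not the paper's algorithm: it only evaluates the iterated logarithm $\log^*$ that appears in the sample complexity bound, and measures what fraction of constraints $Ax \geq b$ a candidate $x$ satisfies. The function names `log_star` and `satisfied_fraction`, the base-2 convention for $\log^*$, and the toy instance are assumptions made for illustration.

```python
import math


def log_star(n: float) -> int:
    """Iterated logarithm (base 2, an assumed convention): the number of
    times log2 must be applied to n before the value is at most 1."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count


def satisfied_fraction(A, b, x):
    """Fraction of the constraints A[i] . x >= b[i] that x satisfies."""
    satisfied = 0
    for row, bound in zip(A, b):
        if sum(a * xi for a, xi in zip(row, x)) >= bound:
            satisfied += 1
    return satisfied / len(b)


# Toy instance in R^2 with four constraints (illustrative values only).
A = [[1, 0], [0, 1], [-1, 0], [1, 1]]
b = [0, 0, -2, 3]
x = [1, 1]

print(log_star(65536))              # log* grows extremely slowly with |G|
print(satisfied_fraction(A, b, x))  # x violates only the last constraint
```

Because $\log^*|G|$ is at most 5 for any grid size that arises in practice, the factor $2^{\log^*|G|}$ behaves almost like a constant, so the dependence on $d$ dominates the bound.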

The Complexity of Adversarially Robust Proper Learning of Halfspaces with Agnostic Noise

78 - Ilias Diakonikolas, Daniel M. Kane, Pasin Manurangsi 2020

We study the computational complexity of adversarially robust proper learning of halfspaces in the distribution-independent agnostic PAC model, with a focus on $L_p$ perturbations. We give a computationally efficient learning algorithm and a nearly matching computational hardness result for this problem. An interesting implication of our findings is that the $L_{\infty}$ perturbations case is provably computationally harder than the case $2 \leq p < \infty$.

Machine Learning Computational Complexity Data Structures and Algorithms

On the Sample Complexity of Reinforcement Learning with Policy Space Generalization

235 - Wenlong Mou, Zheng Wen, Xi Chen 2020

We study the optimal sample complexity in large-scale Reinforcement Learning (RL) problems with policy space generalization, i.e., the agent has prior knowledge that the optimal policy lies in a known policy space. Existing results show that without a generalization model, the sample complexity of an RL algorithm will inevitably depend on the cardinalities of the state space and action space, which are intractably large in many practical problems. To avoid such undesirable dependence on the state and action space sizes, this paper proposes a new notion of eluder dimension for the policy space, which characterizes the intrinsic complexity of policy learning in an arbitrary Markov Decision Process (MDP). Using a simulator oracle, we prove a near-optimal sample complexity upper bound that only depends linearly on the eluder dimension. We further prove a similar regret bound in deterministic systems without the simulator.

Machine Learning Artificial Intelligence Data Structures and Algorithms

Differentially Private Weighted Sampling

104 - Edith Cohen, Ofir Geri, Tamas Sarlos 2020

Common datasets have the form of elements with keys (e.g., transactions and products) and the goal is to perform analytics on the aggregated form of key and frequency pairs. A weighted sample of keys by (a function of) frequency is a highly versatile summary that provides a sparse set of representative keys and supports approximate evaluations of query statistics. We propose private weighted sampling (PWS): A method that ensures element-level differential privacy while retaining, to the extent possible, the utility of a respective non-private weighted sample. PWS maximizes the reporting probabilities of keys and estimation quality of a broad family of statistics. PWS improves over the state of the art also for the well-studied special case of private histograms, when no sampling is performed. We empirically demonstrate significant performance gains compared with prior baselines: 20%-300% increase in key reporting for common Zipfian frequency distributions and accuracy for $\times 2$-$8$ lower frequencies in estimation tasks. Moreover, PWS is applied as a simple post-processing of a non-private sample, without requiring the original data. This allows for seamless integration with existing implementations of non-private schemes and retaining the efficiency of schemes designed for resource-constrained settings such as massive distributed or streamed data. We believe that due to practicality and performance, PWS may become a method of choice in applications where privacy is desired.

Machine Learning Cryptography and Security Data Structures and Algorithms

Differentially Private Correlation Clustering

357 - Mark Bun, Marek Eliáš, Janardhan Kulkarni 2021

Correlation clustering is a widely used technique in unsupervised machine learning. Motivated by applications where individual privacy is a concern, we initiate the study of differentially private correlation clustering. We propose an algorithm that achieves subquadratic additive error compared to the optimal cost. In contrast, straightforward adaptations of existing non-private algorithms all lead to a trivial quadratic error. Finally, we give a lower bound showing that any pure differentially private algorithm for correlation clustering requires additive error of $\Omega(n)$.

Machine Learning Cryptography and Security Data Structures and Algorithms
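
The "cost" in the correlation clustering entry above can be illustrated with a small, non-private sketch. It assumes the standard disagreement objective (a similar pair split across clusters, or a dissimilar pair placed together, each costs 1), which the abstract does not spell out. The function name `disagreement_cost` and the three-vertex instance are hypothetical.

```python
def disagreement_cost(signs, clusters):
    """Count disagreements of a clustering with the pair labels.

    signs: dict mapping a pair (u, v) to +1 (similar) or -1 (dissimilar).
    clusters: dict mapping each vertex to its cluster id.
    """
    cost = 0
    for (u, v), sign in signs.items():
        same_cluster = clusters[u] == clusters[v]
        # A similar pair that is separated, or a dissimilar pair that is
        # merged, counts as one disagreement.
        if (sign > 0 and not same_cluster) or (sign < 0 and same_cluster):
            cost += 1
    return cost


# Toy instance (illustrative only): a~b and b~c are similar, a and c are not.
signs = {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): -1}

one_cluster = {"a": 0, "b": 0, "c": 0}
all_separate = {"a": 0, "b": 1, "c": 2}

print(disagreement_cost(signs, one_cluster))   # only the a-c pair disagrees
print(disagreement_cost(signs, all_separate))  # both similar pairs are cut
```

With $n$ vertices there are at most $\binom{n}{2}$ labeled pairs, so no clustering can cost more than a quadratic number of disagreements. This is why an additive error that is quadratic in $n$ is described as trivial.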

Sample-Optimal PAC Learning of Halfspaces with Malicious Noise

83 - Jie Shen 2021

We study efficient PAC learning of homogeneous halfspaces in $\mathbb{R}^d$ in the presence of the malicious noise of Valiant (1985). This is a challenging noise model, and only recently has a near-optimal noise tolerance bound been established under the mild condition that the unlabeled data distribution is isotropic log-concave. However, it remains unsettled how to obtain the optimal sample complexity simultaneously. In this work, we present a new analysis for the algorithm of Awasthi et al. (2017) and show that it essentially achieves the near-optimal sample complexity bound of $\tilde{O}(d)$, improving the best known result of $\tilde{O}(d^2)$. Our main ingredient is a novel incorporation of a matrix Chernoff-type inequality to bound the spectrum of an empirical covariance matrix for well-behaved distributions, in conjunction with a careful exploration of the localization schemes of Awasthi et al. (2017). We further extend the algorithm and analysis to the more general and stronger nasty noise model of Bshouty et al. (2002), showing that it is still possible to achieve near-optimal noise tolerance and sample complexity in polynomial time.

Machine Learning Data Structures and Algorithms
