Secure Approximation Guarantee for Cryptographically Private Empirical Risk Minimization

135 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Ichiro Takeuchi Prof.

تاريخ النشر 2016

مجال البحث الاحصاء الرياضي الهندسة المعلوماتية

والبحث باللغة English

تأليف Toshiyuki Takada - Hiroyuki Hanada - Yoshiji Yamada

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Privacy concern has been increasingly important in many machine learning (ML) problems. We study empirical risk minimization (ERM) problems under secure multi-party computation (MPC) frameworks. Main technical tools for MPC have been developed based on cryptography. One of limitations in current cryptographically private ML is that it is computationally intractable to evaluate non-linear functions such as logarithmic functions or exponential functions. Therefore, for a class of ERM problems such as logistic regression in which non-linear function evaluations are required, one can only obtain approximate solutions. In this paper, we introduce a novel cryptographically private tool called secure approximation guarantee (SAG) method. The key property of SAG method is that, given an arbitrary approximate solution, it can provide a non-probabilistic assumption-free bound on the approximation quality under cryptographically secure computation framework. We demonstrate the benefit of the SAG method by applying it to several problems including a practical privacy-preserving data analysis task on genomic and clinical information.

قيم البحث

394 - Kamalika Chaudhuri , Claire Monteleoni , Anand D. Sarwate 2009

Privacy-preserving machine learning algorithms are crucial for the increasingly common setting in which personal data, such as medical or financial records, are analyzed. We provide general techniques to produce privacy-preserving approximations of c lassifiers learned via (regularized) empirical risk minimization (ERM). These algorithms are private under the $epsilon$-differential privacy definition due to Dwork et al. (2006). First we apply the output perturbation ideas of Dwork et al. (2006), to ERM classification. Then we propose a new method, objective perturbation, for privacy-preserving machine learning algorithm design. This method entails perturbing the objective function before optimizing over classifiers. If the loss and regularizer satisfy certain convexity and differentiability criteria, we prove theoretical results showing that our algorithms preserve privacy, and provide generalization bounds for linear and nonlinear kernels. We further present a privacy-preserving technique for tuning the parameters in general machine learning algorithms, thereby providing end-to-end privacy guarantees for the training process. We apply these results to produce privacy-preserving analogues of regularized logistic regression and support vector machines. We obtain encouraging results from evaluating their performance on real demographic and benchmark data sets. Our results show that both theoretically and empirically, objective perturbation is superior to the previous state-of-the-art, output perturbation, in managing the inherent tradeoff between privacy and learning performance.

التعلم الآلي الذكاء الاصطناعي التشفير والأمن

Private Non-smooth Empirical Risk Minimization and Stochastic Convex Optimization in Subquadratic Steps

163 - Janardhan Kulkarni , Yin Tat Lee , Daogao Liu 2021

We study the differentially private Empirical Risk Minimization (ERM) and Stochastic Convex Optimization (SCO) problems for non-smooth convex functions. We get a (nearly) optimal bound on the excess empirical risk and excess population loss with subq uadratic gradient complexity. More precisely, our differentially private algorithm requires $O(frac{N^{3/2}}{d^{1/8}}+ frac{N^2}{d})$ gradient queries for optimal excess empirical risk, which is achieved with the help of subsampling and smoothing the function via convolution. This is the first subquadratic algorithm for the non-smooth case when $d$ is super constant. As a direct application, using the iterative localization approach of Feldman et al. cite{fkt20}, we achieve the optimal excess population loss for stochastic convex optimization problem, with $O(min{N^{5/4}d^{1/8},frac{ N^{3/2}}{d^{1/8}}})$ gradient queries. Our work makes progress towards resolving a question raised by Bassily et al. cite{bfgt20}, giving first algorithms for private ERM and SCO with subquadratic steps. We note that independently Asi et al. cite{afkt21} gave other algorithms for private ERM and SCO with subquadratic steps.

التعلم الآلي التشفير والأمن بنى وهياكل البيانات والخوارزميات

Invariant Risk Minimization

102 - Martin Arjovsky , Leon Bottou , Ishaan Gulrajani 2019

We introduce Invariant Risk Minimization (IRM), a learning paradigm to estimate invariant correlations across multiple training distributions. To achieve this goal, IRM learns a data representation such that the optimal classifier, on top of that dat a representation, matches for all training distributions. Through theory and experiments, we show how the invariances learned by IRM relate to the causal structures governing the data and enable out-of-distribution generalization.

التعلم الالي الذكاء الاصطناعي التعلم الآلي

Ordered SGD: A New Stochastic Optimization Framework for Empirical Risk Minimization

133 - Kenji Kawaguchi , Haihao Lu 2019

We propose a new stochastic optimization framework for empirical risk minimization problems such as those that arise in machine learning. The traditional approaches, such as (mini-batch) stochastic gradient descent (SGD), utilize an unbiased gradient estimator of the empirical average loss. In contrast, we develop a computationally efficient method to construct a gradient estimator that is purposely biased toward those observations with higher current losses. On the theory side, we show that the proposed method minimizes a new ordered modification of the empirical average loss, and is guaranteed to converge at a sublinear rate to a global optimum for convex loss and to a critical point for weakly convex (non-convex) loss. Furthermore, we prove a new generalization bound for the proposed algorithm. On the empirical side, the numerical experiments show that our proposed method consistently improves the test errors compared with the standard mini-batch SGD in various models including SVM, logistic regression, and deep learning problems.

التعلم الالي التعلم الآلي التحسين والتحكم

Differentially Private Representation for NLP: Formal Guarantee and An Empirical Study on Privacy and Fairness

136 - Lingjuan Lyu , Xuanli He , Yitong Li 2020

It has been demonstrated that hidden representation learned by a deep model can encode private information of the input, hence can be exploited to recover such information with reasonable accuracy. To address this issue, we propose a novel approach c alled Differentially Private Neural Representation (DPNR) to preserve the privacy of the extracted representation from text. DPNR utilises Differential Privacy (DP) to provide a formal privacy guarantee. Further, we show that masking words via dropout can further enhance privacy. To maintain utility of the learned representation, we integrate DP-noisy representation into a robust training process to derive a robust target model, which also helps for model fairness over various demographic variables. Experimental results on benchmark datasets under various parameter settings demonstrate that DPNR largely reduces privacy leakage without significantly sacrificing the main task performance.

التعلم الآلي التشفير والأمن التعلم الالي