Differentially Private Gaussian Processes

108 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Michael Smith

تاريخ النشر 2016

مجال البحث الاحصاء الرياضي الهندسة المعلوماتية

والبحث باللغة English

تأليف Michael Thomas Smith - Max Zwiessele - Neil D. Lawrence

التعلم الالي التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

A major challenge for machine learning is increasing the availability of data while respecting the privacy of individuals. Here we combine the provable privacy guarantees of the differential privacy framework with the flexibility of Gaussian processes (GPs). We propose a method using GPs to provide differentially private (DP) regression. We then improve this method by crafting the DP noise covariance structure to efficiently protect the training data, while minimising the scale of the added noise. We find that this cloaking method achieves the greatest accuracy, while still providing privacy guarantees, and offers practical DP for regression over multi-dimensional inputs. Together these methods provide a starter toolkit for combining differential privacy and GPs.

قيم البحث

اقرأ أيضاً

Differentially Private Dropout

117 - Beyza Ermis , Ali Taylan Cemgil 2017

Large data collections required for the training of neural networks often contain sensitive information such as the medical histories of patients, and the privacy of the training data must be preserved. In this paper, we introduce a dropout technique that provides an elegant Bayesian interpretation to dropout, and show that the intrinsic noise added, with the primary goal of regularization, can be exploited to obtain a degree of differential privacy. The iterative nature of training neural networks presents a challenge for privacy-preserving estimation since multiple iterations increase the amount of noise added. We overcome this by using a relaxed notion of differential privacy, called concentrated differential privacy, which provides tighter estimates on the overall privacy loss. We demonstrate the accuracy of our privacy-preserving dropout algorithm on benchmark datasets.

التعلم الالي التعلم الآلي

Differentially Private Variational Dropout

160 - Beyza Ermis , Ali Taylan Cemgil 2017

Deep neural networks with their large number of parameters are highly flexible learning systems. The high flexibility in such networks brings with some serious problems such as overfitting, and regularization is used to address this problem. A curren tly popular and effective regularization technique for controlling the overfitting is dropout. Often, large data collections required for neural networks contain sensitive information such as the medical histories of patients, and the privacy of the training data should be protected. In this paper, we modify the recently proposed variational dropout technique which provided an elegant Bayesian interpretation to dropout, and show that the intrinsic noise in the variational dropout can be exploited to obtain a degree of differential privacy. The iterative nature of training neural networks presents a challenge for privacy-preserving estimation since multiple iterations increase the amount of noise added. We overcome this by using a relaxed notion of differential privacy, called concentrated differential privacy, which provides tighter estimates on the overall privacy loss. We demonstrate the accuracy of our privacy-preserving variational dropout algorithm on benchmark datasets.

التعلم الالي التعلم الآلي

Improving Differentially Private Models with Active Learning

134 - Zhengli Zhao , Nicolas Papernot , Sameer Singh 2019

Broad adoption of machine learning techniques has increased privacy concerns for models trained on sensitive data such as medical records. Existing techniques for training differentially private (DP) models give rigorous privacy guarantees, but apply ing these techniques to neural networks can severely degrade model performance. This performance reduction is an obstacle to deploying private models in the real world. In this work, we improve the performance of DP models by fine-tuning them through active learning on public data. We introduce two new techniques - DIVERSEPUBLIC and NEARPRIVATE - for doing this fine-tuning in a privacy-aware way. For the MNIST and SVHN datasets, these techniques improve state-of-the-art accuracy for DP models while retaining privacy guarantees.

التعلم الالي التعلم الآلي

Improved Differentially Private Decentralized Source Separation for fMRI Data

435 - Hafiz Imtiaz , Jafar Mohammadi , Rogers Silva 2019

Blind source separation algorithms such as independent component analysis (ICA) are widely used in the analysis of neuroimaging data. In order to leverage larger sample sizes, different data holders/sites may wish to collaboratively learn feature rep resentations. However, such datasets are often privacy-sensitive, precluding centralized analyses that pool the data at a single site. In this work, we propose a differentially private algorithm for performing ICA in a decentralized data setting. Conventional approaches to decentralized differentially private algorithms may introduce too much noise due to the typically small sample sizes at each site. We propose a novel protocol that uses correlated noise to remedy this problem. We show that our algorithm outperforms existing approaches on synthetic and real neuroimaging datasets and demonstrate that it can sometimes reach the same level of utility as the corresponding non-private algorithm. This indicates that it is possible to have meaningful utility while preserving privacy.

التعلم الالي التعلم الآلي معالجة الإشارات

Distributed Differentially-Private Algorithms for Matrix and Tensor Factorization

155 - Hafiz Imtiaz , Anand D. Sarwate 2018

In many signal processing and machine learning applications, datasets containing private information are held at different locations, requiring the development of distributed privacy-preserving algorithms. Tensor and matrix factorizations are key com ponents of many processing pipelines. In the distributed setting, differentially private algorithms suffer because they introduce noise to guarantee privacy. This paper designs new and improved distributed and differentially private algorithms for two popular matrix and tensor factorization methods: principal component analysis (PCA) and orthogonal tensor decomposition (OTD). The new algorithms employ a correlated noise design scheme to alleviate the effects of noise and can achieve the same noise level as the centralized scenario. Experiments on synthetic and real data illustrate the regimes in which the correlated noise allows performance matching with the centralized setting, outperforming previous methods and demonstrating that meaningful utility is possible while guaranteeing differential privacy.

التعلم الالي التعلم الآلي