Learning to Noise: Application-Agnostic Data Sharing with Local Differential Privacy


Abstract in English

The collection and sharing of individuals' data has become commonplace in many industries. Local differential privacy (LDP) is a rigorous approach to preserving data privacy even from the database administrator, unlike the more standard central differential privacy. To achieve LDP, one traditionally adds noise directly to each data dimension, but for high-dimensional data the level of noise required for sufficient anonymization all but destroys the data's utility. In this paper, we introduce a novel LDP mechanism that leverages representation learning to overcome the prohibitive noise requirements of direct methods. We demonstrate that, rather than merely estimating aggregate statistics of the privatized data, as is the norm in LDP applications, our method enables the training of performant machine learning models. Unique applications of our approach include private novel-class classification and the augmentation of clean datasets with additional privatized features; methods that rely on central differential privacy are not applicable to such tasks. Our approach achieves significant performance gains on these tasks relative to state-of-the-art LDP benchmarks that noise data directly.
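For context, the direct-noising baseline the abstract contrasts against can be sketched as below. This is an illustrative ε-LDP Laplace mechanism that splits the privacy budget evenly across dimensions, not the method proposed in the paper; the function name and parameter choices are assumptions for illustration only.

```python
import numpy as np

def ldp_laplace_perturb(x, epsilon, lower, upper):
    """Illustrative direct per-dimension LDP noising (baseline sketch).

    Each of the d dimensions is clipped to [lower, upper] and perturbed with
    Laplace noise calibrated to an epsilon/d share of the budget, so the full
    report satisfies epsilon-LDP by sequential composition.
    """
    x = np.clip(np.asarray(x, dtype=float), lower, upper)
    d = x.size
    sensitivity = upper - lower        # per-dimension L1 sensitivity
    eps_per_dim = epsilon / d          # budget split across dimensions
    scale = sensitivity / eps_per_dim  # Laplace scale grows linearly in d
    return x + np.random.laplace(loc=0.0, scale=scale, size=d)

# Example: with d = 100 features in [0, 1] and epsilon = 1.0, the noise scale
# is 100, swamping the signal -- the "prohibitive noise" problem for
# high-dimensional data that the abstract refers to.
noisy = ldp_laplace_perturb(np.random.rand(100), epsilon=1.0, lower=0.0, upper=1.0)
```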
