Class Clown: Data Redaction in Machine Unlearning at Enterprise Scale


Abstract

Individuals are gaining more control over their personal data through recent data privacy laws such as the General Data Protection Regulation and the California Consumer Privacy Act. One aspect of these laws is the ability to request that a business delete one's private information, the so-called right to be forgotten or right to erasure. These laws have serious financial implications for companies and organizations that train large, highly accurate deep neural networks (DNNs) on these valuable consumer data sets. A redaction request, once received, poses complex technical challenges: how does an organization comply with the law while fulfilling core business operations? We introduce a DNN model lifecycle maintenance process that establishes how to handle specific data redaction requests while minimizing the need to completely retrain the model. Our process applies the membership inference attack as a compliance tool to every point in the training set. These attack models quantify the privacy risk of all training data points and form the basis of follow-on data redaction from an accurate deployed model; excision is implemented through incorrect label assignment within incremental model updates.
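
As a rough illustration of the excision step described above, the sketch below fine-tunes a deployed model on the points to be forgotten after replacing their true labels with randomly chosen incorrect ones. It is a minimal sketch under assumed conventions: the function name redact_by_mislabeling, the hyperparameters, and the PyTorch framing are illustrative choices, not the paper's actual implementation.

# Minimal sketch of redaction via incorrect label assignment during an
# incremental model update. Names and hyperparameters are assumptions for
# illustration, not the paper's API.
import torch
import torch.nn.functional as F

def redact_by_mislabeling(model, redact_x, redact_y, num_classes,
                          lr=1e-4, steps=10):
    """Fine-tune `model` on the points to be forgotten, using randomly
    chosen incorrect labels in place of their true labels."""
    # Draw a wrong label for each redacted point (guaranteed != true class).
    offsets = torch.randint(1, num_classes, redact_y.shape)
    wrong_y = (redact_y + offsets) % num_classes

    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    for _ in range(steps):
        optimizer.zero_grad()
        # Standard cross-entropy, but against the deliberately wrong labels,
        # pushing the model away from its memorized predictions on these points.
        loss = F.cross_entropy(model(redact_x), wrong_y)
        loss.backward()
        optimizer.step()
    return model

In practice such an update would be interleaved with ordinary incremental training on retained data so that overall accuracy is preserved, and the per-point membership inference attack scores would be re-evaluated afterward to confirm that the redacted points no longer appear to be members of the training set.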
