A Novel Microdata Privacy Disclosure Risk Measure


Abstract in English

A tremendous amount of individual-level data is generated each day, of use to marketing, decision makers, and machine learning applications. This data often contain private and sensitive information about individuals, which can be disclosed by adversaries. An adversary can recognize the underlying individuals identity for a data record by looking at the values of quasi-identifier attributes, known as identity disclosure, or can uncover sensitive information about an individual through attribute disclosure. In Statistical Disclosure Control, multiple disclosure risk measures have been proposed. These share two drawbacks: they do not consider identity and attribute disclosure concurrently in the risk measure, and they make restrictive assumptions on an adversarys knowledge by assuming certain attributes are quasi-identifiers and there is a clear boundary between quasi-identifiers and sensitive information. In this paper, we present a novel disclosure risk measure that addresses these limitations, by presenting a single combined metric of identity and attribute disclosure risk, and providing flexibility in modeling adversarys knowledge. We have developed an efficient algorithm for computing the proposed risk measure and evaluated the feasibility and performance of our approach on a real-world data set from the domain of social work.

Download