Are My EHRs Private Enough? -Event-level Privacy Protection


Abstract in English

Privacy is a major concern in sharing human subject data to researchers for secondary analyses. A simple binary consent (opt-in or not) may significantly reduce the amount of sharable data, since many patients might only be concerned about a few sensitive medical conditions rather than the entire medical records. We propose event-level privacy protection, and develop a feature ablation method to protect event-level privacy in electronic medical records. Using a list of 13 sensitive diagnoses, we evaluate the feasibility and the efficacy of the proposed method. As feature ablation progresses, the identifiability of a sensitive medical condition decreases with varying speeds on different diseases. We find that these sensitive diagnoses can be divided into 3 categories: (1) 5 diseases have fast declining identifiability (AUC below 0.6 with less than 400 features excluded); (2) 7 diseases with progressively declining identifiability (AUC below 0.7 with between 200 and 700 features excluded); and (3) 1 disease with slowly declining identifiability (AUC above 0.7 with 1000 features excluded). The fact that the majority (12 out of 13) of the sensitive diseases fall into the first two categories suggests the potential of the proposed feature ablation method as a solution for event-level record privacy protection.

Download