ﻻ يوجد ملخص باللغة العربية
In this paper, we study the privacy of online health data. We present a novel online health data De-Anonymization (DA) framework, named De-Health. De-Health consists of two phases: Top-K DA, which identifies a candidate set for each anonymized user, and refined DA, which de-anonymizes an anonymized user to a user in its candidate set. By employing both candidate selection and DA verification schemes, De-Health significantly reduces the DA space by several orders of magnitude while achieving promising DA accuracy. Leveraging two real world online health datasets WebMD (89,393 users, 506K posts) and HealthBoards (388,398 users, 4.7M posts), we validate the efficacy of De-Health. Further, when the training data are insufficient, De-Health can still successfully de-anonymize a large portion of anonymized users. We develop the first analytical framework on the soundness and effectiveness of online health data DA. By analyzing the impact of various data features on the anonymity, we derive the conditions and probabilities for successfully de-anonymizing one user or a group of users in exact DA and Top-K DA. Our analysis is meaningful to both researchers and policy makers in facilitating the development of more effective anonymization techniques and proper privacy polices. We present a linkage attack framework which can link online health/medical information to real world people. Through a proof-of-concept attack, we link 347 out of 2805 WebMD users to real world people, and find the full names, medical/health information, birthdates, phone numbers, and other sensitive information for most of the re-identified users. This clearly illustrates the fragility of the notion of privacy of those who use online health forums.
Underground online forums are platforms that enable trades of illicit services and stolen goods. Carding forums, in particular, are known for being focused on trading financial information. However, little evidence exists about the sellers that are p
Secure and privacy-preserving management of Personal Health Records (PHRs) has proved to be a major challenge in modern healthcare. Current solutions generally do not offer patients a choice in where the data is actually stored and also rely on at le
According to the Center for Disease Control and Prevention, in the United States hundreds of thousands initiate smoking each year, and millions live with smoking-related dis- eases. Many tobacco users discuss their habits and preferences on social me
Motivated by the need for efficient and personalized learning in mobile health, we investigate the problem of online kernel selection for Gaussian Process regression in the multi-task setting. We propose a novel generative process on the kernel compo
Coronavirus disease 2019, or COVID-19 in short, is a zoonosis, i.e., a disease that spreads from animals to humans. Due to its highly epizootic nature, it has compelled the public health experts to deploy smartphone applications to trace its rapid tr