Privacy in Sensor-Driven Human Data Collection: A Guide for Practitioners


Abstract in English

In recent years, the amount of information collected about human beings has increased dramatically. This development has been partially driven by individuals posting and storing data about themselves and friends using online social networks or collecting their data for self-tracking purposes (quantified-self movement). Across the sciences, researchers conduct studies collecting data with an unprecedented resolution and scale. Using computational power combined with mathematical models, such rich datasets can be mined to infer underlying patterns, thereby providing insights into human nature. Much of the data collected is sensitive. It is private in the sense that most individuals would feel uncomfortable sharing their collected personal data publicly. For this reason, the need for solutions to ensure the privacy of the individuals generating data has grown alongside the data collection efforts. Out of all the massive data collection efforts, this paper focuses on efforts directly instrumenting human behavior, and notes that -- in many cases -- the privacy of participants is not sufficiently addressed. For example, study purposes are often not explicit, informed consent is ill-defined, and security and sharing protocols are only partially disclosed. This paper provides a survey of the work related to addressing privacy issues in research studies that collect detailed sensor data on human behavior. Reflections on the key problems and recommendations for future work are included. We hope the overview of the privacy-related practices in massive data collection studies can be used as a frame of reference for practitioners in the field. Although focused on data collection in an academic context, we believe that many of the challenges and solutions we identify are also relevant and useful for other domains where massive data collection takes place, including businesses and governments.

Download