ﻻ يوجد ملخص باللغة العربية
In recent years, the amount of information collected about human beings has increased dramatically. This development has been partially driven by individuals posting and storing data about themselves and friends using online social networks or collecting their data for self-tracking purposes (quantified-self movement). Across the sciences, researchers conduct studies collecting data with an unprecedented resolution and scale. Using computational power combined with mathematical models, such rich datasets can be mined to infer underlying patterns, thereby providing insights into human nature. Much of the data collected is sensitive. It is private in the sense that most individuals would feel uncomfortable sharing their collected personal data publicly. For this reason, the need for solutions to ensure the privacy of the individuals generating data has grown alongside the data collection efforts. Out of all the massive data collection efforts, this paper focuses on efforts directly instrumenting human behavior, and notes that -- in many cases -- the privacy of participants is not sufficiently addressed. For example, study purposes are often not explicit, informed consent is ill-defined, and security and sharing protocols are only partially disclosed. This paper provides a survey of the work related to addressing privacy issues in research studies that collect detailed sensor data on human behavior. Reflections on the key problems and recommendations for future work are included. We hope the overview of the privacy-related practices in massive data collection studies can be used as a frame of reference for practitioners in the field. Although focused on data collection in an academic context, we believe that many of the challenges and solutions we identify are also relevant and useful for other domains where massive data collection takes place, including businesses and governments.
The increasing generation and collection of personal data has created a complex ecosystem, often collaborative but sometimes combative, around companies and individuals engaging in the use of these data. We propose that the interactions between these
The emerging public awareness and government regulations of data privacy motivate new paradigms of collecting and analyzing data transparent and acceptable to data owners. We present a new concept of privacy and corresponding data formats, mechanisms
Despite decades of research on approximate query processing (AQP), our understanding of sample-based joins has remained limited and, to some extent, even superficial. The common belief in the community is that joining random samples is futile. This b
Large-scale collection of human behavioral data by companies raises serious privacy concerns. We show that behavior captured in the form of application usage data collected from smartphones is highly unique even in very large datasets encompassing mi
This article presents a set of tools for the modeling of a spatial allocation problem in a large geographic market and gives examples of applications. In our settings, the market is described by a network that maps the cost of travel between each pai