Do you want to publish a course? Click here

Privacy-Preserving Batch-based Task Assignment in Spatial Crowdsourcing with Untrusted Server

331   0   0.0 ( 0 )
 Added by Maocheng Li
 Publication date 2021
and research's language is English




Ask ChatGPT about the research

In this paper, we study the privacy-preserving task assignment in spatial crowdsourcing, where the locations of both workers and tasks, prior to their release to the server, are perturbed with Geo-Indistinguishability (a differential privacy notion for location-based systems). Different from the previously studied online setting, where each task is assigned immediately upon arrival, we target the batch-based setting, where the server maximizes the number of successfully assigned tasks after a batch of tasks arrive. To achieve this goal, we propose the k-Switch solution, which first divides the workers into small groups based on the perturbed distance between workers/tasks, and then utilizes Homomorphic Encryption (HE) based secure computation to enhance the task assignment. Furthermore, we expedite HE-based computation by limiting the size of the small groups under k. Extensive experiments demonstrate that, in terms of the number of successfully assigned tasks, the k-Switch solution improves batch-based baselines by 5.9X and the existing online solution by 1.74X, with no privacy leak.



rate research

Read More

Releasing full data records is one of the most challenging problems in data privacy. On the one hand, many of the popular techniques such as data de-identification are problematic because of their dependence on the background knowledge of adversaries. On the other hand, rigorous methods such as the exponential mechanism for differential privacy are often computationally impractical to use for releasing high dimensional data or cannot preserve high utility of original data due to their extensive data perturbation. This paper presents a criterion called plausible deniability that provides a formal privacy guarantee, notably for releasing sensitive datasets: an output record can be released only if a certain amount of input records are indistinguishable, up to a privacy parameter. This notion does not depend on the background knowledge of an adversary. Also, it can efficiently be checked by privacy tests. We present mechanisms to generate synthetic datasets with similar statistical properties to the input data and the same format. We study this technique both theoretically and experimentally. A key theoretical result shows that, with proper randomization, the plausible deniability mechanism generates differentially private synthetic data. We demonstrate the efficiency of this generative technique on a large dataset; it is shown to preserve the utility of original data with respect to various statistical analysis and machine learning measures.
In Near-Neighbor Search (NNS), a new client queries a database (held by a server) for the most similar data (near-neighbors) given a certain similarity metric. The Privacy-Preserving variant (PP-NNS) requires that neither server nor the client shall learn information about the other partys data except what can be inferred from the outcome of NNS. The overwhelming growth in the size of current datasets and the lack of a truly secure server in the online world render the existing solutions impractical; either due to their high computational requirements or non-realistic assumptions which potentially compromise privacy. PP-NNS having query time {it sub-linear} in the size of the database has been suggested as an open research direction by Li et al. (CCSW15). In this paper, we provide the first such algorithm, called Secure Locality Sensitive Indexing (SLSI) which has a sub-linear query time and the ability to handle honest-but-curious parties. At the heart of our proposal lies a secure binary embedding scheme generated from a novel probabilistic transformation over locality sensitive hashing family. We provide information theoretic bound for the privacy guarantees and support our theoretical claims using substantial empirical evidence on real-world datasets.
With the widespread use of LBSs (Location-based Services), synthesizing location traces plays an increasingly important role in analyzing spatial big data while protecting user privacy. In particular, a synthetic trace that preserves a feature specific to a cluster of users (e.g., those who commute by train, those who go shopping) is important for various geo-data analysis tasks and for providing a synthetic location dataset. Although location synthesizers have been widely studied, existing synthesizers do not provide sufficient utility, privacy, or scalability, hence are not practical for large-scale location traces. To overcome this issue, we propose a novel location synthesizer called PPMTF (Privacy-Preserving Multiple Tensor Factorization). We model various statistical features of the original traces by a transition-count tensor and a visit-count tensor. We factorize these two tensors simultaneously via multiple tensor factorization, and train factor matrices via posterior sampling. Then we synthesize traces from reconstructed tensors, and perform a plausible deniability test for a synthetic trace. We comprehensively evaluate PPMTF using two datasets. Our experimental results show that PPMTF preserves various statistical features including cluster-specific features, protects user privacy, and synthesizes large-scale location traces in practical time. PPMTF also significantly outperforms the state-of-the-art methods in terms of utility and scalability at the same level of privacy.
Privacy preservation is a big concern for various sectors. To protect individual user data, one emerging technology is differential privacy. However, it still has limitations for datasets with frequent queries, such as the fast accumulation of privacy cost. To tackle this limitation, this paper explores the integration of a secured decentralised ledger, blockchain. Blockchain will be able to keep track of all noisy responses generated with differential privacy algorithm and allow for certain queries to reuse old responses. In this paper, a demo of a proposed blockchain-based privacy management system is designed as an interactive decentralised web application (DApp). The demo created illustrates that leveraging on blockchain will allow the total privacy cost accumulated to decrease significantly.
Tree-based models are among the most efficient machine learning techniques for data mining nowadays due to their accuracy, interpretability, and simplicity. The recent orthogonal needs for more data and privacy protection call for collaborative privacy-preserving solutions. In this work, we survey the literature on distributed and privacy-preserving training of tree-based models and we systematize its knowledge based on four axes: the learning algorithm, the collaborative model, the protection mechanism, and the threat model. We use this to identify the strengths and limitations of these works and provide for the first time a framework analyzing the information leakage occurring in distributed tree-based model learning.
comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا