No Arabic abstract
In this paper, we propose GraphSE$^2$, an encrypted graph database for online social network services to address massive data breaches. GraphSE$^2$ preserves the functionality of social search, a key enabler for quality social network services, where social search queries are conducted on a large-scale social graph and meanwhile perform set and computational operations on user-generated contents. To enable efficient privacy-preserving social search, GraphSE$^2$ provides an encrypted structural data model to facilitate parallel and encrypted graph data access. It is also designed to decompose complex social search queries into atomic operations and realise them via interchangeable protocols in a fast and scalable manner. We build GraphSE$^2$ with various queries supported in the Facebook graph search engine and implement a full-fledged prototype. Extensive evaluations on Azure Cloud demonstrate that GraphSE$^2$ is practical for querying a social graph with a million of users.
In this paper, we propose the first secure federated $chi^2$-test protocol Fed-$chi^2$. To minimize both the privacy leakage and the communication cost, we recast $chi^2$-test to the second moment estimation problem and thus can take advantage of stable projection to encode the local information in a short vector. As such encodings can be aggregated with only summation, secure aggregation can be naturally applied to hide the individual updates. We formally prove the security guarantee of Fed-$chi^2$ that the joint distribution is hidden in a subspace with exponential possible distributions. Our evaluation results show that Fed-$chi^2$ achieves negligible accuracy drops with small client-side computation overhead. In several real-world case studies, the performance of Fed-$chi^2$ is comparable to the centralized $chi^2$-test.
Traffic inspection is a fundamental building block of many security solutions today. For example, to prevent the leakage or exfiltration of confidential insider information, as well as to block malicious traffic from entering the network, most enterprises today operate intrusion detection and prevention systems that inspect traffic. However, the state-of-the-art inspection systems do not reflect well the interests of the different involved autonomous roles. For example, employees in an enterprise, or a company outsourcing its network management to a specialized third party, may require that their traffic remains confidential, even from the system administrator. Moreover, the rules used by the intrusion detection system, or more generally the configuration of an online or offline anomaly detection engine, may be provided by a third party, e.g., a security research firm, and can hence constitute a critical business asset which should be kept confidential. Today, it is often believed that accounting for these additional requirements is impossible, as they contradict efficiency and effectiveness. We in this paper explore a novel approach, called Privacy Preserving Inspection (PRI), which provides a solution to this problem, by preserving privacy of traffic inspection and confidentiality of inspection rules and configurations, and e.g., also supports the flexible installation of additional Data Leak Prevention (DLP) rules specific to the company.
An increasing number of businesses are replacing their data storage and computation infrastructure with cloud services. Likewise, there is an increased emphasis on performing analytics based on multiple datasets obtained from different data sources. While ensuring security of data and computation outsourced to a third party cloud is in itself challenging, supporting analytics using data distributed across multiple, independent clouds is even further from trivial. In this paper we present CloudMine, a cloud-based service which allows multiple data owners to perform privacy-preserved computation over the joint data using their clouds as delegates. CloudMine protects data privacy with respect to semi-honest data owners and semi-honest clouds. It furthermore ensures the privacy of the computation outputs from the curious clouds. It allows data owners to reliably detect if their cloud delegates have been lazy when carrying out the delegated computation. CloudMine can run as a centralized service on a single cloud, or as a distributed service over multiple, independent clouds. CloudMine supports a set of basic computations that can be used to construct a variety of highly complex, distributed privacy-preserving data analytics. We demonstrate how a simple instance of CloudMine (secure sum service) is used to implement three classical data mining tasks (classification, association rule mining and clustering) in a cloud environment. We experiment with a prototype of the service, the results of which suggest its practicality for supporting privacy-preserving data analytics as a (multi) cloud-based service.
In Near-Neighbor Search (NNS), a new client queries a database (held by a server) for the most similar data (near-neighbors) given a certain similarity metric. The Privacy-Preserving variant (PP-NNS) requires that neither server nor the client shall learn information about the other partys data except what can be inferred from the outcome of NNS. The overwhelming growth in the size of current datasets and the lack of a truly secure server in the online world render the existing solutions impractical; either due to their high computational requirements or non-realistic assumptions which potentially compromise privacy. PP-NNS having query time {it sub-linear} in the size of the database has been suggested as an open research direction by Li et al. (CCSW15). In this paper, we provide the first such algorithm, called Secure Locality Sensitive Indexing (SLSI) which has a sub-linear query time and the ability to handle honest-but-curious parties. At the heart of our proposal lies a secure binary embedding scheme generated from a novel probabilistic transformation over locality sensitive hashing family. We provide information theoretic bound for the privacy guarantees and support our theoretical claims using substantial empirical evidence on real-world datasets.
Cloud service providers offer a low-cost and convenient solution to host unstructured data. However, cloud services act as third-party solutions and do not provide control of the data to users. This has raised security and privacy concerns for many organizations (users) with sensitive data to utilize cloud-based solutions. User-side encryption can potentially address these concerns by establishing user-centric cloud services and granting data control to the user. Nonetheless, user-side encryption limits the ability to process (e.g., search) encrypted data on the cloud. Accordingly, in this research, we provide a framework that enables processing (in particular, searching) of encrypted multi-organizational (i.e., multi-source) big data without revealing the data to cloud provider. Our framework leverages locality feature of edge computing to offer a user-centric search ability in a real-time manner. In particular, the edge system intelligently predicts the users search pattern and prunes the multi-source big data search space to reduce the search time. The pruning system is based on efficient sampling from the clustered big dataset on the cloud. For each cluster, the pruning system dynamically samples appropriate number of terms based on the users search tendency, so that the cluster is optimally represented. We developed a prototype of a user-centric search system and evaluated it against multiple datasets. Experimental results demonstrate 27% improvement in the pruning quality and search accuracy.