No Arabic abstract
Users can improve the security of remote communications by using Trusted Execution Environments (TEEs) to protect against direct introspection and tampering of sensitive data. This can even be done with applications coded in high-level languages with complex programming stacks such as R, Python, and Ruby. However, this creates a trade-off between programming convenience versus the risk of attacks using microarchitectural side channels. In this paper, we argue that it is possible to address this problem for important applications by instrumenting a complex programming environment (like R) to produce a Data-Oblivious Transcript (DOT) that is explicitly designed to support computation that excludes side channels. Such a transcript is then evaluated on a Trusted Execution Environment (TEE) containing the sensitive data using a small trusted computing base called the Data-Oblivious Virtual Environment (DOVE). To motivate the problem, we demonstrate a number of subtle side-channel vulnerabilities in the R language. We then provide an illustrative design and implementation of DOVE for R, creating the first side-channel resistant R programming stack. We demonstrate that the two-phase architecture provided by DOT generation and DOVE evaluation can provide practical support for complex programming languages with usable performance and high security assurances against side channels.
We present a new oblivious RAM that supports variable-sized storage blocks (vORAM), which is the first ORAM to allow varying block sizes without trivial padding. We also present a new history-independent data structure (a HIRB tree) that can be stored within a vORAM. Together, this construction provides an efficient and practical oblivious data structure (ODS) for a key/value map, and goes further to provide an additional privacy guarantee as compared to prior ODS maps: even upon client compromise, deleted data and the history of old operations remain hidden to the attacker. We implement and measure the performance of our system using Amazon Web Services, and the single-operation time for a realistic database (up to $2^{18}$ entries) is less than 1 second. This represents a 100x speed-up compared to the current best oblivious map data structure (which provides neither secure deletion nor history independence) by Wang et al. (CCS 14).
We present a framework for fully-simulatable $h$-out-of-$n$ oblivious transfer ($OT^{n}_{h}$) with security against non-adaptive malicious adversaries. The framework costs six communication rounds and costs at most $40n$ public-key operations in computational overhead. Compared with the known protocols for fully-simulatable oblivious transfer that works in the plain mode (where there is no trusted common reference string available) and proven to be secure under standard model (where there is no random oracle available), the instantiation based on the decisional Diffie-Hellman assumption of the framework is the most efficient one, no matter seen from communication rounds or computational overhead. Our framework uses three abstract tools, i.e., perfectly binding commitment, perfectly hiding commitment and our new smooth projective hash. This allows a simple and intuitive understanding of its security. We instantiate the new smooth projective hash under the lattice assumption, the decisional Diffie-Hellman assumption, the decisional $N$-th residuosity assumption, the decisional quadratic residuosity assumption. This indeed shows that the folklore that it is technically difficult to instantiate the projective hash framework under the lattice assumption is not true. Whats more, by using this lattice-based hash and lattice-based commitment scheme, we gain a concrete protocol for $OT^{n}_{h}$ which is secure against quantum algorithms.
Data privacy is unarguably of extreme importance. Nonetheless, there exist various daunting challenges to safe-guarding data privacy. These challenges stem from the fact that data owners have little control over their data once it has transgressed their local storage and been managed by third parties whose trustworthiness is questionable at times. Our work seeks to enhance data privacy by constructing a self-expiring data capsule. Sensitive data is encapsulated into a capsule which is associated with an access policy an expiring condition. The former indicates eligibility of functions that can access the data, and the latter dictates when the data should become inaccessible to anyone, including the previously eligible functions. Access to the data capsule, as well as its dismantling once the expiring condition is met, are governed by a committee of independent and mutually distrusting nodes. The pivotal contribution of our work is an integration of hardware primitive, state machine replication and threshold secret sharing in the design of the self-expiring data encapsulation framework. We implement the proposed framework in a system called TEEKAP. Our empirical experiments conducted on a realistic deployment setting with the access control committee spanning across four geographical regions reveal that TEEKAP can process access requests at scale with sub-second latency.
Container technique is gaining increasing attention in recent years and has become an alternative to traditional virtual machines. Some of the primary motivations for the enterprise to adopt the container technology include its convenience to encapsulate and deploy applications, lightweight operations, as well as efficiency and flexibility in resources sharing. However, there still lacks an in-depth and systematic comparison study on how big data applications, such as Spark jobs, perform between a container environment and a virtual machine environment. In this paper, by running various Spark applications with different configurations, we evaluate the two environments from many interesting aspects, such as how convenient the execution environment can be set up, what are makespans of different workloads running in each setup, how efficient the hardware resources, such as CPU and memory, are utilized, and how well each environment can scale. The results show that compared with virtual machines, containers provide a more easy-to-deploy and scalable environment for big data workloads. The research work in this paper can help practitioners and researchers to make more informed decisions on tuning their cloud environment and configuring the big data applications, so as to achieve better performance and higher resources utilization.
Oblivious RAM (ORAM) protocols are powerful techniques that hide a clients data as well as access patterns from untrusted service providers. We present an oblivious cloud storage system, ObliviSync, that specifically targets one of the most widely-used personal cloud storage paradigms: synchronization and backup services, popular examples of which are Dropbox, iCloud Drive, and Google Drive. This setting provides a unique opportunity because the above privacy properties can be achieved with a simpler form of ORAM called write-only ORAM, which allows for dramatically increased efficiency compared to related work. Our solution is asymptotically optimal and practically efficient, with a small constant overhead of approximately 4x compared with non-private file storage, depending only on the total data size and parameters chosen according to the usage rate, and not on the number or size of individual files. Our construction also offers protection against timing-channel attacks, which has not been previously considered in ORAM protocols. We built and evaluated a full implementation of ObliviSync that supports multiple simultaneous read-only clients and a single concurrent read/write client whose edits automatically and seamlessly propagate to the readers. We show that our system functions under high work loads, with realistic file size distributions, and with small additional latency (as compared to a baseline encrypted file system) when paired with Dropbox as the synchronization service.