New community

Subscribe to the gold package and get unlimited access to Shamra Academy

MAGE: Nearly Zero-Cost Virtual Memory for Secure Computation

368 0 0.0 ( 0 )

Download Cite

Added by Sam Kumar

Publication date 2021

fields Informatics Engineering

and research's language is English

Authors Sam Kumar - David E. Culler - Raluca Ada Popa

Operating Systems Cryptography and Security

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Secure Computation (SC) is a family of cryptographic primitives for computing on encrypted data in single-party and multi-party settings. SC is being increasingly adopted by industry for a variety of applications. A significant obstacle to using SC for practical applications is the memory overhead of the underlying cryptography. We develop MAGE, an execution engine for SC that efficiently runs SC computations that do not fit in memory. We observe that, due to their intended security guarantees, SC schemes are inherently oblivious -- their memory access patterns are independent of the input data. Using this property, MAGE calculates the memory access pattern ahead of time and uses it to produce a memory management plan. This formulation of memory management, which we call memory programming, is a generalization of paging that allows MAGE to provide a highly efficient virtual memory abstraction for SC. MAGE outperforms the OS virtual memory system by up to an order of magnitude, and in many cases, runs SC computations that do not fit in memory at nearly the same speed as if the underlying machines had unbounded physical memory to fit the entire computation.

rate research

Hardware-assisted Trusted Memory Disaggregation for Secure Far Memory

219 - Taekyung Heo , Seunghyo Kang , Sanghyeon Lee 2021

Memory disaggregation provides efficient memory utilization across network-connected systems. It allows a node to use part of memory in remote nodes in the same cluster. Recent studies have improved RDMA-based memory disaggregation systems, supporting lower latency and higher bandwidth than the prior generation of disaggregated memory. However, the current disaggregated memory systems manage remote memory only at coarse granularity due to the limitation of the access validation mechanism of RDMA. In such systems, to support fine-grained remote page allocation, the trustworthiness of all participating systems needs to be assumed, and thus a security breach in a node can propagate to the entire cluster. From the security perspective, the memory-providing node must protect its memory from memory-requesting nodes. On the other hand, the memory-requesting node requires the confidentiality and integrity protection of its memory contents even if they are stored in remote nodes. To address the weak isolation support in the current system, this study proposes a novel hardware-assisted memory disaggregation system. Based on the security features of FPGA, the logic in each per-node FPGA board provides a secure memory disaggregation engine. With its own networks, a set of FPGA-based engines form a trusted memory disaggregation system, which is isolated from the privileged software of each participating node. The secure memory disaggregation system allows fine-grained memory management in memory-providing nodes, while the access validation is guaranteed with the hardware-hardened mechanism. In addition, the proposed system hides the memory access patterns observable from remote nodes, supporting obliviousness. Our evaluation with FPGA implementation shows that such fine-grained secure disaggregated memory is feasible with comparable performance to the latest software-based techniques.

Distributed Parallel and Cluster Computing Cryptography and Security

SmartExchange: Trading Higher-cost Memory Storage/Access for Lower-cost Computation

126 - Yang Zhao , Xiaohan Chen , Yue Wang 2020

We present SmartExchange, an algorithm-hardware co-design framework to trade higher-cost memory storage/access for lower-cost computation, for energy-efficient inference of deep neural networks (DNNs). We develop a novel algorithm to enforce a specially favorable DNN weight structure, where each layerwise weight matrix can be stored as the product of a small basis matrix and a large sparse coefficient matrix whose non-zero elements are all power-of-2. To our best knowledge, this algorithm is the first formulation that integrates three mainstream model compression ideas: sparsification or pruning, decomposition, and quantization, into one unified framework. The resulting sparse and readily-quantized DNN thus enjoys greatly reduced energy consumption in data movement as well as weight storage. On top of that, we further design a dedicated accelerator to fully utilize the SmartExchange-enforced weights to improve both energy efficiency and latency performance. Extensive experiments show that 1) on the algorithm level, SmartExchange outperforms state-of-the-art compression techniques, including merely sparsification or pruning, decomposition, and quantization, in various ablation studies based on nine DNN models and four datasets; and 2) on the hardware level, the proposed SmartExchange based accelerator can improve the energy efficiency by up to 6.7$times$ and the speedup by up to 19.2$times$ over four state-of-the-art DNN accelerators, when benchmarked on seven DNN models (including four standard DNNs, two compact DNN models, and one segmentation model) and three datasets.

Machine Learning Machine Learning

The Cost of Software-Based Memory Management Without Virtual Memory

99 - Drew Zagieboylo , G. Edward Suh , Andrew C. Myers 2020

Virtual memory has been a standard hardware feature for more than three decades. At the price of increased hardware complexity, it has simplified software and promised strong isolation among colocated processes. In modern computing systems, however, the costs of virtual memory have increased significantly. With large memory workloads, virtualized environments, data center computing, and chips with multiple DMA devices, virtual memory can degrade performance and increase power usage. We therefore explore the implications of building applications and operating systems without relying on hardware support for address translation. Primarily, we investigate the implications of removing the abstraction of large contiguous memory segments. Our experiments show that the overhead to remove this reliance is surprisingly small for real programs. We expect this small overhead to be worth the benefit of reducing the complexity and energy usage of address translation. In fact, in some cases, performance can even improve when address translation is avoided.

Hardware Architecture Programming Languages

Coded Secure Multi-Party Computation for Massive Matrices with Adversarial Nodes

72 - Seyed Reza Hoseini Najarkolaei , Mohammad Ali Maddah-Ali andn Mohammad Reza Aref 2020

In this work, we consider the problem of secure multi-party computation (MPC), consisting of $Gamma$ sources, each has access to a large private matrix, $N$ processing nodes or workers, and one data collector or master. The master is interested in the result of a polynomial function of the input matrices. Each source sends a randomized functions of its matrix, called as its share, to each worker. The workers process their shares in interaction with each other, and send some results to the master such that it can derive the final result. There are several constraints: (1) each worker can store a function of each input matrix, with the size of $frac{1}{m}$ fraction of that input matrix, (2) up to $t$ of the workers, for some integer $t$, are adversary and may collude to gain information about the private inputs or can do malicious actions to make the final result incorrect. The objective is to design an MPC scheme with the minimum number the workers, called the recovery threshold, such that the final result is correct, workers learn no information about the input matrices, and the master learns nothing beyond the final result. In this paper, we propose an MPC scheme that achieves the recovery threshold of $3t+2m-1$ workers, which is order-wise less than the recovery threshold of the conventional methods. The challenge in dealing with this set up is that when nodes interact with each other, the malicious messages that adversarial nodes generate propagate through the system, and can mislead the honest nodes. To deal with this challenge, we design some subroutines that can detect erroneous messages, and correct or drop them.

Information Theory Cryptography and Security Information Theory

Secure Multi-Party Quantum Conference and Xor Computation

87 - Nayana Das , Goutam Paul 2021

Quantum conference is a process of securely exchanging messages between three or more parties, using quantum resources. A Measurement Device Independent Quantum Dialogue (MDI-QD) protocol, which is secure against information leakage, has been proposed (Quantum Information Processing 16.12 (2017): 305) in 2017, is proven to be insecure against intercept-and-resend attack strategy. We first modify this protocol and generalize this MDI-QD to a three-party quantum conference and then to a multi-party quantum conference. We also propose a protocol for quantum multi-party XOR computation. None of these three protocols proposed here use entanglement as a resource and we prove the correctness and security of our proposed protocols.

Quantum Physics Cryptography and Security

comments

Fetching comments

National Institute of Business Administration

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

MAGE: Nearly Zero-Cost Virtual Memory for Secure Computation

Ask ChatGPT about the research

No Arabic abstract

Read More