Do you want to publish a course? Click here

Look Ahead ORAM: Obfuscating Addresses in Recommendation Model Training

114   0   0.0 ( 0 )
 Added by Yongqin Wang
 Publication date 2021
and research's language is English




Ask ChatGPT about the research

In the cloud computing era, data privacy is a critical concern. Memory accesses patterns can leak private information. This data leak is particularly challenging for deep learning recommendation models, where data associated with a user is used to train a model. Recommendation models use embedding tables to map categorical data (embedding table indices) to large vector space, which is easier for recommendation systems to learn. Oblivious RAM (ORAM) and its enhancements are proposed solutions to prevent memory access patterns from leaking information. ORAM solutions hide access patterns by fetching multiple data blocks per each demand fetch and then shuffling the location of blocks after each access. In this paper, we propose a new PathORAM architecture designed to protect user input privacy when training recommendation models. Look Ahead ORAM exploits the fact that during training, embedding table indices that are going to be accessed in a future batch are known beforehand. Look Ahead ORAM preprocesses future training samples to identify indices that will co-occur and groups these accesses into a large superblock. Look Ahead ORAM performs the same-path assignment by grouping multiple data blocks into superblocks. Accessing a superblock will require fewer fetched data blocks than accessing all data blocks without grouping them as superblocks. Effectively, Look Ahead ORAM reduces the number of reads/writes per access. Look Ahead ORAM also introduces a fat-tree structure for PathORAM, i.e. a tree with variable bucket size. Look Ahead ORAM achieves 2x speedup compared to PathORAM and reduces the bandwidth requirement by 3.15x while providing the same security as PathORAM.



rate research

Read More

Write-Only Oblivious RAM (WoORAM) protocols provide privacy by encrypting the contents of data and also hiding the pattern of write operations over that data. WoORAMs provide better privacy than plain encryption and better performance than more general ORAM schemes (which hide both writing and reading access patterns), and the write-oblivious setting has been applied to important applications of cloud storage synchronization and encrypted hidden volumes. In this paper, we introduce an entirely new technique for Write-Only ORAM, called DetWoORAM. Unlike previous solutions, DetWoORAM uses a deterministic, sequential writing pattern without the need for any stashing of blocks in local state when writes fail. Our protocol, while conceptually simple, provides substantial improvement over prior solutions, both asymptotically and experimentally. In particular, under typical settings the DetWoORAM writes only 2 blocks (sequentially) to backend memory for each block written to the device, which is optimal. We have implemented our solution using the BUSE (block device in user-space) module and tested DetWoORAM against both an encryption only baseline of dm-crypt and prior, randomized WoORAM solutions, measuring only a 3x-14x slowdown compared to an encryption-only baseline and around 6x-19x speedup compared to prior work.
Performance of investment managers are evaluated in comparison with benchmarks, such as financial indices. Due to the operational constraint that most professional databases do not track the change of constitution of benchmark portfolios, standard tests of performance suffer from the look-ahead benchmark bias, when they use the assets constituting the benchmarks of reference at the end of the testing period, rather than at the beginning of the period. Here, we report that the look-ahead benchmark bias can exhibit a surprisingly large amplitude for portfolios of common stocks (up to 8% annum for the S&P500 taken as the benchmark) -- while most studies have emphasized related survival biases in performance of mutual and hedge funds for which the biases can be expected to be even larger. We use the CRSP database from 1926 to 2006 and analyze the running top 500 US capitalizations to demonstrate that this bias can account for a gross overestimation of performance metrics such as the Sharpe ratio as well as an underestimation of risk, as measured for instance by peak-to-valley drawdowns. We demonstrate the presence of a significant bias in the estimation of the survival and look-ahead biases studied in the literature. A general methodology to test the properties of investment strategies is advanced in terms of random strategies with similar investment constraints.
There are numerous opportunities for adversaries to observe user behavior remotely on the web. Additionally, keystroke biometric algorithms have advanced to the point where user identification and soft biometric trait recognition rates are commercially viable. This presents a privacy concern because masking spatial information, such as IP address, is not sufficient as users become more identifiable by their behavior. In this work, the well-known Chaum mix is generalized to a scenario in which users are separated by both space and time with the goal of preventing an observing adversary from identifying or impersonating the user. The criteria of a behavior obfuscation strategy are defined and two strategies are introduced for obfuscating typing behavior. Experimental results are obtained using publicly available keystroke data for three different types of input, including short fixed-text, long fixed-text, and long free-text. Identification accuracy is reduced by 20% with a 25 ms random keystroke delay not noticeable to the user.
80 - Ang Li , Jiayi Guo , Huanrui Yang 2019
Deep learning has been widely applied in many computer vision applications, with remarkable success. However, running deep learning models on mobile devices is generally challenging due to the limitation of computing resources. A popular alternative is to use cloud services to run deep learning models to process raw data. This, however, imposes privacy risks. Some prior arts proposed sending the features extracted from raw data to the cloud. Unfortunately, these extracted features can still be exploited by attackers to recover raw images and to infer embedded private attributes. In this paper, we propose an adversarial training framework, DeepObfuscator, which prevents the usage of the features for reconstruction of the raw images and inference of private attributes. This is done while retaining useful information for the intended cloud service. DeepObfuscator includes a learnable obfuscator that is designed to hide privacy-related sensitive information from the features by performing our proposed adversarial training algorithm. The proposed algorithm is designed by simulating the game between an attacker who makes efforts to reconstruct raw image and infer private attributes from the extracted features and a defender who aims to protect user privacy. By deploying the trained obfuscator on the smartphone, features can be locally extracted and then sent to the cloud. Our experiments on CelebA and LFW datasets show that the quality of the reconstructed images from the obfuscated features of the raw image is dramatically decreased from 0.9458 to 0.3175 in terms of multi-scale structural similarity. The person in the reconstructed image, hence, becomes hardly to be re-identified. The classification accuracy of the inferred private attributes that can be achieved by the attacker is significantly reduced to a random-guessing level.
We investigate a parallelization strategy for dense matrix factorization (DMF) algorithms, using OpenMP, that departs from the legacy (or conventional) solution, which simply extracts concurrency from a multithreaded version of BLAS. This approach is also different from the more sophisticated runtime-assisted implementations, which decompose the operation into tasks and identify dependencies via directives and runtime support. Instead, our strategy attains high performance by explicitly embedding a static look-ahead technique into the DMF code, in order to overcome the performance bottleneck of the panel factorization, and realizing the trailing update via a cache-aware multi-threaded implementation of the BLAS. Although the parallel algorithms are specified with a highlevel of abstraction, the actual implementation can be easily derived from them, paving the road to deriving a high performance implementation of a considerable fraction of LAPACK functionality on any multicore platform with an OpenMP-like runtime.
comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا