ترغب بنشر مسار تعليمي؟ اضغط هنا

Customizing Trusted AI Accelerators for Efficient Privacy-Preserving Machine Learning

146   0   0.0 ( 0 )
 نشر من قبل Peichen Xie
 تاريخ النشر 2020
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

The use of trusted hardware has become a promising solution to enable privacy-preserving machine learning. In particular, users can upload their private data and models to a hardware-enforced trusted execution environment (e.g. an enclave in Intel SGX-enabled CPUs) and run machine learning tasks in it with confidentiality and integrity guaranteed. To improve performance, AI accelerators have been widely employed for modern machine learning tasks. However, how to protect privacy on an AI accelerator remains an open question. To address this question, we propose a solution for efficient privacy-preserving machine learning based on an unmodified trusted CPU and a customized trusted AI accelerator. We carefully leverage cryptographic primitives to establish trust and protect the channel between the CPU and the accelerator. As a case study, we demonstrate our solution based on the open-source versatile tensor accelerator. The result of evaluation shows that the proposed solution provides efficient privacy-preserving machine learning at a small design cost and moderate performance overhead.



قيم البحث

اقرأ أيضاً

This work presents Origami, which provides privacy-preserving inference for large deep neural network (DNN) models through a combination of enclave execution, cryptographic blinding, interspersed with accelerator-based computation. Origami partitions the ML model into multiple partitions. The first partition receives the encrypted user input within an SGX enclave. The enclave decrypts the input and then applies cryptographic blinding to the input data and the model parameters. Cryptographic blinding is a technique that adds noise to obfuscate data. Origami sends the obfuscated data for computation to an untrusted GPU/CPU. The blinding and de-blinding factors are kept private by the SGX enclave, thereby preventing any adversary from denoising the data, when the computation is offloaded to a GPU/CPU. The computed output is returned to the enclave, which decodes the computation on noisy data using the unblinding factors privately stored within SGX. This process may be repeated for each DNN layer, as has been done in prior work Slalom. However, the overhead of blinding and unblinding the data is a limiting factor to scalability. Origami relies on the empirical observation that the feature maps after the first several layers can not be used, even by a powerful conditional GAN adversary to reconstruct input. Hence, Origami dynamically switches to executing the rest of the DNN layers directly on an accelerator without needing any further cryptographic blinding intervention to preserve privacy. We empirically demonstrate that using Origami, a conditional GAN adversary, even with an unlimited inference budget, cannot reconstruct the input. We implement and demonstrate the performance gains of Origami using the VGG-16 and VGG-19 models. Compared to running the entire VGG-19 model within SGX, Origami inference improves the performance of private inference from 11x while using Slalom to 15.1x.
With the rising use of Machine Learning (ML) and Deep Learning (DL) in various industries, the medical industry is also not far behind. A very simple yet extremely important use case of ML in this industry is for image classification. This is importa nt for doctors to help them detect certain diseases timely, thereby acting as an aid to reduce chances of human judgement error. However, when using automated systems like these, there is a privacy concern as well. Attackers should not be able to get access to the medical records and images of the patients. It is also required that the model be secure, and that the data that is sent to the model and the predictions that are received both should not be revealed to the model in clear text. In this study, we aim to solve these problems in the context of a medical image classification problem of detection of pneumonia by examining chest x-ray images.
We consider a collaborative learning scenario in which multiple data-owners wish to jointly train a logistic regression model, while keeping their individual datasets private from the other parties. We propose COPML, a fully-decentralized training fr amework that achieves scalability and privacy-protection simultaneously. The key idea of COPML is to securely encode the individual datasets to distribute the computation load effectively across many parties and to perform the training computations as well as the model updates in a distributed manner on the securely encoded data. We provide the privacy analysis of COPML and prove its convergence. Furthermore, we experimentally demonstrate that COPML can achieve significant speedup in training over the benchmark protocols. Our protocol provides strong statistical privacy guarantees against colluding parties (adversaries) with unbounded computational power, while achieving up to $16times$ speedup in the training time against the benchmark protocols.
As the analytic tools become more powerful, and more data are generated on a daily basis, the issue of data privacy arises. This leads to the study of the design of privacy-preserving machine learning algorithms. Given two objectives, namely, utility maximization and privacy-loss minimization, this work is based on two previously non-intersecting regimes -- Compressive Privacy and multi-kernel method. Compressive Privacy is a privacy framework that employs utility-preserving lossy-encoding scheme to protect the privacy of the data, while multi-kernel method is a kernel based machine learning regime that explores the idea of using multiple kernels for building better predictors. The compressive multi-kernel method proposed consists of two stages -- the compression stage and the multi-kernel stage. The compression stage follows the Compressive Privacy paradigm to provide the desired privacy protection. Each kernel matrix is compressed with a lossy projection matrix derived from the Discriminant Component Analysis (DCA). The multi-kernel stage uses the signal-to-noise ratio (SNR) score of each kernel to non-uniformly combine multiple compressive kernels. The proposed method is evaluated on two mobile-sensing datasets -- MHEALTH and HAR -- where activity recognition is defined as utility and person identification is defined as privacy. The results show that the compression regime is successful in privacy preservation as the privacy classification accuracies are almost at the random-guess level in all experiments. On the other hand, the novel SNR-based multi-kernel shows utility classification accuracy improvement upon the state-of-the-art in both datasets. These results indicate a promising direction for research in privacy-preserving machine learning.
Customized hardware accelerators have been developed to provide improved performance and efficiency for DNN inference and training. However, the existing hardware accelerators may not always be suitable for handling various DNN models as their archit ecture paradigms and configuration tradeoffs are highly application-specific. It is important to benchmark the accelerator candidates in the earliest stage to gather comprehensive performance metrics and locate the potential bottlenecks. Further demands also emerge after benchmarking, which require adequate solutions to address the bottlenecks and improve the current designs for targeted workloads. To achieve these goals, in this paper, we leverage an automation tool called DNNExplorer for benchmarking customized DNN hardware accelerators and exploring novel accelerator designs with improved performance and efficiency. Key features include (1) direct support to popular machine learning frameworks for DNN workload analysis and accurate analytical models for fast accelerator benchmarking; (2) a novel accelerator design paradigm with high-dimensional design space support and fine-grained adjustability to overcome the existing design drawbacks; and (3) a design space exploration (DSE) engine to generate optimized accelerators by considering targeted AI workloads and available hardware resources. Results show that accelerators adopting the proposed novel paradigm can deliver up to 4.2X higher throughput (GOP/s) than the state-of-the-art pipeline design in DNNBuilder and up to 2.0X improved efficiency than the recently published generic design in HybridDNN given the same DNN model and resource budgets. With DNNExplorers benchmarking and exploration features, we can be ahead at building and optimizing customized AI accelerators and enable more efficient AI applications.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا