Oblivious inference enables the cloud to provide neural network inference-as-a-service (NN-IaaS) without disclosing the client's data or revealing the server's model. However, the privacy guarantee of oblivious inference usually comes at a heavy cost in efficiency and accuracy. We propose Popcorn, a concise oblivious inference framework built entirely on the Paillier homomorphic encryption scheme. We design a suite of novel protocols to compute non-linear activation and max-pooling layers. We leverage neural network compression techniques (i.e., neural weight pruning and quantization) to accelerate the inference computation. To implement the Popcorn framework, we only need to replace the algebraic operations of existing networks with their corresponding Paillier homomorphic operations, which makes engineering development straightforward. We first conduct a performance evaluation and comparison on the MNIST and CIFAR-10 classification tasks. Compared with existing solutions, Popcorn achieves a significant reduction in communication overhead, at the cost of a moderate increase in runtime. We then benchmark the performance of oblivious inference on ImageNet. To the best of our knowledge, this is the first such report on a commercial-scale dataset, taking a step towards deployment in production.
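To make the idea of "replacing algebraic operations with Paillier homomorphic operations" concrete, the following is a minimal sketch of evaluating a plaintext linear layer on encrypted inputs with the python-paillier (phe) library. The key size, weights, and input vector are illustrative placeholders, and Popcorn's activation and max-pooling protocols are not shown.

```python
# Minimal sketch: a plaintext linear layer evaluated on Paillier-encrypted
# inputs via the python-paillier (`phe`) library. Sizes and values are
# illustrative only; this is not the full Popcorn protocol.
import numpy as np
from phe import paillier

# Client side: generate keys and encrypt the input vector.
public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)
x = np.array([0.5, -1.2, 3.0])                      # client input (plaintext)
enc_x = [public_key.encrypt(float(v)) for v in x]   # ciphertexts sent to the server

# Server side: plaintext weights and bias, applied to encrypted inputs.
# Paillier is additively homomorphic, so ciphertext + ciphertext and
# ciphertext * plaintext-scalar suffice for a fully connected layer.
W = np.array([[0.1, 0.2, -0.3],
              [1.0, -0.5, 0.25]])
b = np.array([0.05, -0.1])
enc_y = []
for i in range(W.shape[0]):
    acc = public_key.encrypt(float(b[i]))
    for j in range(W.shape[1]):
        acc = acc + enc_x[j] * float(W[i, j])       # homomorphic multiply-add
    enc_y.append(acc)

# Client side: decrypt the layer output.
y = np.array([private_key.decrypt(c) for c in enc_y])
print(y)  # matches W @ x + b up to the fixed-point encoding error
```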
In this work we present a new framework for neural network compression with fine-tuning, which we call the Neural Network Compression Framework (NNCF). It leverages recent advances in various network compression methods and implements some of them, su
We present an efficient fine-tuning methodology for neural-network filters that are applied as a post-processing artifact-removal step in video coding pipelines. The fine-tuning is performed at the encoder side to adapt the neural network to the specific
A Hybrid Privacy-Preserving Neural Network (HPPNN), which implements linear layers with Homomorphic Encryption (HE) and nonlinear layers with Garbled Circuits (GC), is one of the most promising secure solutions for emerging Machine Learning as a Service (MLaaS). U
Compressing Deep Neural Network (DNN) models to alleviate storage and computation requirements is essential for practical applications, especially on resource-limited devices. Although capable of reducing a reasonable amount of model parameters,
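As a hedged illustration of the magnitude-based weight pruning with fine-tuning that this line of work builds on, here is a small PyTorch sketch using torch.nn.utils.prune. The model, sparsity target, and training data are placeholders, not the method of any specific paper listed here.

```python
# Minimal sketch: magnitude-based weight pruning followed by brief fine-tuning.
# The architecture, 80% sparsity target, and dummy data are placeholders.
import torch
import torch.nn.utils.prune as prune

model = torch.nn.Sequential(
    torch.nn.Linear(784, 256), torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
)

# Zero out the 80% smallest-magnitude weights in every Linear layer.
for module in model.modules():
    if isinstance(module, torch.nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.8)

# Short fine-tuning loop to recover accuracy (random tensors stand in for data).
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = torch.nn.CrossEntropyLoss()
for _ in range(100):
    inputs = torch.randn(32, 784)
    targets = torch.randint(0, 10, (32,))
    optimizer.zero_grad()
    loss_fn(model(inputs), targets).backward()
    optimizer.step()

# Fold the pruning masks into the weight tensors permanently.
for module in model.modules():
    if isinstance(module, torch.nn.Linear):
        prune.remove(module, "weight")
```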
Research has shown that deep neural networks contain significant redundancy, and thus that high classification accuracy can be achieved even when weights and activations are quantized down to binary values. Network binarization on FPGAs greatly incre
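To make the binarization idea concrete, below is a generic PyTorch sketch of a linear layer whose weights are binarized to {-1, +1} in the forward pass, trained with a straight-through estimator. It is an illustrative example only, not the FPGA implementation the abstract refers to.

```python
# Minimal sketch: weight binarization with a straight-through estimator (STE).
# Generic illustration; not tied to any specific binarized-network toolflow.
import torch

class BinarizeSTE(torch.autograd.Function):
    """Forward: map weights to {-1, +1}. Backward: pass gradients where |x| <= 1."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return grad_output * (x.abs() <= 1).to(grad_output.dtype)

class BinaryLinear(torch.nn.Linear):
    """A Linear layer whose weights are binarized on the fly during forward."""
    def forward(self, input):
        w_bin = BinarizeSTE.apply(self.weight)
        return torch.nn.functional.linear(input, w_bin, self.bias)

layer = BinaryLinear(16, 4)
out = layer(torch.randn(8, 16))
out.sum().backward()  # gradients reach the real-valued weights via the STE
```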