Data auditing is a process that verifies whether certain data have been removed from a trained model. A recently proposed method (Liu et al., 2020) uses the Kolmogorov-Smirnov (KS) distance for such data auditing. However, it fails under certain practical conditions. In this paper, we propose a new method, called Ensembled Membership Auditing (EMA), for auditing data removal that overcomes these limitations. We compare both methods using benchmark datasets (MNIST and SVHN) and Chest X-ray datasets with multi-layer perceptrons (MLPs) and convolutional neural networks (CNNs). Our experiments show that EMA is robust under various conditions, including the failure cases of the previously proposed method. Our code is available at: https://github.com/Hazelsuko07/EMA.
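The KS-distance audit described above can be sketched in a few lines. This is an illustrative simplification, not the authors' implementation: it compares the model's confidence scores on the queried data against confidences on known non-member data, and treats a small KS distance as evidence consistent with the data having been removed. The function names and the threshold value are assumptions for illustration.

```python
import numpy as np

def ks_distance(a, b):
    """Two-sample KS statistic: the largest gap between empirical CDFs."""
    a, b = np.sort(a), np.sort(b)
    all_vals = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, all_vals, side="right") / len(a)
    cdf_b = np.searchsorted(b, all_vals, side="right") / len(b)
    return np.max(np.abs(cdf_a - cdf_b))

def audit_removed(conf_query, conf_nonmember, threshold=0.2):
    # A small KS distance means the query confidences are distributed
    # like non-member confidences, i.e. consistent with removal.
    # The threshold here is a hypothetical choice, not from the paper.
    return ks_distance(conf_query, conf_nonmember) < threshold
```

In practice, a model tends to be over-confident on data it was trained on, so confidences on still-memorized data would separate sharply from the non-member distribution and yield a large KS distance.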
Missing value imputation is a challenging and well-researched topic in data mining. In this paper, we propose IFGAN, a missing value imputation algorithm based on Feature-specific Generative Adversarial Networks (GANs). Our idea is intuitive yet effective: a feature-specific generator is trained to impute missing values, while a discriminator is expected to distinguish the imputed values from observed ones. The proposed architecture is capable of handling different data types, data distributions, missing mechanisms, and missing rates. It also improves post-imputation analysis by preserving inter-feature correlations. We empirically show on several real-life datasets that IFGAN outperforms current state-of-the-art algorithms under various missing conditions.
An unsolved challenge in distributed or federated learning is to effectively mitigate privacy risks without slowing down training or reducing accuracy. In this paper, we propose TextHide, which aims to address this challenge for natural language understanding tasks. It requires all participants to add a simple encryption step to prevent an eavesdropping attacker from recovering private text data. Such an encryption step is efficient and only affects the task performance slightly. In addition, TextHide fits well with the popular framework of fine-tuning pre-trained language models (e.g., BERT) for any sentence or sentence-pair task. We evaluate TextHide on the GLUE benchmark, and our experiments show that TextHide can effectively defend against attacks on shared gradients or representations, with an average accuracy reduction of only 1.9%. We also present an analysis of the security of TextHide using a conjecture about the computational intractability of a mathematical problem. Our code is available at https://github.com/Hazelsuko07/TextHide.
How can multiple distributed entities collaboratively train a shared deep net on their private data while preserving privacy? This paper introduces InstaHide, a simple encryption of training images, which can be plugged into existing distributed deep learning pipelines. The encryption is efficient, and applying it during training has only a minor effect on test accuracy. InstaHide encrypts each training image with a one-time secret key, which consists of mixing a number of randomly chosen images and applying a random pixel-wise mask. Other contributions of this paper include: (a) using a large public dataset (e.g., ImageNet) for mixing during encryption, which improves security; (b) experimental results showing effectiveness in preserving privacy against known attacks with only minor effects on accuracy; (c) theoretical analysis showing that successfully attacking privacy requires attackers to solve a difficult computational problem; (d) demonstrating that the pixel-wise mask is important for security, since Mixup alone is shown to be vulnerable to some efficient attacks; (e) release of a challenge dataset at https://github.com/Hazelsuko07/InstaHide_Challenge. Our code is available at https://github.com/Hazelsuko07/InstaHide.
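The encryption step described above (mix a private image with randomly chosen images, then apply a random pixel-wise sign mask) can be sketched as follows. This is an illustrative sketch rather than the authors' released code; the function name, the Dirichlet choice of mixing coefficients, and the use of a ±1 sign mask are assumptions made for the example.

```python
import numpy as np

def instahide_encrypt(private_img, public_pool, k=4, rng=None):
    """Encrypt one image: mix it with k-1 images from a public pool,
    then flip pixel signs with a fresh one-time random mask."""
    rng = rng or np.random.default_rng()
    # Pick k-1 images (e.g., from a large public dataset) to mix in.
    others = public_pool[rng.choice(len(public_pool), k - 1, replace=False)]
    coeffs = rng.dirichlet(np.ones(k))  # random convex combination weights
    mixed = coeffs[0] * private_img + np.tensordot(coeffs[1:], others, axes=1)
    mask = rng.choice([-1.0, 1.0], size=private_img.shape)  # one-time key
    return mask * mixed
```

Because the mask and mixing partners are drawn fresh for every image and every epoch, an eavesdropper never sees the same "key" twice, which is what the paper's hardness analysis builds on.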
This paper attempts to answer the question of whether neural network pruning can be used as a tool to achieve differential privacy without losing much data utility. As a first step towards understanding the relationship between neural network pruning and differential privacy, this paper proves that pruning a given layer of the neural network is equivalent to adding a certain amount of differentially private noise to its hidden-layer activations. The paper also presents experimental results to show the practical implications of the theoretical finding and the key parameter values in a simple practical setting. These results show that neural network pruning can be a more effective alternative to adding differentially private noise for neural networks.
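The pruning-as-noise view above can be made concrete with a toy decomposition: zeroing the smallest-magnitude activations of a layer is the same as adding a data-dependent perturbation to them. This sketch is purely illustrative (the paper's actual equivalence result concerns differentially private noise; the helper below just exhibits the decomposition):

```python
import numpy as np

def prune_activations(h, keep_ratio=0.5):
    """Zero all but the largest-magnitude activations; return the pruned
    vector and the implicit perturbation such that pruned = h + noise."""
    k = int(len(h) * keep_ratio)
    drop = np.argsort(np.abs(h))[:-k] if k else np.arange(len(h))
    pruned = h.copy()
    pruned[drop] = 0.0
    return pruned, pruned - h  # the "noise" that pruning injected
```
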
The next great leap toward improving treatment of cancer with radiation will require the combined use of online adaptive and magnetic resonance guided radiation therapy techniques with automatic X-ray beam orientation selection. Unfortunately, by uniting these advancements, we are met with a substantial expansion in the required dose information and a consequential increase in the overall computational time imposed during radiation treatment planning, which cannot be handled by existing techniques for accelerating Monte Carlo dose calculation. We propose a deep convolutional neural network approach that unlocks new levels of acceleration and accuracy with regard to post-processed Monte Carlo dose results by relying on data-driven learned representations of low-level beamlet dose distributions instead of more limited filter-based denoising techniques that only utilize the information in a single dose input. Our method uses parallel UNET branches acting on three input channels before mixing latent understanding to produce noise-free dose predictions. Our model achieves a normalized mean absolute error of only 0.106% compared with the ground truth dose, in contrast to the 25.7% error of the undersampled MC dose fed into the network at prediction time. Our model's per-beamlet prediction time is ~220 ms, including Monte Carlo simulation and network prediction, with substantial additional acceleration expected from batched processing and combination with existing Monte Carlo acceleration techniques. Our method shows promise toward enabling clinical practice of advanced treatment technologies.
Segmentation of the pancreas is important for medical image analysis, yet it faces great challenges of class imbalance, background distractions, and non-rigid geometrical features. To address these difficulties, we introduce a Deep Q Network (DQN) driven approach with a deformable U-Net to accurately segment the pancreas by explicitly interacting with contextual information and extracting anisotropic features from the pancreas. The DQN-based model learns a context-adaptive localization policy to produce a visually tightened and precise localization bounding box of the pancreas. Furthermore, the deformable U-Net captures geometry-aware information of the pancreas by learning geometrically deformable filters for feature extraction. Experiments on the NIH dataset validate the effectiveness of the proposed framework in pancreas segmentation.