Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

CAPE: Context-Aware Private Embeddings for Private Language Learning

217 0 0.0 ( 0 )

Download Cite

Added by Richard Plant

Publication date 2021

fields Informatics Engineering

and research's language is English

Authors Richard Plant - Dimitra Gkatzia - Valerio Giuffrida

Computation and Language Machine Learning

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Deep learning-based language models have achieved state-of-the-art results in a number of applications including sentiment analysis, topic labelling, intent classification and others. Obtaining text representations or embeddings using these models presents the possibility of encoding personally identifiable information learned from language and context cues that may present a risk to reputation or privacy. To ameliorate these issues, we propose Context-Aware Private Embeddings (CAPE), a novel approach which preserves privacy during training of embeddings. To maintain the privacy of text representations, CAPE applies calibrated noise through differential privacy, preserving the encoded semantic links while obscuring sensitive information. In addition, CAPE employs an adversarial training regime that obscures identified private variables. Experimental results demonstrate that the proposed approach reduces private information leakage better than either single intervention.

rate research

Shared-Private Bilingual Word Embeddings for Neural Machine Translation

134 - Xuebo Liu , Derek F. Wong , Yang Liu 2019

Word embedding is central to neural machine translation (NMT), which has attracted intensive research interest in recent years. In NMT, the source embedding plays the role of the entrance while the target embedding acts as the terminal. These layers occupy most of the model parameters for representation learning. Furthermore, they indirectly interface via a soft-attention mechanism, which makes them comparatively isolated. In this paper, we propose shared-private bilingual word embeddings, which give a closer relationship between the source and target embeddings, and which also reduce the number of model parameters. For similar source and target words, their embeddings tend to share a part of the features and they cooperatively learn these common representation units. Experiments on 5 language pairs belonging to 6 different language families and written in 5 different alphabets demonstrate that the proposed model provides a significant performance boost over the strong baselines with dramatically fewer model parameters.

Computation and Language Artificial Intelligence

Private learning implies quantum stability

82 - Srinivasan Arunachalam , Yihui Quek , John Smolin 2021

Learning an unknown $n$-qubit quantum state $rho$ is a fundamental challenge in quantum computing. Information-theoretically, it is known that tomography requires exponential in $n$ many copies of $rho$ to estimate it up to trace distance. Motivated by computational learning theory, Aaronson et al. introduced many (weaker) learning models: the PAC model of learning states (Proceedings of Royal Society A07), shadow tomography (STOC18) for learning shadows of a state, a model that also requires learners to be differentially private (STOC19) and the online model of learning states (NeurIPS18). In these models it was shown that an unknown state can be learned approximately using linear-in-$n$ many copies of rho. But is there any relationship between these models? In this paper we prove a sequence of (information-theoretic) implications from differentially-private PAC learning, to communication complexity, to online learning and then to quantum stability. Our main result generalizes the recent work of Bun, Livni and Moran (Journal of the ACM21) who showed that finite Littlestone dimension (of Boolean-valued concept classes) implies PAC learnability in the (approximate) differentially private (DP) setting. We first consider their work in the real-valued setting and further extend their techniques to the setting of learning quantum states. Key to our results is our generic quantum online learner, Robust Standard Optimal Algorithm (RSOA), which is robust to adversarial imprecision. We then show information-theoretic implications between DP learning quantum states in the PAC model, learnability of quantum states in the one-way communication model, online learning of quantum states, quantum stability (which is our conceptual contribution), various combinatorial parameters and give further applications to gentle shadow tomography and noisy quantum state learning.

Quantum Physics Machine Learning

Differentially Private Model Publishing for Deep Learning

336 - Lei Yu , Ling Liu , Calton Pu 2019

Deep learning techniques based on neural networks have shown significant success in a wide range of AI tasks. Large-scale training datasets are one of the critical factors for their success. However, when the training datasets are crowdsourced from individuals and contain sensitive information, the model parameters may encode private information and bear the risks of privacy leakage. The recent growing trend of the sharing and publishing of pre-trained models further aggravates such privacy risks. To tackle this problem, we propose a differentially private approach for training neural networks. Our approach includes several new techniques for optimizing both privacy loss and model accuracy. We employ a generalization of differential privacy called concentrated differential privacy(CDP), with both a formal and refined privacy loss analysis on two different data batching methods. We implement a dynamic privacy budget allocator over the course of training to improve model accuracy. Extensive experiments demonstrate that our approach effectively improves privacy loss accounting, training efficiency and model quality under a given privacy budget.

Cryptography and Security Machine Learning

Supervised Contextual Embeddings for Transfer Learning in Natural Language Processing Tasks

77 - Mihir Kale , Aditya Siddhant , Sreyashi Nag 2019

Pre-trained word embeddings are the primary method for transfer learning in several Natural Language Processing (NLP) tasks. Recent works have focused on using unsupervised techniques such as language modeling to obtain these embeddings. In contrast, this work focuses on extracting representations from multiple pre-trained supervised models, which enriches word embeddings with task and domain specific knowledge. Experiments performed in cross-task, cross-domain and cross-lingual settings indicate that such supervised embeddings are helpful, especially in the low-resource setting, but the extent of gains is dependent on the nature of the task and domain. We make our code publicly available.

Computation and Language Machine Learning

Private Deep Learning with Teacher Ensembles

107 - Lichao Sun , Yingbo Zhou , Ji Wang 2019

Privacy-preserving deep learning is crucial for deploying deep neural network based solutions, especially when the model works on data that contains sensitive information. Most privacy-preserving methods lead to undesirable performance degradation. Ensemble learning is an effective way to improve model performance. In this work, we propose a new method for teacher ensembles that uses more informative network outputs under differential private stochastic gradient descent and provide provable privacy guarantees. Out method employs knowledge distillation and hint learning on intermediate representations to facilitate the training of student model. Additionally, we propose a simple weighted ensemble scheme that works more robustly across different teaching settings. Experimental results on three common image datasets benchmark (i.e., CIFAR10, MINST, and SVHN) demonstrate that our approach outperforms previous state-of-the-art methods on both performance and privacy-budget.

Cryptography and Security Machine Learning

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

CAPE: Context-Aware Private Embeddings for Private Language Learning

Ask ChatGPT about the research

No Arabic abstract

Read More

suggested questions