Learning an unknown $n$-qubit quantum state $\rho$ is a fundamental challenge in quantum computing. Information-theoretically, it is known that tomography requires exponentially many (in $n$) copies of $\rho$ to estimate it up to trace distance. Motivated by computational learning theory, Aaronson et al. introduced several (weaker) learning models: the PAC model of learning states (Proceedings of the Royal Society A '07), shadow tomography (STOC '18) for learning shadows of a state, a model that additionally requires learners to be differentially private (STOC '19), and the online model of learning states (NeurIPS '18). In these models it was shown that an unknown state can be learned approximately using only linearly many (in $n$) copies of $\rho$. But is there any relationship between these models? In this paper we prove a sequence of (information-theoretic) implications from differentially private PAC learning, to communication complexity, to online learning, and then to quantum stability. Our main result generalizes the recent work of Bun, Livni and Moran (Journal of the ACM '21), who showed that finite Littlestone dimension (of Boolean-valued concept classes) implies PAC learnability in the (approximate) differentially private (DP) setting. We first consider their work in the real-valued setting and further extend their techniques to the setting of learning quantum states. Key to our results is our generic quantum online learner, the Robust Standard Optimal Algorithm (RSOA), which is robust to adversarial imprecision. We then show information-theoretic implications between DP learning of quantum states in the PAC model, learnability of quantum states in the one-way communication model, online learning of quantum states, quantum stability (our conceptual contribution), and various combinatorial parameters, and give further applications to gentle shadow tomography and noisy quantum state learning.
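For reference (standard facts, not spelled out in the abstract): the trace distance is $D_{\mathrm{tr}}(\rho,\sigma) = \frac{1}{2}\|\rho-\sigma\|_1$, and estimating an arbitrary $n$-qubit state up to trace distance $\varepsilon$ via full tomography is known to require $\Theta(4^n/\varepsilon^2)$ copies (up to logarithmic factors), whereas the weaker models above learn an approximation of the state from only $O(n)$ copies.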
Differentially private (DP) learning, which aims to accurately extract patterns from a given dataset without exposing individual information, is an important subfield of machine learning and has been extensively explored. However, quantum algorithms that preserve privacy while outperforming their classical counterparts are still lacking. The difficulty arises from the distinct priorities of DP and quantum machine learning: the former is concerned with a low utility bound, while the latter pursues a low runtime cost. These differing goals require that a proposed quantum DP algorithm achieve a runtime speedup over the best known classical results while preserving a near-optimal utility bound. The Lasso estimator is broadly employed to tackle high-dimensional sparse linear regression tasks. The main contribution of this paper is a quantum DP Lasso estimator that attains a runtime speedup with privacy preservation: the runtime complexity is $\tilde{O}(N^{3/2}\sqrt{d})$ with a nearly optimal utility bound $\tilde{O}(1/N^{2/3})$, where $N$ is the sample size and $d$ is the data dimension with $N \ll d$. Since the optimal classical (private) Lasso takes $\Omega(N+d)$ runtime, our proposal achieves a quantum speedup when $N < O(d^{1/3})$. There are two key components in our algorithm. First, we extend the Frank-Wolfe algorithm from the classical Lasso to the quantum scenario, where the proposed quantum non-private Lasso achieves a quadratic runtime speedup over the optimal classical Lasso. Second, we develop an adaptive privacy mechanism to ensure the privacy guarantee of the non-private Lasso. Our proposal opens an avenue to designing learning algorithms with both provable runtime speedups and privacy preservation.
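As background for the first component (an illustrative sketch, not code from the paper; all names and parameters below are ours): the classical Frank-Wolfe iteration for the Lasso constraint set, i.e. minimizing the least-squares loss over the $\ell_1$ ball of radius $\lambda$, which the quantum and private variants described above build on.

import numpy as np

def frank_wolfe_lasso(X, y, lam, n_iters=100):
    # Classical (non-private) Frank-Wolfe for: min_w 0.5*||Xw - y||^2  s.t.  ||w||_1 <= lam.
    # Illustrative only: the paper's variants replace the argmax step below with a
    # quantum subroutine and add a noise-based privacy mechanism.
    N, d = X.shape
    w = np.zeros(d)
    for t in range(n_iters):
        grad = X.T @ (X @ w - y)           # gradient of the least-squares loss
        i = int(np.argmax(np.abs(grad)))   # coordinate with largest gradient magnitude
        s = np.zeros(d)
        s[i] = -lam * np.sign(grad[i])     # best vertex of the l1 ball: +/- lam * e_i
        gamma = 2.0 / (t + 2.0)            # standard Frank-Wolfe step size
        w = (1.0 - gamma) * w + gamma * s  # convex combination keeps w feasible
    return w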
We design differentially private learning algorithms that are agnostic to the learning model. Our algorithms are interactive in nature: instead of outputting a model based on the training data, they provide predictions for a set of $m$ feature vectors that arrive online. We show that, for the feature vectors on which an ensemble of models (trained on random disjoint subsets of a dataset) makes consistent predictions, there is almost no cost of privacy in generating accurate predictions for those feature vectors. To that end, we provide a novel coupling of the distance-to-instability framework with the sparse vector technique. We give algorithms with formal privacy and utility guarantees for both binary/multi-class classification and soft-label classification. For binary classification in the standard (agnostic) PAC model, we show how to bootstrap from our privately generated predictions to construct a computationally efficient private learner that outputs a final accurate hypothesis. To the best of our knowledge, our construction is the first computationally efficient construction of a label-private learner. We prove sample complexity upper bounds for this setting; as in non-private sample complexity bounds, the only relevant property of the given concept class is its VC dimension. For soft-label classification, our techniques are based on exploiting the stability properties of traditional learning algorithms, such as stochastic gradient descent (SGD). We provide a new technique to boost the average-case stability properties of learning algorithms to strong (worst-case) stability properties, and then exploit them to obtain private classification algorithms. In the process, we also show that a large class of SGD methods satisfy average-case stability properties, in contrast to the smaller class of SGD methods that are uniformly stable, as shown in prior work.
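A minimal sketch of the core mechanism (our illustration under assumed details, not the paper's exact algorithm): train an ensemble on disjoint splits of the private data and, for each query arriving online, release the majority label only when a noisy agreement margin clears a noisy threshold, in the spirit of coupling distance-to-instability with the sparse vector technique; the noise scales and the abstain behaviour below are illustrative.

import numpy as np

def private_ensemble_predict(models, queries, threshold, eps, seed=0):
    # Illustrative sketch: ensemble agreement + noisy-threshold (SVT-style) release.
    # models  : classifiers trained on disjoint subsets of the private data
    # queries : feature vectors arriving online
    # Returns a label per query, or None when the ensemble is too unstable to answer.
    rng = np.random.default_rng(seed)
    noisy_threshold = threshold + rng.laplace(scale=2.0 / eps)
    answers = []
    for x in queries:
        votes = np.array([m.predict([x])[0] for m in models])
        labels, counts = np.unique(votes, return_counts=True)
        top = counts.max()
        margin = top - (counts.sum() - top)          # agreement margin ~ distance to instability
        if margin + rng.laplace(scale=4.0 / eps) >= noisy_threshold:
            answers.append(labels[counts.argmax()])  # stable query: release majority almost for free
        else:
            answers.append(None)                     # unstable query: abstain (or fall back)
    return answers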
In Private Broadcasting, a single plaintext is broadcast to multiple recipients in encrypted form, such that each recipient can decrypt locally. When the message is classical, a straightforward solution is to encrypt the plaintext with a single key shared among all parties and to send each recipient a copy of the ciphertext. Surprisingly, the analogous method is insufficient when the message is quantum (i.e. in Quantum Private Broadcasting (QPB)). In this work, we give three solutions to $t$-recipient Quantum Private Broadcasting ($t$-QPB) and compare them in terms of key length. The first method is independent encryption with the quantum one-time pad, which requires a key linear in the number of recipients, $t$. We show that the key length can be decreased to logarithmic in $t$ by using unitary $t$-designs. Our main contribution is to show that this can be further improved to a key length logarithmic in the dimension of the symmetric subspace, using a new concept that we define, symmetric unitary $t$-designs, which may be of independent interest.
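For context (standard background, not a result of the paper): the quantum one-time pad encrypts a single qubit with two uniformly random classical bits $a,b$ via $\rho \mapsto X^a Z^b \rho\, Z^b X^a$, and averaging over the key gives $\frac{1}{4}\sum_{a,b\in\{0,1\}} X^a Z^b \rho\, Z^b X^a = I/2$ for every $\rho$, so the ciphertext carries no information about the plaintext. Encrypting each recipient's register independently therefore uses $2t$ key bits per plaintext qubit, which is the linear-in-$t$ baseline that the $t$-design constructions above improve on.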
The study of properties of randomly chosen quantum states has in recent years led to many insights into quantum entanglement. In this work, we study private quantum states from this point of view. Private quantum states are bipartite quantum states characterised by the property that carrying out simple local measurements yields a secret bit. This feature is shared by the maximally entangled pair of quantum bits, yet private quantum states are more general and can in their most extreme form be almost bound entangled. In this work, we study the entanglement properties of random private quantum states and show that they are hardly distinguishable from separable states and thus have low repeatable key, despite containing one bit of key. The technical tools we develop are centred around the concept of locally restricted measurements and include a new operator ordering, bounds on norms under tensoring with entangled states and a continuity bound for a relative entropy measure.
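For reference (the standard form of private states, stated here as background rather than taken from the abstract): a private state on systems $ABA'B'$ can be written as a "twisted" maximally entangled state $\gamma_{ABA'B'} = U_\tau\,(\Phi^+_{AB} \otimes \sigma_{A'B'})\,U_\tau^\dagger$, where $\Phi^+_{AB}$ is the maximally entangled state, $\sigma_{A'B'}$ is an arbitrary "shield" state, and $U_\tau = \sum_{i,j} |ij\rangle\langle ij|_{AB} \otimes U_{ij}$ is a controlled ("twisting") unitary; measuring $A$ and $B$ in the computational basis then yields one perfectly correlated, secret bit regardless of the shield.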
Deep learning-based language models have achieved state-of-the-art results in a number of applications, including sentiment analysis, topic labelling, intent classification and others. Obtaining text representations or embeddings from these models risks encoding personally identifiable information learned from language and context cues, which may pose a threat to reputation or privacy. To ameliorate these issues, we propose Context-Aware Private Embeddings (CAPE), a novel approach that preserves privacy during training of embeddings. To maintain the privacy of text representations, CAPE applies calibrated noise through differential privacy, preserving the encoded semantic links while obscuring sensitive information. In addition, CAPE employs an adversarial training regime that obscures identified private variables. Experimental results demonstrate that the proposed approach reduces private information leakage more effectively than either intervention alone.
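A minimal sketch of the differential-privacy component only (our illustration; the parameter names are not CAPE's API and the adversarial-training branch is omitted): clip each text embedding to a fixed norm so its sensitivity is bounded, then add Gaussian noise calibrated to that bound.

import numpy as np

def privatize_embedding(emb, clip_norm=1.0, eps=1.0, delta=1e-5, seed=0):
    # Illustrative sketch of calibrated-noise privatization of a text embedding.
    # Clipping bounds how much one input can change the vector (L2 sensitivity <= 2*clip_norm);
    # Gaussian noise with the standard (eps, delta) calibration then obscures that change.
    rng = np.random.default_rng(seed)
    emb = np.asarray(emb, dtype=float)
    norm = np.linalg.norm(emb)
    if norm > clip_norm:
        emb = emb * (clip_norm / norm)                    # clip to the fixed norm ball
    sensitivity = 2.0 * clip_norm                         # worst-case change when swapping the input
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return emb + rng.normal(scale=sigma, size=emb.shape)  # release the noised representation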