We provide an end-to-end differentially private spectral algorithm for learning LDA, based on matrix/tensor decompositions, and establish theoretical guarantees on the utility/consistency of the estimated model parameters. The spectral algorithm consists of multiple algorithmic steps, termed \emph{edges}, into which noise can be injected to obtain differential privacy. We identify \emph{subsets of edges}, termed \emph{configurations}, such that adding noise to all edges in such a subset guarantees differential privacy of the end-to-end spectral algorithm. We characterize the sensitivity of the edges with respect to the input and thus estimate the amount of noise that must be added to each edge for any required privacy level. We then characterize the utility loss for each configuration as a function of the injected noise. By combining the sensitivity and utility characterizations, we obtain an end-to-end differentially private spectral algorithm for LDA and identify the configuration that outperforms the others in each specific regime. We are the first to achieve utility guarantees under a required level of differential privacy for learning LDA. Overall, our method systematically outperforms differentially private variational inference.
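To make the noise-injection idea concrete, here is a minimal sketch of privatizing one "edge" of a spectral pipeline, assuming a Gaussian mechanism applied to the empirical second-moment matrix. The function names, the crude sensitivity bound, and all parameters are illustrative assumptions, not the paper's actual algorithm or configurations.

```python
import numpy as np

def gaussian_mechanism(stat, sensitivity, epsilon, delta, rng):
    """Add Gaussian noise calibrated to the L2 sensitivity of `stat`
    for (epsilon, delta)-differential privacy (standard analytic bound)."""
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return stat + rng.normal(0.0, sigma, size=stat.shape)

def private_second_moment(doc_word_freqs, epsilon, delta, seed=0):
    """Noise one 'edge': the empirical second-moment matrix M2 = E[x x^T]
    over documents, then eigendecompose the private copy.
    doc_word_freqs: (n_docs, vocab) rows of per-document word frequencies."""
    rng = np.random.default_rng(seed)
    n, _ = doc_word_freqs.shape
    m2 = doc_word_freqs.T @ doc_word_freqs / n
    # Replacing one document changes M2 by at most ~2/n in Frobenius norm
    # when rows lie on the probability simplex (a crude bound for this sketch).
    sensitivity = 2.0 / n
    m2_priv = gaussian_mechanism(m2, sensitivity, epsilon, delta, rng)
    m2_priv = (m2_priv + m2_priv.T) / 2.0  # re-symmetrize after noising
    return np.linalg.eigh(m2_priv)
```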
Assume we have potential causes $z \in Z$, which produce events $w$ with known probabilities $\beta(w|z)$. We observe $w_1, w_2, \ldots, w_n$; what can we say about the distribution of the causes? A Bayesian estimate will assume a prior on distributions on $Z$ …
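Since this abstract is truncated, here is a minimal sketch of the setup it describes: a MAP/EM estimate of the cause distribution under a Dirichlet prior, given known likelihoods $\beta(w|z)$. The `alpha` prior, the iteration count, and all names are illustrative assumptions, not the paper's estimator.

```python
import numpy as np

def estimate_cause_distribution(beta, obs, alpha=1.0, n_iters=200):
    """MAP estimate of theta(z) = P(z) under a Dirichlet(alpha) prior
    (alpha >= 1), given known likelihoods beta[z, w] = P(w|z) and
    observed event indices `obs`. Assumes every event has nonzero
    probability under at least one cause."""
    n_causes, n_events = beta.shape
    counts = np.bincount(obs, minlength=n_events).astype(float)
    theta = np.full(n_causes, 1.0 / n_causes)
    for _ in range(n_iters):
        # E-step: responsibility of each cause z for each event type w
        joint = theta[:, None] * beta                 # P(z) * P(w|z)
        resp = joint / joint.sum(axis=0, keepdims=True)
        # M-step: expected counts per cause plus Dirichlet pseudocounts
        expected = resp @ counts + (alpha - 1.0)
        theta = expected / expected.sum()
    return theta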
Large data collections required for the training of neural networks often contain sensitive information such as the medical histories of patients, and the privacy of the training data must be preserved. In this paper, we introduce a dropout technique …
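For reference, a minimal sketch of standard inverted dropout, which this abstract builds on; the paper's privacy-oriented variant is not described in the truncated text and is not reproduced here.

```python
import numpy as np

def dropout(activations, p_drop, rng, train=True):
    """Standard inverted binary dropout: each unit is zeroed with
    probability p_drop during training, and survivors are rescaled
    so the expected activation is unchanged at test time."""
    if not train or p_drop == 0.0:
        return activations
    mask = rng.random(activations.shape) >= p_drop
    return activations * mask / (1.0 - p_drop)
```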
A major challenge for machine learning is increasing the availability of data while respecting the privacy of individuals. Here we combine the provable privacy guarantees of the differential privacy framework with the flexibility of Gaussian processes …
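One generic way to combine the two ingredients this abstract names is output perturbation of a Gaussian process posterior mean. The sketch below assumes clipped targets and a Laplace mechanism for a single test point; it is not necessarily the paper's mechanism, and answering many test points would additionally require composition of the privacy budget.

```python
import numpy as np

def dp_gp_predict(X, y, x_star, epsilon, lengthscale=1.0,
                  noise_var=0.1, y_bound=1.0, seed=0):
    """GP posterior-mean prediction at one test point, privatized by
    Laplace output perturbation, assuming targets are clipped to
    [-y_bound, y_bound]."""
    rng = np.random.default_rng(seed)

    def rbf(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / lengthscale**2)

    K = rbf(X, X) + noise_var * np.eye(len(X))
    k_star = rbf(x_star[None, :], X)[0]   # covariances between x_star and X
    w = np.linalg.solve(K, k_star)        # prediction weights K^{-1} k_star
    mean = w @ np.clip(y, -y_bound, y_bound)
    # Changing one clipped target moves the mean by at most 2*y_bound*|w_i|.
    sensitivity = 2.0 * y_bound * np.abs(w).max()
    return mean + rng.laplace(0.0, sensitivity / epsilon)
```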
Deep neural networks, with their large number of parameters, are highly flexible learning systems. This high flexibility brings with it some serious problems, such as overfitting, and regularization is used to address this problem. A current …
A typical audio signal processing pipeline includes multiple disjoint analysis stages, including calculation of a time-frequency representation followed by spectrogram-based feature analysis. We show how time-frequency analysis and nonnegative matrix factorization …
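The disjoint pipeline described in the first sentence can be sketched as follows, assuming an STFT magnitude spectrogram followed by scikit-learn's NMF; all parameter choices are illustrative, and this is the baseline the abstract contrasts against, not the paper's unified method.

```python
import numpy as np
from scipy.signal import stft
from sklearn.decomposition import NMF

def spectrogram_nmf(audio, sr, n_components=8):
    """Conventional two-stage analysis: (1) a time-frequency
    representation via STFT, (2) NMF on the magnitude spectrogram."""
    _, _, Z = stft(audio, fs=sr, nperseg=1024)
    S = np.abs(Z)                      # nonnegative magnitude spectrogram
    model = NMF(n_components=n_components, init="nndsvd", max_iter=400)
    W = model.fit_transform(S)         # spectral basis (freq x components)
    H = model.components_              # activations (components x frames)
    return W, H
```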