Do you want to publish a course? Click here

Overcoming the Curse of Dimensionality in Density Estimation with Mixed Sobolev GANs

118   0   0.0 ( 0 )
 Added by Shahin Shahrampour
 Publication date 2020
and research's language is English




Ask ChatGPT about the research

We propose a novel GAN framework for non-parametric density estimation with high-dimensional data. This framework is based on a novel density estimator, called the hyperbolic cross density estimator, which enjoys nice convergence properties in the mixed Sobolev spaces. As modifications of the usual Sobolev spaces, the mixed Sobolev spaces are more suitable for describing high-dimensional density functions. We prove that, unlike other existing approaches, the proposed GAN framework does not suffer the curse of dimensionality and can achieve the optimal convergence rate of $O_p(n^{-1/2})$, with $n$ data points in an arbitrary fixed dimension. We also study the universality of GANs in terms of the existence of ReLU networks which can approximate the density functions in the mixed Sobolev spaces up to any accuracy level.



rate research

Read More

Adversarial training is a popular method to give neural nets robustness against adversarial perturbations. In practice adversarial training leads to low robust training loss. However, a rigorous explanation for why this happens under natural conditions is still missing. Recently a convergence theory for standard (non-adversarial) supervised training was developed by various groups for {em very overparametrized} nets. It is unclear how to extend these results to adversarial training because of the min-max objective. Recently, a first step towards this direction was made by Gao et al. using tools from online learning, but they require the width of the net to be emph{exponential} in input dimension $d$, and with an unnatural activation function. Our work proves convergence to low robust training loss for emph{polynomial} width instead of exponential, under natural assumptions and with the ReLU activation. Key element of our proof is showing that ReLU networks near initialization can approximate the step function, which may be of independent interest.
We undertake a precise study of the non-asymptotic properties of vanilla generative adversarial networks (GANs) and derive theoretical guarantees in the problem of estimating an unknown $d$-dimensional density $p^*$ under a proper choice of the class of generators and discriminators. We prove that the resulting density estimate converges to $p^*$ in terms of Jensen-Shannon (JS) divergence at the rate $(log n/n)^{2beta/(2beta+d)}$ where $n$ is the sample size and $beta$ determines the smoothness of $p^*.$ This is the first result in the literature on density estimation using vanilla GANs with JS rates faster than $n^{-1/2}$ in the regime $beta>d/2.$
We study minimax density estimation on the product space $mathbb{R}^{d_1}timesmathbb{R}^{d_2}$. We consider $L^p$-risk for probability density functions defined over regularity spaces that allow for different level of smoothness in each of the variables. Precisely, we study probabilities on Sobolev spaces with dominating mixed-smoothness. We provide the rate of convergence that is optimal even for the classical Sobolev spaces.
In this paper, we construct neural networks with ReLU, sine and $2^x$ as activation functions. For general continuous $f$ defined on $[0,1]^d$ with continuity modulus $omega_f(cdot)$, we construct ReLU-sine-$2^x$ networks that enjoy an approximation rate $mathcal{O}(omega_f(sqrt{d})cdot2^{-M}+omega_{f}left(frac{sqrt{d}}{N}right))$, where $M,Nin mathbb{N}^{+}$ denote the hyperparameters related to widths of the networks. As a consequence, we can construct ReLU-sine-$2^x$ network with the depth $5$ and width $maxleft{leftlceil2d^{3/2}left(frac{3mu}{epsilon}right)^{1/{alpha}}rightrceil,2leftlceillog_2frac{3mu d^{alpha/2}}{2epsilon}rightrceil+2right}$ that approximates $fin mathcal{H}_{mu}^{alpha}([0,1]^d)$ within a given tolerance $epsilon >0$ measured in $L^p$ norm $pin[1,infty)$, where $mathcal{H}_{mu}^{alpha}([0,1]^d)$ denotes the Holder continuous function class defined on $[0,1]^d$ with order $alpha in (0,1]$ and constant $mu > 0$. Therefore, the ReLU-sine-$2^x$ networks overcome the curse of dimensionality on $mathcal{H}_{mu}^{alpha}([0,1]^d)$. In addition to its supper expressive power, functions implemented by ReLU-sine-$2^x$ networks are (generalized) differentiable, enabling us to apply SGD to train.
Density ratio estimation serves as an important technique in the unsupervised machine learning toolbox. However, such ratios are difficult to estimate for complex, high-dimensional data, particularly when the densities of interest are sufficiently different. In our work, we propose to leverage an invertible generative model to map the two distributions into a common feature space prior to estimation. This featurization brings the densities closer together in latent space, sidestepping pathological scenarios where the learned density ratios in input space can be arbitrarily inaccurate. At the same time, the invertibility of our feature map guarantees that the ratios computed in feature space are equivalent to those in input space. Empirically, we demonstrate the efficacy of our approach in a variety of downstream tasks that require access to accurate density ratios such as mutual information estimation, targeted sampling in deep generative models, and classification with data augmentation.

suggested questions

comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا