
We propose a Markov chain Monte Carlo (MCMC) algorithm based on third-order Langevin dynamics for sampling from distributions with log-concave and smooth densities. The higher-order dynamics allow for more flexible discretization schemes, and we develop a specific method that combines splitting with more accurate integration. For a broad class of $d$-dimensional distributions arising from generalized linear models, we prove that the resulting third-order algorithm produces samples from a distribution that is at most $\varepsilon > 0$ in Wasserstein distance from the target distribution in $O\left(\frac{d^{1/4}}{\varepsilon^{1/2}}\right)$ steps. This result requires only Lipschitz conditions on the gradient. For general strongly convex potentials with $\alpha$-th order smoothness, we prove that the mixing time scales as $O\left(\frac{d^{1/4}}{\varepsilon^{1/2}} + \frac{d^{1/2}}{\varepsilon^{1/(\alpha - 1)}}\right)$.
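The splitting integrator itself is not reproduced in this abstract. As a purely illustrative stand-in, the sketch below applies a naive Euler-Maruyama pass to a generic three-variable Langevin-type chain; the coupling of `x`, `v`, `z` and all constants are assumptions, not the paper's scheme. It shows only the shape of a higher-order update, where noise enters through the highest-order variable.

```python
import numpy as np

def third_order_langevin(grad_U, x0, h=1e-2, gamma=1.0, n_steps=10_000, seed=0):
    """Naive Euler-Maruyama pass over a generic three-variable Langevin-type
    chain. Illustrative only: the coupling and constants below are
    assumptions, not the paper's splitting scheme."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    v = np.zeros_like(x)
    z = np.zeros_like(x)
    samples = []
    for _ in range(n_steps):
        x = x + h * v                  # position driven by velocity
        v = v + h * z                  # velocity driven by the third variable
        # noise enters only through the highest-order variable
        z = z - h * (gamma * z + grad_U(x)) \
            + np.sqrt(2 * gamma * h) * rng.standard_normal(x.shape)
        samples.append(x.copy())
    return np.asarray(samples)

# Usage: standard Gaussian target, U(x) = ||x||^2 / 2, so grad_U(x) = x.
chain = third_order_langevin(lambda x: x, x0=np.zeros(2))
```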
We study the problem of robustly estimating the posterior distribution in the setting where observed data can be contaminated with potentially adversarial outliers. We propose Rob-ULA, a robust variant of the Unadjusted Langevin Algorithm (ULA), and provide a finite-sample analysis of its sampling distribution. In particular, we show that after $T = \tilde{\mathcal{O}}(d/\varepsilon_{\textsf{acc}})$ iterations, we can sample from $p_T$ such that $\text{dist}(p_T, p^*) \leq \varepsilon_{\textsf{acc}} + \tilde{\mathcal{O}}(\epsilon)$, where $\epsilon$ is the fraction of corruptions. We corroborate our theoretical analysis with experiments on both synthetic and real-world data sets for mean estimation, regression and binary classification.
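Rob-ULA's exact robust gradient estimator is not given in this abstract; the sketch below is a minimal stand-in that swaps ULA's full-data gradient for a coordinate-wise trimmed mean of per-sample gradients. The function names, `trim_frac`, and the aggregator choice are all hypothetical.

```python
import numpy as np

def trimmed_mean(grads, trim_frac=0.1):
    """Coordinate-wise trimmed mean of per-sample gradients: a hypothetical
    stand-in for a robust aggregator; Rob-ULA's actual estimator may differ."""
    k = int(trim_frac * len(grads))
    g_sorted = np.sort(grads, axis=0)          # sort each coordinate separately
    return g_sorted[k:len(grads) - k].mean(axis=0)

def robust_ula(per_sample_grad, data, x0, h=1e-3, n_steps=5_000, seed=0):
    """ULA update with a robust aggregate in place of the full-data gradient."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    n = len(data)
    for _ in range(n_steps):
        grads = np.stack([per_sample_grad(x, d) for d in data])  # (n, dim)
        g = n * trimmed_mean(grads)            # scale the mean back to a sum
        x = x - h * g + np.sqrt(2 * h) * rng.standard_normal(x.shape)
    return x

# Usage sketch (mean estimation): U(x) = sum_i ||x - d_i||^2 / 2,
# so the per-sample gradient is x - d_i:
# x_hat = robust_ula(lambda x, d: x - d, data, x0=np.zeros(2))
```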
We formulate gradient-based Markov chain Monte Carlo (MCMC) sampling as optimization on the space of probability measures, with Kullback-Leibler (KL) divergence as the objective functional. We show that an underdamped form of the Langevin algorithm performs accelerated gradient descent in this metric. To characterize the convergence of the algorithm, we construct a Lyapunov functional and exploit hypocoercivity of the underdamped Langevin algorithm. As an application, we show that accelerated rates can be obtained for a class of nonconvex functions with the Langevin algorithm.
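For reference, the underdamped (kinetic) Langevin dynamics discussed here is the SDE $\mathrm{d}x = v\,\mathrm{d}t$, $\mathrm{d}v = -(\gamma v + \nabla U(x))\,\mathrm{d}t + \sqrt{2\gamma}\,\mathrm{d}B_t$, whose stationary distribution is $\propto e^{-U(x) - \|v\|^2/2}$. Below is a minimal Euler-Maruyama sketch of it; the step size, friction $\gamma$, and initialization are illustrative, and the paper's analysis is not tied to this particular discretization.

```python
import numpy as np

def underdamped_langevin(grad_U, x0, h=1e-2, gamma=2.0, n_steps=10_000, seed=0):
    """Euler-Maruyama sketch of underdamped Langevin dynamics:
        dx = v dt,   dv = -(gamma * v + grad_U(x)) dt + sqrt(2 * gamma) dB_t.
    Step size and friction values are illustrative."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    v = np.zeros_like(x)
    for _ in range(n_steps):
        x = x + h * v
        v = v - h * (gamma * v + grad_U(x)) \
            + np.sqrt(2 * gamma * h) * rng.standard_normal(x.shape)
    return x
```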
Yi-An Ma, Yuansi Chen, Chi Jin (2018)
Optimization algorithms and Monte Carlo sampling algorithms have provided the computational foundations for the rapid growth in applications of statistical machine learning in recent years. There is, however, limited theoretical understanding of the relationship between these two kinds of methodology, and of their relative strengths and weaknesses. Moreover, existing results have been obtained primarily in the setting of convex functions (for optimization) and log-concave functions (for sampling). In this setting, where local properties determine global properties, optimization algorithms are unsurprisingly more efficient computationally than sampling algorithms. We instead examine a class of nonconvex objective functions that arise in mixture modeling and multi-stable systems. In this nonconvex setting, we find that the computational complexity of sampling algorithms scales linearly with the model dimension while that of optimization algorithms scales exponentially.
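To make the contrast concrete, consider a toy double-well potential $U(x) = (x^2 - 1)^2$ (a hypothetical illustration, not an experiment from the paper): gradient descent settles into whichever well it starts nearest, while the noise injected by a Langevin sampler lets it cross the barrier and visit both modes.

```python
import numpy as np

# Toy double-well potential U(x) = (x^2 - 1)^2, so grad U(x) = 4x(x^2 - 1).
grad_U = lambda x: 4 * x * (x**2 - 1)

# Gradient descent: converges to the nearest well (x -> +1 from this start).
x = 0.5
for _ in range(1_000):
    x -= 0.01 * grad_U(x)

# Unadjusted Langevin: the injected noise lets the chain cross the barrier,
# so the samples occupy both modes near -1 and +1.
rng = np.random.default_rng(0)
y, samples = 0.5, []
for _ in range(100_000):
    y = y - 0.01 * grad_U(y) + np.sqrt(2 * 0.01) * rng.standard_normal()
    samples.append(y)
```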
Stochastic gradient MCMC (SG-MCMC) algorithms have proven useful in scaling Bayesian inference to large datasets under an assumption of i.i.d. data. We instead develop an SG-MCMC algorithm to learn the parameters of hidden Markov models (HMMs) for time-dependent data. There are two challenges to applying SG-MCMC in this setting: the latent discrete states, and the need to break dependencies when subsampling minibatches. We consider a marginal likelihood representation of the HMM and propose an algorithm that harnesses the inherent memory decay of the process. We demonstrate the effectiveness of our algorithm on synthetic experiments and ion channel recording data, with runtimes significantly outperforming batch MCMC.
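One way to read "harnessing the inherent memory decay": gradients are computed on short contiguous subchains padded with buffer observations on each side, so that the buffered forward pass approximately absorbs dependence on the rest of the sequence. The sketch below shows only the index bookkeeping; the buffer length `B` and the uniform-start choice are hypothetical stand-ins for the paper's construction.

```python
import numpy as np

def sample_buffered_subchain(T, L, B, rng):
    """Pick a length-L contiguous subchain from a length-T sequence, padded
    with B buffer observations on each side. The buffers absorb dependence
    on the rest of the chain; in practice their length would be set by the
    HMM's memory decay (B here is a hypothetical stand-in)."""
    start = rng.integers(B, T - L - B)
    core = np.arange(start, start + L)            # indices whose gradient we use
    padded = np.arange(start - B, start + L + B)  # indices fed to the forward pass
    return core, padded

# Usage: a length-100 subchain with 20 buffer observations per side.
core, padded = sample_buffered_subchain(T=10_000, L=100, B=20,
                                        rng=np.random.default_rng(0))
```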
Many recent Markov chain Monte Carlo (MCMC) samplers leverage continuous dynamics to define a transition kernel that efficiently explores a target distribution. In tandem, a focus has been on devising scalable variants that subsample the data and use stochastic gradients in place of full-data gradients in the dynamic simulations. However, such stochastic gradient MCMC samplers have lagged behind their full-data counterparts in terms of the complexity of dynamics considered, since proving convergence in the presence of the stochastic gradient noise is non-trivial. Even with simple dynamics, significant physical intuition is often required to modify the dynamical system to account for the stochastic gradient noise. In this paper, we provide a general recipe for constructing MCMC samplers--including stochastic gradient versions--based on continuous Markov processes specified via two matrices.
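The last sentence is cut off in this listing; the recipe it refers to parameterizes a sampler by a positive semidefinite diffusion matrix $D(z)$ and a skew-symmetric curl matrix $Q(z)$, giving (up to the paper's exact regularity conditions) the SDE

$$\mathrm{d}z = \Bigl[-\bigl(D(z) + Q(z)\bigr)\nabla H(z) + \Gamma(z)\Bigr]\,\mathrm{d}t + \sqrt{2\,D(z)}\,\mathrm{d}W_t, \qquad \Gamma_i(z) = \sum_j \frac{\partial}{\partial z_j}\bigl(D_{ij}(z) + Q_{ij}(z)\bigr),$$

where $H(z)$ extends the negative log posterior with any auxiliary variables. The correction term $\Gamma(z)$ keeps $\propto e^{-H(z)}$ stationary, and stochastic gradient variants replace $\nabla H$ with a minibatch estimate.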
Yi-An Ma, Hong Qian (2015)
We revisit the Ornstein-Uhlenbeck (OU) process as the fundamental mathematical description of linear irreversible phenomena, with fluctuations, near an equilibrium. By identifying the underlying circulating dynamics in a stationary process as the natural generalization of classical conservative mechanics, a bridge between a family of OU processes with equilibrium fluctuations and thermodynamics is established through the celebrated Helmholtz theorem. The Helmholtz theorem provides an emergent macroscopic equation of state of the entire system, which exhibits a universal ideal thermodynamic behavior. Fluctuating macroscopic quantities are studied from the stochastic thermodynamic point of view and a non-equilibrium work relation is obtained in the macroscopic picture, which may facilitate experimental study and application of the equalities due to Jarzynski, Crooks, and Hatano and Sasa.
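Concretely, assuming the standard symmetric/antisymmetric splitting rather than anything specific to this paper, a stationary OU process with Gaussian stationary covariance $\Sigma$ can be written as

$$\mathrm{d}x = -(D + Q)\,\Sigma^{-1} x\,\mathrm{d}t + \sqrt{2D}\,\mathrm{d}W_t,$$

with $D = D^\top \succeq 0$ the dissipative part and $Q = -Q^\top$ the circulating part. Setting $Q = 0$ recovers a reversible (detailed-balance) process, while $Q \neq 0$ produces the stationary circulation that the abstract identifies with conservative mechanics.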