
A Study of Policy Gradient on a Class of Exactly Solvable Models

 Added by Gavin McCracken
 Publication date 2020
Research language: English





Policy gradient methods are extensively used in reinforcement learning as a way to optimize expected return. In this paper, we explore the evolution of the policy parameters, for a special class of exactly solvable POMDPs, as a continuous-state Markov chain whose transition probabilities are determined by the gradient of the distribution of the policy's value. Our approach relies heavily on random walk theory, specifically on affine Weyl groups. We construct a class of novel partially observable environments with controllable exploration difficulty, in which the value distribution, and hence the policy parameter evolution, can be derived analytically. Using these environments, we analyze the probabilistic convergence of policy gradient to different local maxima of the value function. To our knowledge, this is the first approach developed to analytically compute the landscape of policy gradient in POMDPs for a class of such environments, leading to interesting insights into the difficulty of this problem.
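As a toy illustration of the premise above, that policy-gradient parameters themselves evolve as a stochastic process, the sketch below runs a REINFORCE-style update on a two-armed bandit. The bandit, its rewards, and the learning rate are illustrative assumptions, not the exactly solvable POMDPs constructed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-armed bandit: arm 0 pays 1.0, arm 1 pays 0.0 (deterministic rewards).
rewards = np.array([1.0, 0.0])
theta = np.zeros(2)   # softmax policy parameters
lr = 0.1              # learning rate

def policy(theta):
    """Softmax distribution over the two arms."""
    z = np.exp(theta - theta.max())
    return z / z.sum()

for _ in range(2000):
    p = policy(theta)
    a = rng.choice(2, p=p)
    r = rewards[a]
    # REINFORCE update: grad of log pi(a) w.r.t. theta is one_hot(a) - p.
    grad_log_pi = -p
    grad_log_pi[a] += 1.0
    theta += lr * r * grad_log_pi

p_final = policy(theta)   # probability mass shifts toward the rewarding arm
```

Because each update depends only on the current parameters and the sampled action, the sequence of parameter vectors is itself a Markov chain on parameter space, which is the object the paper analyzes.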



Related research

We study diffusion of hardcore particles on a one-dimensional periodic lattice subject to the constraint that the separation between any two consecutive particles does not increase beyond a fixed value $(n+1)$; an initial separation larger than $(n+1)$ can, however, decrease. These models undergo an absorbing-state phase transition when the conserved particle density of the system falls below a critical threshold $\rho_c = 1/(n+1)$. We find that $\phi_k$, the density of $0$-clusters ($0$ representing vacancies) of size $k$ for $0 \le k < n$, vanishes at the transition point along with the activity density $\rho_a$. The steady state of these models can be written in matrix product form, allowing the static exponents $\beta_k = n-k$, $\nu = 1 = \eta$ corresponding to each $\phi_k$ to be obtained analytically. We also show from numerical simulations that, starting from a natural initial condition, the $\phi_k(t)$ decay as $t^{-\alpha_k}$ with $\alpha_k = (n-k)/2$, even though the other dynamic exponents $\nu_t = 2 = z$ are independent of $k$; this ensures the validity of the scaling laws $\beta = \alpha \nu_t$ and $\nu_t = z \nu$.
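A minimal Monte Carlo sketch of such constrained diffusion, under an assumed random-sequential update rule (the lattice size, density, and choice of $n$ are arbitrary illustrations, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

def separations(pos, L):
    """Ring separations s_i = distance from particle i to particle i+1."""
    return np.diff(pos, append=pos[0] + L)

def run(L, N, n, sweeps):
    """Hardcore particles on a ring of L sites; a hop is rejected if the
    separation it enlarges would exceed n+1. Returns the final activity
    density (fraction of particles with at least one allowed move)."""
    # Random hardcore initial condition; unwrapped coordinates stay ordered
    # because the hardcore rule forbids overtaking.
    pos = np.sort(rng.choice(L, size=N, replace=False)).astype(int)
    for _ in range(sweeps * N):
        i = int(rng.integers(N))
        s = separations(pos, L)
        ahead, behind = s[i], s[i - 1]   # s[-1] wraps to the last gap
        if rng.random() < 0.5:           # attempt a hop to the right
            if ahead >= 2 and behind + 1 <= n + 1:
                pos[i] += 1
        else:                            # attempt a hop to the left
            if behind >= 2 and ahead + 1 <= n + 1:
                pos[i] -= 1
    s = separations(pos, L)
    active = sum(
        (s[i] >= 2 and s[i - 1] <= n) or (s[i - 1] >= 2 and s[i] <= n)
        for i in range(N)
    )
    return active / N

# Density 0.6 with n = 1, i.e. above rho_c = 1/2: no absorbing
# configuration exists, so the activity density stays positive.
rho_a_super = run(L=100, N=60, n=1, sweeps=200)
```

Rerunning with a density below $1/(n+1)$ lets the same dynamics freeze into an absorbing configuration in which every separation is at least $n+1$.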
In this paper a review is given of a class of sub-models of both approaches, characterized by the fact that they can be solved exactly. In the process, a number of generic results are highlighted, related both to the nature of pair-correlated systems and to collective modes of motion in the atomic nucleus.
Urna Basu, P. K. Mohanty (2009)
We introduce and solve a model of hardcore particles on a one-dimensional periodic lattice which undergoes an active-absorbing-state phase transition at finite density. In this model an occupied site is defined to be active if its left neighbour is occupied and its right neighbour is vacant. Particles at such active sites hop stochastically to their right. We show that both the density of active sites and the survival probability vanish as the particle density is decreased below one half. The critical exponents and spatial correlations of the model are calculated exactly using the matrix product ansatz. Exact analytical study of several variations of the model reveals that these non-equilibrium phase transitions belong to a new universality class, distinct from directed percolation, the generic class of active-absorbing-state phase transitions.
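The hopping rule as stated can be simulated directly. The sketch below is an illustrative implementation under assumed random-sequential updates, with arbitrary lattice size and density (not parameters from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)

def active_sites(occ):
    """Sites that are occupied, with left neighbour occupied and right vacant."""
    left = np.roll(occ, 1)    # left[i]  = occ[i-1]
    right = np.roll(occ, -1)  # right[i] = occ[i+1]
    return np.flatnonzero((occ == 1) & (left == 1) & (right == 0))

def run(L, rho, steps):
    """Random-sequential dynamics: pick a random active site and hop its
    particle one site to the right. Returns the final configuration and the
    time series of the active-site density."""
    occ = np.zeros(L, dtype=int)
    occ[rng.choice(L, size=int(rho * L), replace=False)] = 1
    history = []
    for _ in range(steps):
        act = active_sites(occ)
        history.append(len(act) / L)
        if len(act) == 0:     # absorbing configuration reached
            break
        i = int(rng.choice(act))
        occ[i], occ[(i + 1) % L] = 0, 1
    return occ, history

# Above the critical density 1/2 no absorbing configuration exists,
# so the activity density remains positive.
occ, hist = run(L=200, rho=0.75, steps=2000)
```

Below half filling the same loop terminates in an absorbing configuration (all particles isolated), consistent with the vanishing survival probability described above.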
Some results for two distinct but complementary exactly solvable algebraic models for pairing in atomic nuclei are presented: 1) binding energy predictions for isotopic chains of nuclei based on an extended pairing model that includes multi-pair excitations; and 2) fine structure effects among excited $0^+$ states in $N \approx Z$ nuclei that track with the proton-neutron ($pn$) and like-particle isovector pairing interactions as realized within an algebraic $sp(4)$ shell model. The results show that these models can be used to reproduce significant ranges of known experimental data, and in so doing, confirm their power to predict pairing-dominated phenomena in domains where data is unavailable.
Reinforcement learning (RL) algorithms still suffer from high sample complexity despite outstanding recent successes. The need for intensive interaction with the environment is especially pronounced in many widely used policy gradient algorithms, which perform updates using on-policy samples. The price of this inefficiency becomes evident in real-world scenarios such as interaction-driven robot learning, where the success of RL has been rather limited. We address this issue by building on the general sample efficiency of off-policy algorithms. Using nonparametric regression and density estimation methods, we construct a nonparametric Bellman equation in a principled manner, which allows us to obtain closed-form estimates of the value function and to analytically express the full policy gradient. We provide a theoretical analysis showing that our estimate is consistent under mild smoothness assumptions, and we show empirically that our approach has better sample efficiency than state-of-the-art policy gradient methods.
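A heavily simplified sketch of the idea of a nonparametric Bellman equation with a closed-form value estimate, using Nadaraya-Watson kernel smoothing on synthetic one-dimensional data. The data-generating process, Gaussian kernel, and bandwidth below are assumptions for illustration, not the authors' estimator:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic 1-D transition data from some behaviour policy: states drift
# toward the origin, and reward is highest near the origin.
n, gamma, h = 200, 0.9, 0.2
s = rng.uniform(-2, 2, size=n)                  # visited states
s_next = 0.8 * s + 0.1 * rng.normal(size=n)     # observed next states
r = np.exp(-s ** 2)                             # observed rewards

def kernel_weights(x, centers, h):
    """Nadaraya-Watson weights with a Gaussian kernel; rows sum to one."""
    K = np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * h ** 2))
    return K / K.sum(axis=1, keepdims=True)

# Nonparametric Bellman equation on the sample: V = r + gamma * W V,
# where W smooths each next state back onto the sampled states. Since W is
# row-stochastic and gamma < 1, (I - gamma W) is invertible, giving a
# closed-form value estimate.
W = kernel_weights(s_next, s, h)
V = np.linalg.solve(np.eye(n) - gamma * W, r)
```

With rewards in $(0, 1]$ the resulting estimates are bounded by $1/(1-\gamma)$; differentiating such a closed-form estimate with respect to policy parameters is what yields an analytic policy gradient in this family of methods.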
