A Tensor Network Approach to Finite Markov Decision Processes

Added by Edward Gillman

Publication date 2020

Fields Physics, Informatics, Engineering

Language English

Authors Edward Gillman - Dominic C. Rose - Juan P. Garrahan

Statistical Mechanics Machine Learning Quantum Physics

Tensor network (TN) techniques - often used in the context of quantum many-body physics - have shown promise as a tool for tackling machine learning (ML) problems. The application of TNs to ML, however, has mostly focused on supervised and unsupervised learning. Yet, with their direct connection to hidden Markov chains, TNs are also naturally suited to Markov decision processes (MDPs) which provide the foundation for reinforcement learning (RL). Here we introduce a general TN formulation of finite, episodic and discrete MDPs. We show how this formulation allows us to exploit algorithms developed for TNs for policy optimisation, the key aim of RL. As an application we consider the issue - formulated as an RL problem - of finding a stochastic evolution that satisfies specific dynamical conditions, using the simple example of random walk excursions as an illustration.
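As a rough illustration of the random walk excursion example mentioned in the abstract, the sketch below computes the probability that a time-dependent policy generates a valid excursion (a walk that starts at 0, never goes negative, and returns to 0 after T steps). It does this by contracting a chain of transfer matrices, which is the simplest kind of one-dimensional tensor network contraction. This is not the authors' formulation: the function names `step_matrix` and `excursion_probability`, the policy parameterisation `policy[t, x]`, and the choice to drop probability mass that leaves the allowed positions are all assumptions made for illustration.

```python
import numpy as np


def step_matrix(p_up):
    """One-step transfer matrix (hypothetical helper).

    p_up[x] is the assumed probability of stepping up from position x.
    Mass that would move below 0 or above the largest tracked position
    is dropped, so it counts as a failed excursion.
    """
    n = len(p_up)
    M = np.zeros((n, n))
    for x in range(n):
        if x + 1 < n:
            M[x + 1, x] = p_up[x]          # step up: x -> x + 1
        if x - 1 >= 0:
            M[x - 1, x] = 1.0 - p_up[x]    # step down: x -> x - 1
    return M


def excursion_probability(policy):
    """Probability that the policy produces a valid excursion.

    policy has shape (T, n), with policy[t, x] = P(up | position x, time t).
    The result is the chain contraction e_0^T M_{T-1} ... M_0 e_0.
    """
    T, n = policy.shape
    v = np.zeros(n)
    v[0] = 1.0                              # walk starts at position 0
    for t in range(T):
        v = step_matrix(policy[t]) @ v      # contract one time step
    return v[0]                             # must end back at position 0


# Unbiased walk, T = 4 steps, positions 0..2.
uniform = np.full((4, 3), 0.5)
p_uniform = excursion_probability(uniform)

# A deterministic policy: two steps up, then two steps down.
up_then_down = np.array([[1.0] * 3, [1.0] * 3, [0.0] * 3, [0.0] * 3])
p_deterministic = excursion_probability(up_then_down)
```

In an RL reading of this toy setup, the excursion probability plays the role of an expected return for a reward of 1 on success and 0 otherwise, and policy optimisation would mean adjusting `policy` to increase it. How the paper itself defines rewards and optimises the policy is not specified in the abstract.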

Prediction and Generation of Binary Markov Processes: Can a Finite-State Fox Catch a Markov Mouse?

J. Ruebeck, R. G. James, J. R. Mahoney (2017)

Understanding the generative mechanism of a natural system is a vital component of the scientific method. Here, we investigate one of the fundamental steps toward this goal by presenting the minimal generator of an arbitrary binary Markov process. This is a class of processes whose predictive model is well known. Surprisingly, the generative model requires three distinct topologies for different regions of parameter space. We show that a previously proposed generator for a particular set of binary Markov processes is, in fact, not minimal. Our results shed the first quantitative light on the relative (minimal) costs of prediction and generation. We find, for instance, that the difference between prediction and generation is maximized when the process is approximately independent and identically distributed.

Statistical Mechanics Computational Complexity Information Theory

A Gauss-Newton Method for Markov Decision Processes

Thomas Furmston, Guy Lever (2015)

Approximate Newton methods are a standard optimization tool which aim to maintain the benefits of Newton's method, such as a fast rate of convergence, whilst alleviating its drawbacks, such as computationally expensive calculation or estimation of the inverse Hessian. In this work we investigate approximate Newton methods for policy optimization in Markov Decision Processes (MDPs). We first analyse the structure of the Hessian of the objective function for MDPs. We show that, like the gradient, the Hessian exhibits useful structure in the context of MDPs and we use this analysis to motivate two Gauss-Newton Methods for MDPs. Like the Gauss-Newton method for non-linear least squares, these methods involve approximating the Hessian by ignoring certain terms in the Hessian which are difficult to estimate. The approximate Hessians possess desirable properties, such as negative definiteness, and we demonstrate several important performance guarantees including guaranteed ascent directions, invariance to affine transformation of the parameter space, and convergence guarantees. We finally provide a unifying perspective of key policy search algorithms, demonstrating that our second Gauss-Newton algorithm is closely related to both the EM-algorithm and natural gradient ascent applied to MDPs, but performs significantly better in practice on a range of challenging domains.

Artificial Intelligence Machine Learning Machine Learning

Markov chains for tensor network states

S. Iblisdir (2013)

Markov chains for probability distributions related to matrix product states and 1D Hamiltonians are introduced. With appropriate inverse temperature schedules, these chains can be combined into a random approximation scheme for ground states of such Hamiltonians. Numerical experiments suggest that a linear, i.e. fast, schedule is possible in non-trivial cases. A natural extension of these chains to 2D settings is next presented and tested. The obtained results compare well with Euclidean evolution. The proposed Markov chains are easy to implement and are inherently sign problem free (even for fermionic degrees of freedom).

Strongly Correlated Electrons High Energy Physics - Lattice Quantum Physics

Robust Batch Policy Learning in Markov Decision Processes

Zhengling Qi, Peng Liao (2020)

We study the sequential decision making problem in Markov decision process (MDP) where each policy is evaluated by a set containing average rewards over different horizon lengths and with different initial distributions. Given a pre-collected dataset of multiple trajectories generated by some behavior policy, our goal is to learn a robust policy in a pre-specified policy class that can maximize the smallest value of this set. Leveraging the semi-parametric efficiency theory from statistics, we develop a policy learning method for estimating the defined robust optimal policy that can efficiently break the curse of horizon under mild technical conditions. A rate-optimal regret bound up to a logarithmic factor is established in terms of the number of trajectories and the number of decision points.

Statistics Theory Machine Learning Machine Learning

A tensor network approach to 2D bosonization

Sujeet K. Shukla, Tyler D. Ellison, Lukasz Fidkowski (2019)

We present a 2D bosonization duality using the language of tensor networks. Specifically, we construct a tensor network operator (TNO) that implements an exact 2D bosonization duality. The primary benefit of the TNO is that it allows for bosonization at the level of quantum states. Thus, we use the TNO to provide an explicit algorithm for bosonizing fermionic projected entangled pair states (fPEPS). A key step in the algorithm is to account for a choice of spin-structure, encoded in a set of bonds of the bosonized fPEPS. This enables our tensor network approach to bosonization to be applied to systems on arbitrary triangulations of orientable 2D manifolds.

Strongly Correlated Electrons
