ﻻ يوجد ملخص باللغة العربية
Tensor network (TN) techniques - often used in the context of quantum many-body physics - have shown promise as a tool for tackling machine learning (ML) problems. The application of TNs to ML, however, has mostly focused on supervised and unsupervised learning. Yet, with their direct connection to hidden Markov chains, TNs are also naturally suited to Markov decision processes (MDPs) which provide the foundation for reinforcement learning (RL). Here we introduce a general TN formulation of finite, episodic and discrete MDPs. We show how this formulation allows us to exploit algorithms developed for TNs for policy optimisation, the key aim of RL. As an application we consider the issue - formulated as an RL problem - of finding a stochastic evolution that satisfies specific dynamical conditions, using the simple example of random walk excursions as an illustration.
Understanding the generative mechanism of a natural system is a vital component of the scientific method. Here, we investigate one of the fundamental steps toward this goal by presenting the minimal generator of an arbitrary binary Markov process. Th
Approximate Newton methods are a standard optimization tool which aim to maintain the benefits of Newtons method, such as a fast rate of convergence, whilst alleviating its drawbacks, such as computationally expensive calculation or estimation of the
Markov chains for probability distributions related to matrix product states and 1D Hamiltonians are introduced. With appropriate inverse temperature schedules, these chains can be combined into a random approximation scheme for ground states of such
We study the sequential decision making problem in Markov decision process (MDP) where each policy is evaluated by a set containing average rewards over different horizon lengths and with different initial distributions. Given a pre-collected dataset
We present a 2D bosonization duality using the language of tensor networks. Specifically, we construct a tensor network operator (TNO) that implements an exact 2D bosonization duality. The primary benefit of the TNO is that it allows for bosonization