
Playing Atari with Hybrid Quantum-Classical Reinforcement Learning

Published by Owen Lockwood
Publication date: 2021
Research field: Physics
Research language: English





Despite the successes of recent works in quantum reinforcement learning, there are still severe limitations on its applications due to the challenge of encoding large observation spaces into quantum systems. To address this challenge, we propose using a neural network as a data encoder, with the Atari games as our testbed. Specifically, the neural network converts the pixel input from the games to quantum data for a Quantum Variational Circuit (QVC); this hybrid model is then used as a function approximator in the Double Deep Q Networks algorithm. We explore a number of variations of this algorithm and find that our proposed hybrid models do not achieve meaningful results on two Atari games - Breakout and Pong. We suspect this is due to the significantly reduced sizes of the hybrid quantum-classical systems.
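As a rough illustration of where the hybrid model sits in the Double Deep Q Networks algorithm, the sketch below computes a Double DQN bootstrap target for a single transition. It is not taken from the paper: the function name `double_dqn_target` and the example Q-values are hypothetical, and the Q-value lists simply stand in for whatever the function approximator (here, the neural-network encoder followed by the QVC) would output for the next state.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    done: bool,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
    gamma: float = 0.99,
) -> float:
    """Double DQN target for one transition (illustrative sketch).

    q_online_next: Q-values of the next state from the online network.
    q_target_next: Q-values of the next state from the target network.
    Both are assumed to be indexed by action.
    """
    if len(q_online_next) != len(q_target_next):
        raise ValueError("both networks must score the same action set")
    if done:
        # No bootstrapping past a terminal state.
        return reward
    # The online network selects the action ...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ... and the target network evaluates it, which decouples
    # action selection from action evaluation.
    return reward + gamma * q_target_next[best_action]


# Hypothetical Q-values for a three-action game.
online_q = [0.2, 1.5, 0.7]
target_q = [0.3, 1.0, 2.0]
print(double_dqn_target(1.0, False, online_q, target_q, gamma=0.9))
```

Note that the target network's own maximum (action 2) is ignored here; the value of the online network's preferred action (action 1) is used instead, which is what distinguishes Double DQN from the standard DQN target.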


Read also

We report on an experimental implementation of a machine-learned quantum gate driven by a classical control. The gate learns optimal phase-covariant cloning in a reinforcement learning scenario with the fidelity of the clones as the reward. In our experiment, the gate learns to achieve nearly the optimal cloning fidelity allowed for this particular class of states. This makes it a proof of present-day feasibility and practical applicability of the hybrid machine learning approach combining quantum information processing with classical control. Moreover, our experiment can be directly generalized to larger interferometers where the computational cost of a classical computer is much lower than the cost of boson sampling.
Hamiltonian learning is crucial to the certification of quantum devices and quantum simulators. In this paper, we propose a hybrid quantum-classical Hamiltonian learning algorithm to find the coefficients of the Pauli operator components of the Hamiltonian. Its main subroutine is the practical log-partition function estimation algorithm, which is based on the minimization of the free energy of the system. Concretely, we devise a stochastic variational quantum eigensolver (SVQE) to diagonalize the Hamiltonians and then exploit the obtained eigenvalues to compute the free energy's global minimum using convex optimization. Our approach not only avoids the challenge of estimating von Neumann entropy in free energy minimization, but also reduces the quantum resources via importance sampling in Hamiltonian diagonalization, facilitating the implementation of our method on near-term quantum devices. Finally, we demonstrate our approach's validity by conducting numerical experiments with Hamiltonians of interest in quantum many-body physics.
Learning a hidden parity function from noisy data, known as learning parity with noise (LPN), is an example of intelligent behavior that aims to generalize a concept based on noisy examples. The solution to LPN immediately leads to decoding a random binary linear code in the presence of classification noise. This problem is thought to be intractable classically, but can be solved efficiently if a quantum oracle can be queried. However, in practice, a learner is more likely to receive data from classical oracles. In this work, we show that a naive application of the quantum LPN algorithm to classical data encoded in an equal superposition state requires an exponential sample complexity. We then propose a quantum-classical reinforcement learning algorithm to solve the LPN problem for data generated by a classical oracle and demonstrate a significant reduction in the sample complexity. Simulations with a hidden bit string of length up to 12 show that the quantum-classical reinforcement learning performs better than known classical algorithms when the sample complexity, run time, and robustness to classical noise are collectively considered. Our algorithm is robust to any noise in the quantum circuit that effectively appears as Pauli errors on the final state.
Model-free reinforcement learning (RL) can be used to learn effective policies for complex tasks, such as Atari games, even from image observations. However, this typically requires very large amounts of interaction -- substantially more, in fact, than a human would need to learn the same games. How can people learn so quickly? Part of the answer may be that people can learn how the game works and predict which actions will lead to desirable outcomes. In this paper, we explore how video prediction models can similarly enable agents to solve Atari games with fewer interactions than model-free methods. We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting. Our experiments evaluate SimPLe on a range of Atari games in a low-data regime of 100k interactions between the agent and the environment, which corresponds to two hours of real-time play. In most games SimPLe outperforms state-of-the-art model-free algorithms, in some games by over an order of magnitude.
Paolo Aniello, 2014
A function of positive type can be defined as a positive functional on a convolution algebra of a locally compact group. In the case where the group is abelian, by Bochner's theorem a function of positive type is, up to normalization, the Fourier transform of a probability measure. Therefore, considering the group of translations on phase space, a suitably normalized phase-space function of positive type can be regarded as a realization of a classical state. Thus, it may be called a function of classical positive type. Replacing the ordinary convolution on phase space with the twisted convolution, one obtains a noncommutative algebra of functions whose positive functionals we may call functions of quantum positive type. In fact, by a quantum version of Bochner's theorem, a continuous function of quantum positive type is, up to normalization, the (symplectic) Fourier transform of a Wigner quasi-probability distribution; hence, it can be regarded as a phase-space realization of a quantum state. Playing with functions of positive type, classical and quantum, one is led in a natural way to consider a class of semigroups of operators, the classical-quantum semigroups. The physical meaning of these mathematical objects is unveiled via quantization, so obtaining a class of quantum dynamical semigroups that, borrowing terminology from quantum information science, may be called classical-noise semigroups.