No Arabic abstract
Quantum Machine Learning (QML) is a young but rapidly growing field where quantum information meets machine learning. Here, we will introduce a new QML model generalizing the classical concept of Reinforcement Learning to the quantum domain, i.e. Quantum Reinforcement Learning (QRL). In particular we apply this idea to the maze problem, where an agent has to learn the optimal set of actions in order to escape from a maze with the highest success probability. To perform the strategy optimization, we consider an hybrid protocol where QRL is combined with classical deep neural networks. In particular, we find that the agent learns the optimal strategy in both the classical and quantum regimes, and we also investigate its behaviour in a noisy environment. It turns out that the quantum speedup does robustly allow the agent to exploit useful actions also at very short time scales, with key roles played by the quantum coherence and the external noise. This new framework has the high potential to be applied to perform different tasks (e.g. high transmission/processing rates and quantum error correction) in the new-generation Noisy Intermediate-Scale Quantum (NISQ) devices whose topology engineering is starting to become a new and crucial control knob for practical applications in real-world problems. This work is dedicated to the memory of Peter Wittek.
Finding the ground state of a quantum mechanical system can be formulated as an optimal control problem. In this formulation, the drift of the optimally controlled process is chosen to match the distribution of paths in the Feynman--Kac (FK) representation of the solution of the imaginary time Schrodinger equation. This provides a variational principle that can be used for reinforcement learning of a neural representation of the drift. Our approach is a drop-in replacement for path integral Monte Carlo, learning an optimal importance sampler for the FK trajectories. We demonstrate the applicability of our approach to several problems of one-, two-, and many-particle physics.
We introduce reinforcement learning (RL) formulations of the problem of finding the ground state of a many-body quantum mechanical model defined on a lattice. We show that stoquastic Hamiltonians - those without a sign problem - have a natural decomposition into stochastic dynamics and a potential representing a reward function. The mapping to RL is developed for both continuous and discrete time, based on a generalized Feynman-Kac formula in the former case and a stochastic representation of the Schrodinger equation in the latter. We discuss the application of this mapping to the neural representation of quantum states, spelling out the advantages over approaches based on direct representation of the wavefunction of the system.
Classical models with complex energy landscapes represent a perspective avenue for the near-term application of quantum simulators. Until now, many theoretical works studied the performance of quantum algorithms for models with a unique ground state. However, when the classical problem is in a so-called clustering phase, the ground state manifold is highly degenerate. As an example, we consider a 3-XORSAT model defined on simple hypergraphs. The degeneracy of classical ground state manifold translates into the emergence of an extensive number of $Z_2$ symmetries, which remain intact even in the presence of a quantum transverse magnetic field. We establish a general duality approach that restricts the quantum problem to a given sector of conserved $Z_2$ charges and use it to study how the outcome of the quantum adiabatic algorithm depends on the hypergraph geometry. We show that the tree hypergraph which corresponds to a classically solvable instance of the 3-XORSAT problem features a constant gap, whereas the closed hypergraph encounters a second-order phase transition with a gap vanishing as a power-law in the problem size. The duality developed in this work provides a practical tool for studies of quantum models with classically degenerate energy manifold and reveals potential connections between glasses and gauge theories.
The solution space of many classical optimization problems breaks up into clusters which are extensively distant from one another in the Hamming metric. Here, we show that an analogous quantum clustering phenomenon takes place in the ground state subspace of a certain quantum optimization problem. This involves extending the notion of clustering to Hilbert space, where the classical Hamming distance is not immediately useful. Quantum clusters correspond to macroscopically distinct subspaces of the full quantum ground state space which grow with the system size. We explicitly demonstrate that such clusters arise in the solution space of random quantum satisfiability (3-QSAT) at its satisfiability transition. We estimate both the number of these clusters and their internal entropy. The former are given by the number of hardcore dimer coverings of the core of the interaction graph, while the latter is related to the underconstrained degrees of freedom not touched by the dimers. We additionally provide new numerical evidence suggesting that the 3-QSAT satisfiability transition may coincide with the product satisfiability transition, which would imply the absence of an intermediate entangled satisfiable phase.
The key approaches for machine learning, especially learning in unknown probabilistic environments are new representations and computation mechanisms. In this paper, a novel quantum reinforcement learning (QRL) method is proposed by combining quantum theory and reinforcement learning (RL). Inspired by the state superposition principle and quantum parallelism, a framework of value updating algorithm is introduced. The state (action) in traditional RL is identified as the eigen state (eigen action) in QRL. The state (action) set can be represented with a quantum superposition state and the eigen state (eigen action) can be obtained by randomly observing the simulated quantum state according to the collapse postulate of quantum measurement. The probability of the eigen action is determined by the probability amplitude, which is parallelly updated according to rewards. Some related characteristics of QRL such as convergence, optimality and balancing between exploration and exploitation are also analyzed, which shows that this approach makes a good tradeoff between exploration and exploitation using the probability amplitude and can speed up learning through the quantum parallelism. To evaluate the performance and practicability of QRL, several simulated experiments are given and the results demonstrate the effectiveness and superiority of QRL algorithm for some complex problems. The present work is also an effective exploration on the application of quantum computation to artificial intelligence.