We introduce reinforcement learning (RL) formulations of the problem of finding the ground state of a many-body quantum mechanical model defined on a lattice. We show that stoquastic Hamiltonians, those without a sign problem, have a natural decomposition into stochastic dynamics and a potential representing a reward function. The mapping to RL is developed for both continuous and discrete time, based on a generalized Feynman-Kac formula in the former case and a stochastic representation of the Schrödinger equation in the latter. We discuss the application of this mapping to the neural representation of quantum states, spelling out its advantages over approaches based on direct representation of the wavefunction of the system.
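
As a rough illustration of the decomposition of a stoquastic Hamiltonian into stochastic dynamics plus a potential, the sketch below splits a small dense Hamiltonian H with non-positive off-diagonal elements as H = -G + diag(V), where G is the generator of a continuous-time Markov jump process and V is a diagonal potential. The model (a transverse-field Ising chain), the function names `tfim_hamiltonian` and `decompose`, the column-sum convention for G, and the reading of -V as a reward are all assumptions made for this example and are not taken from the text above.

```python
import numpy as np


def tfim_hamiltonian(n_spins, J=1.0, h=1.0):
    """Dense transverse-field Ising Hamiltonian on an open chain.

    Illustrative model choice: H = -J sum_i Z_i Z_{i+1} - h sum_i X_i,
    written in the Z basis, where all off-diagonal elements are <= 0.
    """
    dim = 2 ** n_spins
    H = np.zeros((dim, dim))
    for x in range(dim):
        # Spin values +1/-1 read from the bits of the basis-state index x.
        s = [1 - 2 * ((x >> i) & 1) for i in range(n_spins)]
        H[x, x] = -J * sum(s[i] * s[i + 1] for i in range(n_spins - 1))
        for i in range(n_spins):
            y = x ^ (1 << i)  # basis state reached by flipping spin i
            H[y, x] = -h
    return H


def decompose(H):
    """Split a stoquastic H into (generator G, potential V), H = -G + diag(V).

    Convention assumed here: G[y, x] is the jump rate from x to y for
    y != x, and each column of G sums to zero.
    """
    off_diag = H - np.diag(np.diag(H))
    if np.any(off_diag > 0):
        raise ValueError("H has positive off-diagonal elements: not stoquastic")
    rates = -off_diag                                # non-negative jump rates
    generator = rates - np.diag(rates.sum(axis=0))   # columns sum to zero
    potential = H.sum(axis=0)                        # V[x] = sum_y H[y, x]
    return generator, potential


H = tfim_hamiltonian(3)
G, V = decompose(H)
reward = -V  # sign convention for the reward is an assumption of this sketch
print(np.allclose(H, -G + np.diag(V)))
```

In this convention the jump rates depend only on the off-diagonal part of H, while the diagonal part and the total exit rate of each state are absorbed into V.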