Measurement and estimation of parameters are essential for science and engineering, where one of the main quests is to find systematic schemes that can achieve high precision. While conventional schemes for quantum parameter estimation focus on the optimization of the probe states and measurements, it has recently been realized that control during the evolution can significantly improve the precision. The identification of optimal controls, however, is often computationally demanding, because the optimal controls typically depend on the value of the parameter and therefore need to be re-calculated each time the estimate is updated. Here we show that reinforcement learning provides an efficient way to identify controls that improve the precision. We also demonstrate that reinforcement learning is highly generalizable: a neural network trained under one particular value of the parameter can work for different values within a broad range. These desirable features make reinforcement learning an efficient alternative to conventional optimal quantum control methods.
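As a point of orientation, precision in quantum parameter estimation is commonly quantified by the quantum Fisher information (QFI). The following is a minimal sketch, not the method of the abstract: it assumes a single qubit prepared in $|+\rangle$ and evolved without any control under the toy Hamiltonian $H = \omega\sigma_z/2$ (with $\hbar = 1$), and evaluates the pure-state QFI by a finite difference. The function names `evolved_state` and `pure_state_qfi` are hypothetical.

```python
import numpy as np

def evolved_state(omega, T):
    """|+> evolved for time T under the assumed toy Hamiltonian H = omega * sigma_z / 2."""
    plus = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2.0)
    # sigma_z is diagonal, so the propagator is just a pair of phases.
    phases = np.exp(-1j * omega * T / 2.0 * np.array([1.0, -1.0]))
    return phases * plus

def pure_state_qfi(omega, T, eps=1e-6):
    """Pure-state QFI: 4 * (<dpsi|dpsi> - |<psi|dpsi>|^2), derivative by central difference."""
    psi = evolved_state(omega, T)
    dpsi = (evolved_state(omega + eps, T) - evolved_state(omega - eps, T)) / (2.0 * eps)
    norm_term = np.vdot(dpsi, dpsi).real
    overlap_term = abs(np.vdot(psi, dpsi)) ** 2
    return 4.0 * (norm_term - overlap_term)

if __name__ == "__main__":
    for T in (1.0, 2.0, 4.0):
        print(T, pure_state_qfi(0.3, T))
```

In this uncontrolled toy case the QFI equals $T^2$ independently of $\omega$, so doubling the evolution time quadruples it. For Hamiltonians where the generator of the parameter does not commute with the rest of the dynamics, the attainable QFI depends on the controls applied, which is the setting the abstract addresses.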
Deep reinforcement learning has been recognized as an efficient technique to design optimal strategies for different complex systems without prior knowledge of the control landscape. To achieve fast and precise control of quantum systems, we propose a novel deep reinforcement learning approach that constructs a curriculum consisting of a set of intermediate tasks defined by a fidelity threshold. Tasks in a curriculum can be statically determined using empirical knowledge or adaptively generated during the learning process. By transferring knowledge between two successive tasks and sequencing tasks according to their difficulties, the proposed curriculum-based deep reinforcement learning (CDRL) method enables the agent to focus on easy tasks in the early stage, then move on to difficult tasks, and eventually approach the final task. Numerical simulations on closed quantum systems and open quantum systems demonstrate that the proposed method exhibits improved control performance for quantum systems and also provides an efficient way to identify optimal strategies with fewer control pulses.
Successful implementation of a fault-tolerant quantum computation on a system of qubits places severe demands on the hardware used to control the many-qubit state. It is known that an accuracy threshold $P_{a}$ exists for any quantum gate that is to be used in such a computation. Specifically, the error probability $P_{e}$ for such a gate must fall below the accuracy threshold: $P_{e} < P_{a}$. Estimates of $P_{a}$ vary widely, though $P_{a} \sim 10^{-4}$ has emerged as a challenging target for hardware designers. In this paper we present a theoretical framework based on neighboring optimal control that takes as input a good quantum gate and returns a new gate with better performance. We illustrate this approach by applying it to all gates in a universal set of quantum gates produced using non-adiabatic rapid passage that has appeared in the literature. Performance improvements are substantial, both for ideal and non-ideal controls. Under suitable conditions detailed below, all gate error probabilities fall well below the target threshold of $10^{-4}$.
The ability to prepare a physical system in a desired quantum state is central to many areas of physics such as nuclear magnetic resonance, cold atoms, and quantum computing. Yet, preparing states quickly and with high fidelity remains a formidable challenge. In this work we implement cutting-edge Reinforcement Learning (RL) techniques and show that their performance is comparable to optimal control methods in the task of finding short, high-fidelity driving protocols from an initial to a target state in non-integrable many-body quantum systems of interacting qubits. RL methods learn about the underlying physical system solely through a single scalar reward (the fidelity of the resulting state) calculated from numerical simulations of the physical system. We further show that quantum state manipulation, viewed as an optimization problem, exhibits a spin-glass-like phase transition in the space of protocols as a function of the protocol duration. Our RL-aided approach helps identify variational protocols with nearly optimal fidelity, even in the glassy phase, where optimal state manipulation is exponentially hard. This study highlights the potential usefulness of RL for applications in out-of-equilibrium quantum physics.
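The scalar reward described above, the fidelity of the final state with the target, can be sketched for a toy case. The example below is an illustration under stated assumptions rather than the model of the abstract: it uses a single qubit with the assumed Hamiltonian $H(h) = -S^z - h\,S^x$, a piecewise-constant control field `h`, and hypothetical function names `step_unitary` and `protocol_fidelity`.

```python
import numpy as np

# Spin-1/2 operators (hbar = 1).
SX = np.array([[0, 1], [1, 0]], dtype=complex) / 2
SZ = np.array([[1, 0], [0, -1]], dtype=complex) / 2

def step_unitary(h, dt):
    """Propagator exp(-i H dt) for the assumed toy Hamiltonian H = -SZ - h * SX."""
    H = -SZ - h * SX
    evals, evecs = np.linalg.eigh(H)
    return evecs @ np.diag(np.exp(-1j * evals * dt)) @ evecs.conj().T

def protocol_fidelity(protocol, psi_init, psi_target, dt):
    """Apply one propagator per control value, then return |<target|psi_final>|^2."""
    psi = np.asarray(psi_init, dtype=complex)
    for h in protocol:
        psi = step_unitary(h, dt) @ psi
    return abs(np.vdot(np.asarray(psi_target, dtype=complex), psi)) ** 2

if __name__ == "__main__":
    up = np.array([1.0, 0.0])
    down = np.array([0.0, 1.0])
    bang_bang = [4.0, -4.0, 4.0, -4.0]  # arbitrary example protocol
    print(protocol_fidelity(bang_bang, up, down, dt=0.25))
```

An RL agent in this picture would propose the list `protocol` and receive only the returned number as feedback. A many-body version would replace the 2x2 matrices with the full spin-chain Hamiltonian, at a cost that grows exponentially with the number of qubits.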
We investigate simultaneous multi-parameter quantum estimation with time-dependent Hamiltonians. We analytically obtain the maximal quantum Fisher information matrix for two parameters in time-dependent three-level systems. An optimal coherent control scheme is proposed to increase the estimation precision. In an example of a spin-1 particle in a uniformly rotating magnetic field, the optimal coherent Hamiltonians for different parameters can be chosen to be identical. In general, however, the optimal coherent Hamiltonians for different parameters are incompatible. In this situation, we suggest a variance method to obtain the optimal coherent Hamiltonian for estimating multiple parameters simultaneously, and obtain the optimal simultaneous estimation precision for two parameters in a three-level Landau-Zener Hamiltonian.
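For reference, the quantum Fisher information matrix $F$ referred to above enters through the standard multi-parameter quantum Cramér-Rao bound. The notation here is assumed for illustration and is not taken from the abstract: $\hat{x}$ denotes unbiased estimators of the parameters $x = (x_1, x_2)$, $n$ the number of repetitions, and $|\psi_x\rangle$ a pure probe state after the evolution.

```latex
\[
\mathrm{Cov}(\hat{x}) \;\ge\; \frac{1}{n}\, F^{-1},
\qquad
F_{ij} = 4\,\mathrm{Re}\!\left(
  \langle \partial_i \psi_x | \partial_j \psi_x \rangle
  - \langle \partial_i \psi_x | \psi_x \rangle
    \langle \psi_x | \partial_j \psi_x \rangle
\right).
\]
```

Each diagonal entry $F_{ii}$ bounds the precision for one parameter on its own. When the controls that maximize $F_{11}$ and $F_{22}$ differ, a single control Hamiltonian cannot maximize both, which appears to be the situation the variance method is meant to handle.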
We study the quantum evolution of a non-Hermitian qubit realized as a sub-manifold of a dissipative superconducting transmon circuit. Real-time tuning of the system parameters results in non-reciprocal quantum state transfer associated with proximity to the exceptional points of the effective Floquet Hamiltonian. We observe chiral geometric phases accumulated under state transport, verifying the quantum coherent nature of the evolution in the complex energy landscape and distinguishing between coherent and incoherent effects associated with exceptional point encircling. Our work demonstrates an entirely new method for control over quantum state vectors, highlighting new facets of quantum bath engineering enabled through time-periodic (Floquet) non-Hermitian control.