No Arabic abstract
More computational resources (i.e., more physical qubits and qubit connections) on a superconducting quantum processor not only improve the performance but also result in more complex chip architecture with lower yield rate. Optimizing both of them simultaneously is a difficult problem due to their intrinsic trade-off. Inspired by the application-specific design principle, this paper proposes an automatic design flow to generate simplified superconducting quantum processor architecture with negligible performance loss for different quantum programs. Our architecture-design-oriented profiling method identifies program components and patterns critical to both the performance and the yield rate. A follow-up hardware design flow decomposes the complicated design procedure into three subroutines, each of which focuses on different hardware components and cooperates with corresponding profiling results and physical constraints. Experimental results show that our design methodology could outperform IBMs general-purpose design schemes with better Pareto-optimal results.
We have developed a quantum annealing processor, based on an array of tunably coupled rf-SQUID flux qubits, fabricated in a superconducting integrated circuit process [1]. Implementing this type of processor at a scale of 512 qubits and 1472 programmable inter-qubit couplers and operating at ~ 20 mK has required attention to a number of considerations that one may ignore at the smaller scale of a few dozen or so devices. Here we discuss some of these considerations, and the delicate balance necessary for the construction of a practical processor that respects the demanding physical requirements imposed by a quantum algorithm. In particular we will review some of the design trade-offs at play in the floor-planning of the physical layout, driven by the desire to have an algorithmically useful set of inter-qubit couplers, and the simultaneous need to embed programmable control circuitry into the processor fabric. In this context we have developed a new ultra-low power embedded superconducting digital-to-analog flux converters (DACs) used to program the processor with zero static power dissipation, optimized to achieve maximum flux storage density per unit area. The 512 single-stage, 3520 two-stage, and 512 three-stage flux-DACs are controlled with an XYZ addressing scheme requiring 56 wires. Our estimate of on-chip dissipated energy for worst-case reprogramming of the whole processor is ~ 65 fJ. Several chips based on this architecture have been fabricated and operated successfully at our facility, as well as two outside facilities (see for example [2]).
Early generations of superconducting quantum annealing processors have provided a valuable platform for studying the performance of a scalable quantum computing technology. These studies have directly informed our approach to the design of the next-generation processor. Our design priorities for this generation include an increase in per-qubit connectivity, a problem Hamiltonian energy scale similar to previous generations, reduced Hamiltonian specification errors, and an increase in the processor scale that also leaves programming and readout times fixed or reduced. Here we discuss the specific innovations that resulted in a processor architecture that satisfies these design priorities.
As progress is made towards the first generation of error-corrected quantum computers, careful characterization of a processors noise environment will be crucial to designing tailored, low-overhead error correction protocols. While standard coherence metrics and characterization protocols such as T1 and T2, process tomography, and randomized benchmarking are now ubiquitous, these techniques provide only partial information about the dynamic multi-qubit loss channels responsible for processor errors, which can be described more fully by a Lindblad operator in the master equation formalism. Here, we introduce and experimentally demonstrate Lindblad Tomography, a hardware-agnostic characterization protocol for tomographically reconstructing the Hamiltonian and Lindblad operators of a quantum channel from an ensemble of time-domain measurements. Performing Lindblad Tomography on a small superconducting quantum processor, we show that this technique characterizes and accounts for state-preparation and measurement (SPAM) errors and allows one to place strong bounds on the degree of non-Markovianity in the channels of interest. Comparing the results of single- and two-qubit measurements on a superconducting quantum processor, we demonstrate that Lindblad Tomography can also be used to identify and quantify sources of crosstalk on quantum processors, such as the presence of always-on qubit-qubit interactions.
A major hurdle to the deployment of quantum linear systems algorithms and recent quantum simulation algorithms lies in the difficulty to find inexpensive reversible circuits for arithmetic using existing hand coded methods. Motivated by recent advances in reversible logic synthesis, we synthesize arithmetic circuits using classical design automation flows and tools. The combination of classical and reversible logic synthesis enables the automatic design of large components in reversible logic starting from well-known hardware description languages such as Verilog. As a prototype example for our approach we automatically generate high quality networks for the reciprocal $1/x$, which is necessary for quantum linear systems algorithms.
Scaling up to a large number of qubits with high-precision control is essential in the demonstrations of quantum computational advantage to exponentially outpace the classical hardware and algorithmic improvements. Here, we develop a two-dimensional programmable superconducting quantum processor, textit{Zuchongzhi}, which is composed of 66 functional qubits in a tunable coupling architecture. To characterize the performance of the whole system, we perform random quantum circuits sampling for benchmarking, up to a system size of 56 qubits and 20 cycles. The computational cost of the classical simulation of this task is estimated to be 2-3 orders of magnitude higher than the previous work on 53-qubit Sycamore processor [Nature textbf{574}, 505 (2019)]. We estimate that the sampling task finished by textit{Zuchongzhi} in about 1.2 hours will take the most powerful supercomputer at least 8 years. Our work establishes an unambiguous quantum computational advantage that is infeasible for classical computation in a reasonable amount of time. The high-precision and programmable quantum computing platform opens a new door to explore novel many-body phenomena and implement complex quantum algorithms.