No Arabic abstract
The quantum simulation kernel is an important subroutine appearing as a very long gate sequence in many quantum programs. In this paper, we propose Paulihedral, a block-wise compiler framework that can deeply optimize this subroutine by exploiting high-level program structure and optimization opportunities. Paulihedral first employs a new Pauli intermediate representation that can maintain the high-level semantics and constraints in quantum simulation kernels. This naturally enables new large-scale optimizations that are hard to implement at the low gate-level. In particular, we propose two technology-independent instruction scheduling passes, and two technology-dependent code optimization passes which reconcile the circuit synthesis, gate cancellation, and qubit mapping stages of the compiler. Experimental results show that Paulihedral can outperform state-of-the-art compiler infrastructures in a wide-range of applications on both near-term superconducting quantum processors and future fault-tolerant quantum computers.
In this paper, we propose a novel quantum compiler optimization, named relaxed peephole optimization (RPO) for quantum computers. RPO leverages the single-qubit state information that can be determined statically by the compiler. We define that a qubit is in a basis state when, at a given point in time, its state is either in the X-, Y-, or Z-basis. When basis qubits are used as inputs to quantum gates, there exist opportunities for strength reduction, which replaces quantum operations with equivalent but less expensive ones. Compared to the existing peephole optimization for quantum programs, the difference is that our proposed optimization does not require an identical unitary matrix, thereby named `relaxed peephole optimization. We also extend our approach to optimize the quantum gates when some input qubits are in known pure states. Both optimizations, namely the Quantum Basis-state Optimization (QBO) and the Quantum Pure-state Optimization (QPO), are implemented in the IBMs Qiskit transpiler. Our experimental results show that our proposed optimization pass is fast and effective. The circuits optimized with our compiler optimizations obtain up to 18.0% (11.7% on average) fewer CNOT gates and up to 8.2% (7.1% on average) lower transpilation time than that of the most aggressive optimization level in the Qiskit compiler. When running on real quantum computers, the success rates of 3-qubit quantum phase estimation algorithm improve by 2.30X due to the reduced gate counts.
A quantum computer consists of a set of quantum bits upon which operations called gates are applied to perform computations. In order to perform quantum algorithms, physicists would like to design arbitrary gates to apply to quantum bits. However, the physical limitations of the quantum computing device restrict the set of gates that physicists are able to apply. Thus, they must compose a sequence of gates from the permitted gate set, which approximates the gate they wish to apply - a process called quantum compiling. Austin Fowler proposes a method that finds optimal gate sequences in exponential time, but which is tractable for common problems. In this paper, I present several optimizations to this algorithm. While my optimizations do not improve its overall exponential behavior, they improve its empirical performance by one to two orders of magnitude.
Quantum error correction is vital for implementing universal quantum computing. A key component is the encoding circuit that maps a product state of physical qubits into the encoded multipartite entangled logical state. Known methods are typically not optimal either in terms of the circuit depth (and therefore the error burden) or the specifics of the target platform, i.e. the native gates and topology of a system. This work introduces a variational compiler for efficiently finding the encoding circuit of general quantum error correcting codes with given quantum hardware. Focusing on the noisy intermediate scale quantum regime, we show how to systematically compile the circuit following an optimising process seeking to minimise the number of noisy operations that are allowed by the noisy quantum hardware or to obtain the highest fidelity of the encoded state with noisy gates. We demonstrate our method by deriving novel encoders for logic states of the five qubit code and the seven qubit Steane code. We describe ways to augment the discovered circuits with error detection. Our method is applicable quite generally for compiling the encoding circuits of quantum error correcting codes.
To bridge the gap between limited hardware access and the huge demand for experiments for Noisy Intermediate-Scale Quantum (NISQ) computing system study, a simulator which can capture the modeling of both the quantum processor and its classical control system to realize early-stage evaluation and design space exploration, is naturally invoked but still missing. This paper presents SANQ, a Simulation framework for Architecting NISQ computing system. SANQ consists of two components, 1) an optimized noisy quantum computing (QC) simulator with flexible error modeling accelerated by eliminating redundant computation, and 2) an architectural simulation infrastructure to construct behavior models for evaluating the control systems. SANQ is validated with existing NISQ quantum processor and control systems to ensure simulation accuracy. It can capture the variance on the QC device and simulate the timing behavior precisely (<1% and 10% error for various real control systems). Several potential applications are proposed to show that SANQ could benefit the future design of NISQ compiler, architecture, etc.
This article presents a powerful algorithmic framework for big data optimization, called the Block Successive Upper bound Minimization (BSUM). The BSUM includes as special cases many well-known methods for analyzing massive data sets, such as the Block Coordinate Descent (BCD), the Convex-Concave Procedure (CCCP), the Block Coordinate Proximal Gradient (BCPG) method, the Nonnegative Matrix Factorization (NMF), the Expectation Maximization (EM) method and so on. In this article, various features and properties of the BSUM are discussed from the viewpoint of design flexibility, computational efficiency, parallel/distributed implementation and the required communication overhead. Illustrative examples from networking, signal processing and machine learning are presented to demonstrate the practical performance of the BSUM framework