No Arabic abstract
We optimize the area and latency of Shors factoring while simultaneously improving fault tolerance through: (1) balancing the use of ancilla generators, (2) aggressive optimization of error correction, and (3) tuning the core adder circuits. Our custom CAD flow produces detailed layouts of the physical components and utilizes simulation to analyze circuits in terms of area, latency, and success probability. We introduce a metric, called ADCR, which is the probabilistic equivalent of the classic Area-Delay product. Our error correction optimization can reduce ADCR by an order of magnitude or more. Contrary to conventional wisdom, we show that the area of an optimized quantum circuit is not dominated exclusively by error correction. Further, our adder evaluation shows that quantum carry-lookahead adders (QCLA) beat ripple-carry adders in ADCR, despite being larger and more complex. We conclude with what we believe is one of most accurate estimates of the area and latency required for 1024-bit Shors factorization: 7659 mm$^{2}$ for the smallest circuit and $6 * 10^8$ seconds for the fastest circuit.
The quantum multicomputer consists of a large number of small nodes and a qubus interconnect for creating entangled state between the nodes. The primary metric chosen is the performance of such a system on Shors algorithm for factoring large numbers: specifically, the quantum modular exponentiation step that is the computational bottleneck. This dissertation introduces a number of optimizations for the modular exponentiation. My algorithms reduce the latency, or circuit depth, to complete the modular exponentiation of an n-bit number from O(n^3) to O(n log^2 n) or O(n^2 log n), depending on architecture. Calculations show that these algorithms are one million times and thirteen thousand times faster, when factoring a 6,000-bit number, depending on architecture. Extending to the quantum multicomputer, five different qubus interconnect topologies are considered, and two forms of carry-ripple adder are found to be the fastest for a wide range of performance parameters. The links in the quantum multicomputer are serial; parallel links would provide only very modest improvements in system reliability and performance. Two levels of the Steane [[23,1,7]] error correction code will adequately protect our data for factoring a 1,024-bit number even when the qubit teleportation failure rate is one percent.
Quantum information processing and its associated technologies has reached an interesting and timely stage in their development where many different experiments have been performed establishing the basic building blocks. The challenge moving forward is to scale up to larger sized quantum machines capable of performing tasks not possible today. This raises a number of interesting questions like: How big will these machines need to be? how many resources will they consume? This needs to be urgently addressed. Here we estimate the resources required to execute Shors factoring algorithm on a distributed atom-optics quantum computer architecture. We determine the runtime and requisite size of the quantum computer as a function of the problem size and physical error rate. Our results suggest that once experimental accuracy reaches levels below the fault-tolerant threshold, further optimisation of computational performance and resources is largely an issue of how the algorithm and circuits are implemented, rather than the physical quantum hardware
We study the results of a compiled version of Shors factoring algorithm on the ibmqx5 superconducting chip, for the particular case of $N=15$, $21$ and $35$. The semi-classical quantum Fourier transform is used to implement the algorithm with only a small number of physical qubits and the circuits are designed to reduce the number of gates to the minimum. We use the square of the statistical overlap to give a quantitative measure of the similarity between the experimentally obtained distribution of phases and the predicted theoretical distribution one for different values of the period. This allows us to assign a period to the experimental data without the use of the continued fraction algorithm. A quantitative estimate of the error in our assignment of the period is then given by the overlap coefficient.
We report a proof-of-concept demonstration of a quantum order-finding algorithm for factoring the integer 21. Our demonstration involves the use of a compiled version of the quantum phase estimation routine, and builds upon a previous demonstration by Martin-L{o}pez et al. in Nature Photonics 6, 773 (2012). We go beyond this work by using a configuration of approximate Toffoli gates with residual phase shifts, which preserves the functional correctness and allows us to achieve a complete factoring of N=21. We implemented the algorithm on IBM quantum processors using only 5 qubits and successfully verified the presence of entanglement between the control and work register qubits, which is a necessary condition for the algorithms speedup in general. The techniques we employ may be useful in carrying out Shors algorithm for larger integers, or other algorithms in systems with a limited number of noisy qubits.
Quantum computation promises significant computational advantages over classical computation for some problems. However, quantum hardware suffers from much higher error rates than in classical hardware. As a result, extensive quantum error correction is required to execute a useful quantum algorithm. The decoder is a key component of the error correction scheme whose role is to identify errors faster than they accumulate in the quantum computer and that must be implemented with minimum hardware resources in order to scale to the regime of practical applications. In this work, we consider surface code error correction, which is the most popular family of error correcting codes for quantum computing, and we design a decoder micro-architecture for the Union-Find decoding algorithm. We propose a three-stage fully pipelined hardware implementation of the decoder that significantly speeds up the decoder. Then, we optimize the amount of decoding hardware required to perform error correction simultaneously over all the logical qubits of the quantum computer. By sharing resources between logical qubits, we obtain a 67% reduction of the number of hardware units and the memory capacity is reduced by 70%. Moreover, we reduce the bandwidth required for the decoding process by a factor at least 30x using low-overhead compression algorithms. Finally, we provide numerical evidence that our optimized micro-architecture can be executed fast enough to correct errors in a quantum computer.