As current Noisy Intermediate-Scale Quantum (NISQ) devices suffer from decoherence errors, any delay in the instruction execution of the quantum control microarchitecture can lead to the loss of quantum information and incorrect computation results. Hence, it is crucial for the control microarchitecture to issue quantum operations to the Quantum Processing Unit (QPU) in time. As in classical microarchitecture, parallelism in quantum programs needs to be exploited for speedup. However, three challenges emerge in the quantum scenario: 1) quantum feedback control can introduce significant pipeline stall latency; 2) timing control is required for all quantum operations; 3) the QPU requires a deterministic operation supply to prevent the accumulation of quantum errors. In this paper, we propose a novel control microarchitecture design to exploit Circuit Level Parallelism (CLP) and Quantum Operation Level Parallelism (QOLP). Firstly, we develop a multiprocessor architecture to exploit CLP, which supports dynamic scheduling of different sub-circuits. This architecture can handle parallel feedback control and minimize the potential overhead that disrupts timing control. Secondly, we propose a Quantum Superscalar approach that exploits QOLP by efficiently executing massive quantum instructions in parallel. Both methods issue quantum operations to the QPU deterministically. In a benchmark test of Shor syndrome measurement, a six-core implementation of our proposal achieves up to a 2.59$\times$ speedup compared with a single core. For various canonical quantum computing algorithms, our superscalar approach achieves an average 4.04$\times$ improvement over a baseline design. Finally, we perform a simultaneous randomized benchmarking (simRB) experiment on a real QPU using the proposed microarchitecture for validation.
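A minimal sketch of the CLP idea: sub-circuits acting on disjoint qubit sets may be dispatched to different cores and run concurrently. This greedy scheduler and all names in it are illustrative assumptions, not the paper's actual microarchitecture, and classical feedback dependencies are not modeled.

```python
# Hypothetical sketch: greedy list scheduling of sub-circuits onto cores so
# that concurrently running sub-circuits never share a qubit. The
# disjoint-qubit constraint also keeps operations on the same qubits ordered.

def schedule_subcircuits(subcircuits, n_cores):
    """Assign each (name, qubit set, duration) sub-circuit to rounds in which
    no two concurrent sub-circuits share a qubit and at most n_cores run."""
    rounds = []
    for name, qubits, duration in subcircuits:
        placed = False
        for rnd in rounds:
            busy = set().union(*(q for _, q, _ in rnd))
            if len(rnd) < n_cores and not (qubits & busy):
                rnd.append((name, qubits, duration))
                placed = True
                break
        if not placed:
            rounds.append([(name, qubits, duration)])
    return rounds

subcircuits = [                        # durations in arbitrary time units
    ("syndrome_A", {0, 1, 2}, 40),
    ("syndrome_B", {3, 4, 5}, 40),
    ("correction", {0, 3}, 20),
    ("readout",    {1, 4}, 30),
]
rounds = schedule_subcircuits(subcircuits, n_cores=2)
serial = sum(d for _, _, d in subcircuits)
parallel = sum(max(d for _, _, d in rnd) for rnd in rounds)
print([[n for n, _, _ in r] for r in rounds])
print(f"speedup over serial issue: {serial / parallel:.2f}x")
```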
270 - Hongqiang Du, Lei Xie 2021
One-shot voice conversion has received significant attention since only one utterance each from the source speaker and the target speaker is required. Moreover, the source and target speakers do not need to be seen during training. However, available one-shot voice conversion approaches are not stable for unseen speakers, as the speaker embedding extracted from a single utterance of an unseen speaker is not reliable. In this paper, we propose a deep discriminative speaker encoder to extract speaker embeddings from one utterance more effectively. Specifically, the speaker encoder first integrates a residual network and a squeeze-and-excitation network to extract discriminative speaker information at the frame level by modeling frame-wise and channel-wise interdependence in the features. An attention mechanism is then introduced to further emphasize speaker-related information by assigning different weights to the frame-level speaker information. Finally, a statistics pooling layer aggregates the weighted frame-level speaker information into an utterance-level speaker embedding. Experimental results demonstrate that the proposed speaker encoder improves the robustness of one-shot voice conversion for unseen speakers and outperforms baseline systems in terms of speech quality and speaker similarity.
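A minimal PyTorch sketch of the described pipeline: frame-level features from a residual + squeeze-and-excitation stack, frame-wise attention weights, then weighted statistics pooling into an utterance-level embedding. Layer sizes and structure are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class SEResBlock(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv1d(channels, channels, 3, padding=1),
        )
        # Squeeze-and-excitation: channel-wise gates from global statistics.
        self.se = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                       # x: (batch, channels, frames)
        h = self.conv(x)
        gate = self.se(h.mean(dim=2))           # (batch, channels)
        return torch.relu(x + h * gate.unsqueeze(2))

class SpeakerEncoder(nn.Module):
    def __init__(self, n_mels=80, channels=256, emb_dim=192):
        super().__init__()
        self.frontend = nn.Conv1d(n_mels, channels, 5, padding=2)
        self.blocks = nn.Sequential(SEResBlock(channels), SEResBlock(channels))
        self.attn = nn.Conv1d(channels, 1, 1)   # scalar score per frame
        self.proj = nn.Linear(2 * channels, emb_dim)

    def forward(self, mel):                     # mel: (batch, n_mels, frames)
        h = self.blocks(torch.relu(self.frontend(mel)))
        w = torch.softmax(self.attn(h), dim=2)  # attention over frames
        mu = (h * w).sum(dim=2)                 # weighted mean
        var = (h.pow(2) * w).sum(dim=2) - mu.pow(2)
        std = var.clamp(min=1e-8).sqrt()        # weighted standard deviation
        return self.proj(torch.cat([mu, std], dim=1))

emb = SpeakerEncoder()(torch.randn(2, 80, 200))
print(emb.shape)  # torch.Size([2, 192])
```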
84 - Xiong Wang, Sining Sun, Lei Xie 2021
End-to-end models are favored in automatic speech recognition (ASR) because of their simplified system structure and superior performance. Among these models, Transformer and Conformer have achieved state-of-the-art recognition accuracy, in which self-attention plays a vital role in capturing important global information. However, the time and memory complexity of self-attention grows quadratically with the length of the sentence. In this paper, a prob-sparse self-attention mechanism is introduced into Conformer to sparsify the computation of self-attention, in order to accelerate inference and reduce memory consumption. Specifically, we adopt a Kullback-Leibler divergence based sparsity measurement for each query to decide whether to compute the attention function on that query. With the prob-sparse attention mechanism, we achieve an 8% to 45% inference speed-up and a 15% to 45% memory usage reduction in the self-attention module of the Conformer Transducer while maintaining the same level of error rate.
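A minimal sketch of prob-sparse self-attention in the Informer-style formulation the abstract alludes to: each query's sparsity score max_j(s_ij) - mean_j(s_ij) upper-bounds its KL divergence from the uniform attention distribution; only the top-u queries compute full attention, the rest fall back to the mean of the values. For clarity this sketch scores against all keys, whereas a practical implementation samples a key subset to keep the measurement itself cheap.

```python
import torch

def prob_sparse_attention(q, k, v, u):
    """q, k, v: (frames, dim); u: number of 'active' queries kept."""
    d = q.shape[-1]
    scores = q @ k.T / d ** 0.5                   # (frames, frames)
    # Sparsity measurement: queries whose score distribution is far from
    # uniform carry the dominant attention mass.
    sparsity = scores.max(dim=1).values - scores.mean(dim=1)
    top = sparsity.topk(u).indices
    out = v.mean(dim=0).expand(q.shape[0], -1).clone()  # lazy default output
    attn = torch.softmax(scores[top], dim=1)      # full attention for top-u
    out[top] = attn @ v
    return out

q = k = v = torch.randn(100, 64)
y = prob_sparse_attention(q, k, v, u=25)   # ~75% of queries skip the softmax
print(y.shape)  # torch.Size([100, 64])
```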
107 - Lei Xie, Zishu He, Jun Tong 2021
This paper considers the regularized estimation of covariance matrices (CM) of high-dimensional (compound) Gaussian data for minimum variance distortionless response (MVDR) beamforming. Linear shrinkage is applied to improve the accuracy and condition number of the CM estimate in low-sample-support cases. We focus on data-driven techniques that automatically choose the linear shrinkage factors for the shrinkage sample covariance matrix ($\text{S}^2$CM) and the shrinkage Tyler's estimator (STE) by exploiting cross-validation (CV). We propose leave-one-out cross-validation (LOOCV) choices of the shrinkage factors that optimize the beamforming performance, referred to as $\text{S}^2$CM-CV and STE-CV. The (weighted) out-of-sample output power of the beamformer is chosen as a proxy for beamformer performance, and concise expressions for the LOOCV cost function are derived to allow fast optimization. For the large-system regime, asymptotic approximations of the LOOCV cost functions are derived, yielding $\text{S}^2$CM-AE and STE-AE. In general, the proposed algorithms achieve near-oracle performance in choosing the linear shrinkage factors for MVDR beamforming. Simulation results are provided to validate the proposed methods.
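A minimal numpy sketch of LOOCV selection of the linear shrinkage factor for the sample covariance matrix in MVDR beamforming. The brute-force refitting loop below only restates the idea; the paper's contribution includes concise closed-form LOOCV costs (and large-system approximations) that avoid refitting per sample. The steering vector, sizes, and grid are placeholder assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 16, 24                                  # dimension, snapshots (low support)
a = np.ones(d) / np.sqrt(d)                    # assumed-known steering vector
X = (rng.standard_normal((d, n)) + 1j * rng.standard_normal((d, n))) / np.sqrt(2)

def shrunk_cov(X, rho):
    """Linear shrinkage of the sample covariance toward a scaled identity."""
    S = X @ X.conj().T / X.shape[1]
    return (1 - rho) * S + rho * np.trace(S).real / d * np.eye(d)

def loocv_power(rho):
    """Average out-of-sample MVDR output power (the performance proxy)."""
    p = 0.0
    for i in range(n):
        Xi = np.delete(X, i, axis=1)           # leave snapshot i out
        R = shrunk_cov(Xi, rho)
        w = np.linalg.solve(R, a)
        w /= a.conj() @ w                      # distortionless constraint
        p += abs(w.conj() @ X[:, i]) ** 2      # power on the held-out snapshot
    return p / n

grid = np.linspace(0.01, 0.99, 25)
rho_cv = grid[np.argmin([loocv_power(r) for r in grid])]
print(f"LOOCV-selected shrinkage factor: {rho_cv:.2f}")
```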
68 - Lei Xie, Zishu He, Jun Tong 2021
This paper investigates regularized estimation of Kronecker-structured covariance matrices (CM) for complex elliptically symmetric (CES) data. To obtain a well-conditioned estimate of the CM, we add Kullback-Leibler divergence penalty terms to the negative log-likelihood function of the associated complex angular Gaussian (CAG) distribution. This is shown to be equivalent to regularizing Tyler's fixed-point equations by shrinkage. A sufficient condition for the existence of the solution is discussed. An iterative algorithm is applied to solve the resulting fixed-point iterations, and its convergence is proved. To address the critical problem of tuning the shrinkage factors, we then introduce three methods that exploit oracle approximating shrinkage (OAS) and cross-validation (CV). When the training samples are limited, the proposed estimator, referred to as the robust shrinkage Kronecker estimator (RSKE), achieves better performance than several existing methods. Simulations are conducted to validate the proposed estimator and demonstrate its high performance.
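A minimal numpy sketch of a shrinkage-regularized Tyler fixed-point iteration. For brevity this is the plain (unstructured) estimator; the paper's RSKE additionally imposes Kronecker structure on the CM and tunes the shrinkage factor via OAS or cross-validation, neither of which is shown here.

```python
import numpy as np

def shrinkage_tyler(X, rho, n_iter=50, tol=1e-8):
    """X: (d, n) complex samples; rho in (0, 1]: shrinkage toward identity."""
    d, n = X.shape
    sigma = np.eye(d, dtype=complex)
    for _ in range(n_iter):
        # Per-sample weights x_i^H Sigma^{-1} x_i from the fixed-point map.
        q = np.einsum('in,in->n', X.conj(), np.linalg.solve(sigma, X)).real
        scatter = (d / n) * (X / q) @ X.conj().T
        new = (1 - rho) * scatter + rho * np.eye(d)
        new *= d / np.trace(new).real          # trace normalization tr = d
        if np.linalg.norm(new - sigma) < tol:
            return new
        sigma = new
    return sigma

rng = np.random.default_rng(1)
d, n = 12, 20
X = rng.standard_normal((d, n)) + 1j * rng.standard_normal((d, n))
R = shrinkage_tyler(X, rho=0.3)
print(np.trace(R).real, np.linalg.cond(R))     # trace d, moderate conditioning
```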
We propose a novel training scheme that optimizes a voice conversion network with a speaker identity loss function. The training scheme minimizes not only the frame-level spectral loss but also the speaker identity loss. We introduce a cycle consistency loss that constrains the converted speech to maintain the same speaker identity as the reference speech at the utterance level. While the proposed training scheme is applicable to any voice conversion network, we formulate the study under the average model voice conversion framework in this paper. Experiments conducted on the CMU-ARCTIC and CSTR-VCTK corpora confirm that the proposed method outperforms baseline methods in terms of speaker similarity.
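A minimal PyTorch sketch of the two-level objective: a frame-level spectral loss plus an utterance-level speaker identity term that pulls the converted speech's speaker embedding toward the reference. The frozen stand-in encoder, the L1 spectral loss, and `lambda_id` are placeholder assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def vc_loss(converted_mel, target_mel, speaker_encoder, ref_emb, lambda_id=1.0):
    spectral = F.l1_loss(converted_mel, target_mel)        # frame level
    emb = speaker_encoder(converted_mel)                   # utterance level
    identity = 1 - F.cosine_similarity(emb, ref_emb).mean()
    return spectral + lambda_id * identity

# Dummy stand-in speaker encoder so the sketch runs end to end.
enc = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(80 * 100, 192))
mel_c = torch.randn(4, 80, 100, requires_grad=True)
mel_t = torch.randn(4, 80, 100)
loss = vc_loss(mel_c, mel_t, enc, ref_emb=torch.randn(4, 192))
loss.backward()   # gradients flow through both terms
print(loss.item())
```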
120 - Shan Yang, Lei Xie, Xiao Chen 2017
In this paper, we aim at improving the performance of synthesized speech in statistical parametric speech synthesis (SPSS) based on a generative adversarial network (GAN). In particular, we propose a novel architecture combining the traditional acoustic loss function and the GAN's discriminative loss under a multi-task learning (MTL) framework. The mean squared error (MSE) is usually used to estimate the parameters of deep neural networks, but it only considers the numerical difference between the raw audio and the synthesized one. To mitigate this problem, we introduce the GAN as a second task that determines whether the input is natural speech with specific conditions. In this MTL framework, the MSE optimization improves the stability of the GAN, and at the same time the GAN produces samples whose distribution is closer to that of natural speech. Listening tests show that the multi-task architecture generates more natural speech that better satisfies human perception than conventional methods.
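A minimal PyTorch sketch of the multi-task generator objective the abstract describes: MSE against the target acoustic parameters plus an adversarial term from a discriminator judging naturalness. The network bodies, the feature dimension, and the weight `alpha` are placeholders, not the paper's configuration.

```python
import torch
import torch.nn.functional as F

def generator_mtl_loss(pred, target, discriminator, alpha=0.1):
    mse = F.mse_loss(pred, target)                   # task 1: numerical fit
    logits = discriminator(pred)
    adv = F.binary_cross_entropy_with_logits(        # task 2: fool the critic
        logits, torch.ones_like(logits))
    return mse + alpha * adv

# Dummy discriminator over a placeholder acoustic-feature dimension.
disc = torch.nn.Sequential(torch.nn.Linear(187, 64), torch.nn.ReLU(),
                           torch.nn.Linear(64, 1))
pred = torch.randn(8, 187, requires_grad=True)
target = torch.randn(8, 187)
print(generator_mtl_loss(pred, target, disc).item())
```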
A vast variety of real-life networks display the ubiquitous presence of the scale-free phenomenon and the small-world effect, both of which play a significant role in the dynamical processes running on networks. Although various dynamical processes have been investigated on scale-free small-world networks, analytical research on random walks on such networks is much scarcer. In this paper, we study analytically the scaling of the mean first-passage time (MFPT) for random walks on scale-free small-world networks. To this end, we first map the classical Koch fractal to a network, called the Koch network. According to this mapping, we present an iterative algorithm for generating the Koch network, based on which we derive closed-form expressions for the relevant topological features, such as the degree distribution, clustering coefficient, average path length, and degree correlations. The obtained solutions show that the Koch network exhibits scale-free behavior and the small-world effect. Then, we investigate standard random walks and the trapping issue on the Koch network. Through recurrence relations derived from the structure of the Koch network, we obtain the exact scaling of the MFPT. We show that in the infinite network order limit, the MFPT grows linearly with the number of nodes in the network. The analytical results are corroborated by direct extensive numerical calculations. In addition, we also determine the scaling efficiency exponents characterizing random walks on the Koch network.
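A minimal numpy sketch of the kind of direct numerical check such trapping results rest on: for an unbiased random walk with one trap, the MFPTs satisfy $T_i = 1 + \frac{1}{\deg(i)}\sum_{j \sim i} T_j$ with $T_{\text{trap}} = 0$, a linear system. A small ring graph stands in here; the Koch network construction itself is omitted.

```python
import numpy as np

def mean_first_passage_times(adj, trap):
    """adj: symmetric 0/1 adjacency matrix; returns MFPT to `trap` per node."""
    n = adj.shape[0]
    keep = [i for i in range(n) if i != trap]
    P = adj / adj.sum(axis=1, keepdims=True)       # transition probabilities
    A = np.eye(n - 1) - P[np.ix_(keep, keep)]      # (I - P_reduced) T = 1
    T = np.zeros(n)
    T[keep] = np.linalg.solve(A, np.ones(n - 1))
    return T

n = 10                                             # ring of n nodes, trap at 0
adj = np.zeros((n, n))
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1
T = mean_first_passage_times(adj, trap=0)
print(T)   # matches the known ring-graph result T_i = i * (n - i)
```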
It is known that the heterogeneity of scale-free networks helps enhance the efficiency of trapping processes performed on them. In this paper, we show that transport efficiency is much lower in a fractal scale-free network than in non-fractal networks. To this end, we examine a simple random walk with a fixed trap at a given position on a fractal scale-free network. We calculate analytically the mean first-passage time (MFPT) as a measure of the efficiency of the trapping process, and obtain a closed-form expression for the MFPT, which agrees with direct numerical calculations. We find that, in the limit of a large network order $V$, the MFPT $\langle T \rangle$ behaves superlinearly as $\langle T \rangle \sim V^{3/2}$, with an exponent 3/2 much larger than 1, in sharp contrast to the scaling $\langle T \rangle \sim V^{\theta}$ with $\theta \leq 1$ obtained previously for non-fractal scale-free networks. Our results indicate that the degree distribution of scale-free networks is not sufficient to characterize trapping processes taking place on them. Since various real-world networks are simultaneously scale-free and fractal, our results may shed light on the understanding of trapping processes running on real-life systems.
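A minimal sketch of how a scaling exponent $\theta$ in $\langle T \rangle \sim V^{\theta}$ could be read off from numerical MFPT data: a least-squares fit in log-log space. The $(V, T)$ pairs below are synthetic placeholders constructed to follow the superlinear case, not the paper's data.

```python
import numpy as np

V = np.array([100, 400, 1600, 6400])   # network orders (synthetic)
T = 0.8 * V ** 1.5                     # synthetic MFPTs following V^{3/2}
theta, log_c = np.polyfit(np.log(V), np.log(T), 1)
print(f"fitted exponent: {theta:.2f}")  # ~1.50, the superlinear regime
```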
Explicit determination of the mean first-passage time (MFPT) for the trapping problem on complex media is a theoretical challenge. In this paper, we study random walks on the Apollonian network, which is simultaneously scale-free and small-world, with a trap fixed at a given hub node (i.e., the node with the highest degree). We obtain a precise analytic expression for the MFPT that is confirmed by direct numerical calculations. In the large system size limit, the MFPT approximately grows as a power-law function of the number of nodes, with an exponent much less than 1, which is significantly different from the scaling for regular networks and fractals such as regular lattices, Sierpinski fractals, the T-graph, and complete graphs. The Apollonian network is the most efficient configuration for transport by diffusion among all previously studied structures.