Non-autoregressive (NAR) transformer models have been studied intensively in automatic speech recognition (ASR), and many NAR transformer models use the causal mask to limit token dependencies. However, the causal mask is designed for the left-to-right decoding process of the non-parallel autoregressive (AR) transformer, which is inappropriate for the parallel NAR transformer since it ignores the right-to-left contexts. Some models have been proposed that utilize right-to-left contexts with an extra decoder, but these methods increase the model complexity. To tackle the above problems, we propose a new non-autoregressive transformer with a unified bidirectional decoder (NAT-UBD), which can simultaneously utilize left-to-right and right-to-left contexts. However, direct use of bidirectional contexts will cause information leakage, which means the decoder output can be affected by the character information from the input of the same position. To avoid information leakage, we propose a novel attention mask and modify the vanilla query, key, and value matrices for NAT-UBD. Experimental results verify that NAT-UBD can achieve character error rates (CERs) of 5.0%/5.5% on the Aishell1 dev/test sets, outperforming all previous NAR transformer models. Moreover, NAT-UBD can run 49.8x faster than the AR transformer baseline when decoding in a single step.
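The difference between a causal mask and a bidirectional mask can be illustrated with a small sketch. The abstract does not specify the NAT-UBD mask, so `bidirectional_no_self_mask` below is a hypothetical stand-in that simply hides each position from itself; it is not the authors' mask, and it does not attempt the modified query, key, and value matrices that the abstract also mentions.

```python
def causal_mask(n):
    """Position i may attend to positions j <= i (left-to-right only)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def bidirectional_no_self_mask(n):
    """Hypothetical mask (an assumption, not the paper's): position i may
    attend to every position j except j == i, so both left and right
    contexts are visible but the token at the same position is hidden."""
    return [[j != i for j in range(n)] for i in range(n)]


def show(mask):
    """Render a boolean mask as rows of 1 (visible) and 0 (blocked)."""
    return "\n".join("".join("1" if a else "0" for a in row) for row in mask)


if __name__ == "__main__":
    print(show(causal_mask(4)))
    print()
    print(show(bidirectional_no_self_mask(4)))
```

In the causal mask every entry above the diagonal is blocked, which is the right-to-left context the abstract says is ignored. The hypothetical mask opens those entries and blocks only the diagonal.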
Xuanting Ji, Yan Liu, Ya-Wen Sun 2021
We present effective field theories for the weakly coupled Weyl-$\mathrm{Z}_2$ semimetal, as well as the holographic realization for the strongly coupled case. In both cases, the anomalous systems have both the chiral anomaly and the $\mathrm{Z}_2$ anomaly and possess topological quantum phase transitions from the Weyl-$\mathrm{Z}_2$ semimetal phases to partly or fully topologically trivial phases. We find that the topological phase transition is characterized by the anomalous transport parameters, i.e. the anomalous Hall conductivity and the $\mathrm{Z}_2$ anomalous Hall conductivity. These two parameters are nonzero in the Weyl-$\mathrm{Z}_2$ semimetal phase and vanish in the topologically trivial phases. In the holographic case, the different behavior between the two anomalous transport coefficients is discussed. Our work reveals the novel phase structure of the strongly interacting Weyl-$\mathrm{Z}_2$ semimetal with two pairs of nodes.
We present new estimators for the statistical analysis of the dependence of the mean gap time length between consecutive recurrent events on a set of explanatory random variables, in the presence of right censoring. The dependence is expressed through regression-like and overdispersion parameters, estimated via conditional estimating equations. The mean and variance of the length of each gap time, conditioned on the observed history of prior events and other covariates, are known functions of parameters and covariates. Under certain conditions on censoring, we construct normalized estimating functions that are asymptotically unbiased and contain only observed data. We discuss the existence, consistency and asymptotic normality of a sequence of estimators of the parameters, which are roots of these estimating equations. Simulations suggest that our estimators could be used successfully with a relatively small sample size in a study of short duration.
The data scarcity problem in electroencephalography (EEG) based affective computing makes it difficult to build an effective model with high accuracy and stability using machine learning algorithms, especially deep learning models. Data augmentation has recently achieved considerable performance improvement for deep learning models: increased accuracy and stability, and reduced over-fitting. In this paper, we propose a novel data augmentation framework, namely Generative Adversarial Network-based Self-supervised Data Augmentation (GANSER). As the first to combine adversarial training with self-supervised learning for EEG-based emotion recognition, the proposed framework can generate high-quality and high-diversity simulated EEG samples. In particular, we utilize adversarial training to learn an EEG generator and force the generated EEG signals to approximate the distribution of real samples, ensuring the quality of augmented samples. A transformation function is employed to mask parts of EEG signals and force the generator to synthesize potential EEG signals based on the remaining parts, to produce a wide variety of samples. The masking possibility during transformation is introduced as prior knowledge to guide the extraction of distinguishable features for simulated EEG signals and to generalize the classifier to the augmented sample space. Finally, extensive experiments demonstrate that our proposed method can improve emotion recognition performance and achieve state-of-the-art results.
We consider sparse principal component analysis for high-dimensional stationary processes. The standard principal component analysis performs poorly when the dimension of the process is large. We establish oracle inequalities for penalized principal component estimators for processes including heavy-tailed time series. The rate of convergence of the estimators is established. We also elucidate the theoretical rate for choosing the tuning parameter in penalized estimators. The performance of the sparse principal component analysis is demonstrated by numerical simulations. The utility of the sparse principal component analysis for time series data is exemplified by an application to average temperature data.
Existing segmentation techniques require high-fidelity images as input to perform semantic segmentation. Since the segmentation results consist mostly of edge information, which is much less than the information in the acquired images, this throughput gap wastes both hardware and software resources. In this letter, we report an image-free single-pixel segmentation technique. The technique combines structured illumination and single-pixel detection to efficiently sample and multiplex the scene's segmentation information into compressed one-dimensional measurements. The illumination patterns are optimized together with the subsequent reconstruction neural network, which directly infers segmentation maps from the single-pixel measurements. The end-to-end encoding-and-decoding learning framework enables optimized illumination with a corresponding network, which provides both high acquisition and segmentation efficiency. Both simulation and experimental results validate that accurate segmentation can be achieved using two orders of magnitude less input data. When the sampling ratio is 1%, the Dice coefficient reaches above 80% and the pixel accuracy reaches above 96%. We envision that this image-free segmentation technique can be widely applied in various resource-limited platforms, such as UAVs and unmanned vehicles, that require real-time sensing.
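The two metrics quoted above have standard definitions, which the following minimal sketch computes for binary masks. The function names and the flat 0/1 list representation are assumptions made for illustration; the abstract does not say how the authors computed or averaged these scores.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for flat 0/1 masks.

    Returns 1.0 when both masks are empty (a common convention, assumed here).
    """
    intersection = sum(p and t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 if total == 0 else 2.0 * intersection / total


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label equals the target label."""
    correct = sum(p == t for p, t in zip(pred, target))
    return correct / len(pred)


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(dice_coefficient(predicted, reference))
    print(pixel_accuracy(predicted, reference))
```

Pixel accuracy counts background pixels as correct predictions, while Dice only rewards overlap on the foreground, which is one reason the two figures in the abstract can differ so much.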
Hindsight experience replay (HER) is a goal relabelling technique typically used with off-policy deep reinforcement learning algorithms to solve goal-oriented tasks; it is well suited to robotic manipulation tasks that deliver only sparse rewards. In HER, both trajectories and transitions are sampled uniformly for training. However, not all of the agent's experiences contribute equally to training, and so naive uniform sampling may lead to inefficient learning. In this paper, we propose diversity-based trajectory and goal selection with HER (DTGSH). Firstly, trajectories are sampled according to the diversity of the goal states as modelled by determinantal point processes (DPPs). Secondly, transitions with diverse goal states are selected from the trajectories by using k-DPPs. We evaluate DTGSH on five challenging robotic manipulation tasks in simulated robot environments, where we show that our method can learn more quickly and reach higher performance than other state-of-the-art approaches on all tasks.
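A determinantal point process scores a set of items by the determinant of a similarity kernel restricted to that set, so sets of dissimilar items score higher. The sketch below illustrates that idea for goal vectors. The cosine-similarity kernel and every function name are assumptions chosen for illustration; the abstract does not describe the kernel or the sampling procedure that DTGSH uses.

```python
import math


def normalize(v):
    """Scale a nonzero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def gram_matrix(vectors):
    """Cosine-similarity kernel (an assumed choice) between all pairs."""
    unit = [normalize(v) for v in vectors]
    return [[sum(a * b for a, b in zip(u, w)) for w in unit] for u in unit]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def diversity(goals):
    """DPP-style diversity score: det of the kernel over the goal set."""
    return determinant(gram_matrix(goals))


if __name__ == "__main__":
    print(diversity([[1.0, 0.0], [0.0, 1.0]]))  # orthogonal goals
    print(diversity([[1.0, 0.0], [1.0, 0.0]]))  # identical goals
```

With this kernel, two orthogonal goals give the largest score and two identical goals give zero, which matches the intuition that trajectories visiting varied goal states should be preferred.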
Many scientific conferences employ a two-phase paper review process, where some papers are assigned additional reviewers after the initial reviews are submitted. Many conferences also design and run experiments on their paper review process, where some papers are assigned reviewers who provide reviews under an experimental condition. In this paper, we consider the question: how should reviewers be divided between phases or conditions in order to maximize total assignment similarity? We make several contributions towards answering this question. First, we prove that when the set of papers requiring additional review is unknown, a simplified variant of this problem is NP-hard. Second, we empirically show that across several datasets pertaining to real conference data, dividing reviewers between phases/conditions uniformly at random allows an assignment that is nearly as good as the oracle optimal assignment. This uniformly random choice is practical for both the two-phase and conference experiment design settings. Third, we provide explanations of this phenomenon by providing theoretical bounds on the suboptimality of this random strategy under certain natural conditions. From these easily interpretable conditions, we provide actionable insights to conference program chairs about whether a random reviewer split is suitable for their conference.
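The uniformly random division described above is simple to state in code. The sketch below is only an illustration under assumed names (`split_reviewers`, `phase_one_fraction`, `seed`); it does not include the similarity-maximizing paper assignment that the paper evaluates after the split.

```python
import random


def split_reviewers(reviewers, phase_one_fraction=0.5, seed=None):
    """Divide reviewers between two phases (or conditions) uniformly at random.

    Returns (phase_one, phase_two). The fraction and seed parameters are
    illustrative assumptions, not values taken from the paper.
    """
    rng = random.Random(seed)
    shuffled = list(reviewers)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * phase_one_fraction)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    pool = [f"reviewer_{i}" for i in range(10)]
    first, second = split_reviewers(pool, seed=0)
    print(first)
    print(second)
```

Passing a fixed `seed` makes the split reproducible, which can help when the division has to be audited or repeated.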
We study the behavior of black hole singularities across the Hawking-Page phase transitions, uncovering the possible connection between the physics inside and outside the horizon. We focus on the case of spacelike singularities in Einstein-scalar theory which are of the Kasner form. We find that the Kasner exponents are continuous and non-differentiable during the second order phase transitions, while discontinuous in the first order phase transitions. We give some arguments on the universality of this behavior. We also discuss possible observables in the dual field theory which encode the Kasner exponents.
Siyan Liu, Pei Zhang, Dan Lu 2021
We propose a novel prediction interval method to learn prediction mean values and the lower and upper bounds of prediction intervals from three independently trained neural networks, using only the standard mean squared error (MSE) loss, for uncertainty quantification in regression tasks. Our method requires no distributional assumption on data and does not introduce unusual hyperparameters to either the neural network models or the loss function. Moreover, our method can effectively identify out-of-distribution samples and reasonably quantify their uncertainty. Numerical experiments on benchmark regression problems show that our method outperforms the state-of-the-art methods with respect to predictive uncertainty quality, robustness, and identification of out-of-distribution samples.