Position representation is crucial for building position-aware representations in Transformers. Existing position representations suffer from a lack of generalization to test data with unseen lengths or high computational cost. We investigate shifted absolute position embedding (SHAPE) to address both issues. The basic idea of SHAPE is to achieve shift invariance, which is a key property of recent successful position representations, by randomly shifting absolute positions during training. We demonstrate that SHAPE is empirically comparable to its counterpart while being simpler and faster.
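The random-shift idea described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `max_shift` parameter, the uniform sampling of the offset, and the choice of sinusoidal embeddings are all assumptions made for the example.

```python
import math
import random


def sinusoidal_embedding(position, dim):
    """Sinusoidal embedding of one absolute position (assumed embedding type)."""
    emb = []
    for i in range(0, dim, 2):
        angle = position / (10000 ** (i / dim))
        emb.append(math.sin(angle))
        emb.append(math.cos(angle))
    return emb[:dim]


def shifted_positions(seq_len, max_shift, training, rng=random):
    """Return absolute position indices for one sequence.

    During training, a single offset k is drawn (here uniformly from
    0..max_shift, an assumption) and added to every position, so the model
    cannot rely on the absolute index itself. At inference time no shift
    is applied.
    """
    k = rng.randint(0, max_shift) if training else 0
    return [k + i for i in range(seq_len)]


# Hypothetical usage: embed a 5-token sequence with a random training-time offset.
positions = shifted_positions(seq_len=5, max_shift=100, training=True)
embeddings = [sinusoidal_embedding(p, dim=8) for p in positions]
```

Because every token in a sequence receives the same offset, the distances between positions are unchanged; only the absolute indices vary from one training step to the next.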
Jun Suzuki, 2020
In this paper, we study the quantum-state estimation problem in the framework of optimal design of experiments. We first find the optimal designs for arbitrary qubit models under popular optimality criteria such as A-, D-, and E-optimality. We also give a one-parameter family of optimality criteria that includes these criteria. We then extend a classical result in the design problem, the Kiefer-Wolfowitz theorem, to a qubit system, showing that the D-optimal design is equivalent to a certain type of A-optimal design. We next compare and analyze several optimal designs based on their efficiency. We explicitly demonstrate that an optimal design for a certain criterion can be highly inefficient for other optimality criteria.
Understanding the influence of a training instance on a neural network model helps improve interpretability. However, it is difficult and inefficient to evaluate the influence, which shows how a model's prediction would change if a training instance were not used. In this paper, we propose an efficient method for estimating the influence. Our method is inspired by dropout, which zero-masks a sub-network and prevents the sub-network from learning each training instance. By switching between dropout masks, we can use sub-networks that learned or did not learn each training instance and estimate its influence. Through experiments with BERT and VGGNet on classification datasets, we demonstrate that the proposed method can capture training influences, enhance the interpretability of error predictions, and cleanse the training dataset for improving generalization.
Shin Funada, Jun Suzuki, 2020
We investigate whether a trade-off relation between the diagonal elements of the mean square error matrix exists for two-parameter unitary models with mutually commuting generators. We show that the error trade-off relation that exists in our models of a finite-dimensional system is a generic phenomenon in the sense that it occurs with a finite volume in the state space. We analyze a qutrit system to show that there can be an error trade-off relation given by the SLD and RLD Cramer-Rao bounds that intersect each other. First, we numerically analyze an example of a reference state showing the non-trivial trade-off relation, and find that its eigenvalues must be in a certain range to exhibit the trade-off relation. For another example, a one-parameter family of reference states, we analytically show that the non-trivial relation always exists and that the range where the trade-off relation exists is up to about half of the possible range.
Finding the optimal attainable precisions in quantum multiparameter metrology is a non-trivial problem. One approach to tackling this problem involves the computation of bounds which impose limits on how accurately we can estimate certain physical quantities. One such bound is the Holevo Cramer-Rao bound on the trace of the mean squared error matrix. The Holevo bound is an asymptotically achievable bound when one allows for any measurement strategy, including collective measurements on many copies of the probe. In this work we introduce a tighter bound for estimating multiple parameters simultaneously when performing separable measurements on finite copies of the probe. This makes it more relevant in terms of experimental accessibility. We show that this bound can be efficiently computed by casting it as a semidefinite program. We illustrate our bound with several examples of collective measurements on finite copies of the probe. These results have implications for the necessary requirements to saturate the Holevo bound.
In this paper, we investigate the problem of estimating the phase of a coherent state in the presence of unavoidable noisy quantum states. These unwarranted quantum states are represented by outlier quantum states in this study. We first present a statistical framework of robust statistics in a quantum system to handle outlier quantum states. We then apply the method of M-estimators to suppress untrusted measurement outcomes due to outlier quantum states. Our proposal has the advantage over classical methods of being systematic, easy to implement, and robust against the occurrence of noisy states.
Interpretable rationales for model predictions play a critical role in practical applications. In this study, we develop models with an interpretable inference process for structured prediction. Specifically, we present a method of instance-based learning that learns similarities between spans. At inference time, each span is assigned a class label based on its similar spans in the training set, where it is easy to understand how much each training instance contributes to the predictions. Through empirical analysis on named entity recognition, we demonstrate that our method makes it possible to build models that have high interpretability without sacrificing performance.
Large-scale dialogue datasets have recently become available for training neural dialogue agents. However, these datasets have been reported to contain a non-negligible number of unacceptable utterance pairs. In this paper, we propose a method for scoring the quality of utterance pairs in terms of their connectivity and relatedness. The proposed scoring method is designed based on findings widely shared in the dialogue and linguistics research communities. We demonstrate that it has a relatively good correlation with the human judgment of dialogue quality. Furthermore, the method is applied to filter out potentially unacceptable utterance pairs from a large-scale noisy dialogue corpus to ensure its quality. We experimentally confirm that training data filtered by the proposed method improves the quality of neural dialogue agents in response generation.
The incorporation of pseudo data in the training of grammatical error correction models has been one of the main factors in improving the performance of such models. However, consensus is lacking on experimental configurations, namely, choosing how the pseudo data should be generated or used. In this study, these choices are investigated through extensive experiments, and state-of-the-art performance is achieved on the CoNLL-2014 test set ($F_{0.5}=65.0$) and the official test set of the BEA-2019 shared task ($F_{0.5}=70.2$) without making any modifications to the model architecture.
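For readers unfamiliar with the $F_{0.5}$ score reported above, it is the standard $F_\beta$ measure with $\beta = 0.5$, which weights precision more heavily than recall. The sketch below computes it from precision and recall; the function name and the input values are hypothetical and are not taken from the paper's results.

```python
def f_beta(precision, recall, beta=0.5):
    """Standard F-beta score; beta < 1 favors precision over recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical values for illustration only.
score = f_beta(precision=1.0, recall=0.5)  # 0.8333...
```

With equal precision and recall the score equals both; when they differ, as in the hypothetical call above, the result sits closer to the precision than the balanced $F_1$ would.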
Shin Funada, Jun Suzuki, 2019
We investigate the uncertainty relation for estimating the position of one electron in a uniform magnetic field in the framework of quantum estimation theory. Two kinds of momenta, canonical and mechanical, are used to generate a shift in the position of the electron. We first consider pure-state models whose wave function is in the ground state with zero angular momentum. The model generated by the two commuting canonical momenta becomes the quasi-classical model, in which the symmetric logarithmic derivative quantum Cramer-Rao bound is achievable. The model generated by the two non-commuting mechanical momenta, on the other hand, turns out to be a Gaussian model, where the generalized right logarithmic derivative quantum Cramer-Rao bound is achievable. We next consider mixed-state models by taking into account the effects of thermal noise. The model with the canonical momenta now becomes genuinely quantum mechanical, although its generators commute with each other. The derived uncertainty relation is in general determined by two different quantum Cramer-Rao bounds in a non-trivial manner. The model with the mechanical momenta is identified with the well-known Gaussian shift model, and the uncertainty relation is governed by the right logarithmic derivative Cramer-Rao bound.