BERT Cannot Align Characters

Publication date: 2021
Language: English

In previous work, it has been shown that BERT can adequately align cross-lingual sentences on the word level. Here we investigate whether BERT can also operate as a character-level aligner. The languages examined are English, Fake English, German and Greek. We show that the closer two languages are, the better BERT can align them on the character level. BERT indeed works well for English to Fake English alignment, but this does not generalize to natural languages to the same extent. Nevertheless, the proximity of two languages does seem to be a factor: English is more closely related to German than to Greek, and this is reflected in how well BERT aligns them; English to German alignment is better than English to Greek. We examine multiple setups and show that the similarity matrices for natural languages exhibit weaker relations the further apart two languages are.
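
For concreteness, the sketch below shows similarity-based character alignment with multilingual BERT, in the spirit of word-level similarity aligners. It is not the paper's exact pipeline: the checkpoint name, the mean-pooling of subword pieces per character, and the mutual-argmax link extraction are illustrative assumptions.

```python
# Minimal sketch of similarity-based character alignment with multilingual
# BERT. Not the paper's exact pipeline: checkpoint, pooling and mutual-argmax
# extraction are assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-multilingual-cased"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()


def char_embeddings(text: str) -> torch.Tensor:
    """Embed each character of `text`; subword pieces belonging to the same
    character are mean-pooled into one vector."""
    chars = list(text.replace(" ", "_"))  # keep spaces as a visible character
    enc = tokenizer(chars, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (num_pieces, dim)
    word_ids = enc.word_ids(0)  # maps each piece to its character index
    pooled = []
    for i in range(len(chars)):
        pieces = [j for j, w in enumerate(word_ids) if w == i]
        pooled.append(hidden[pieces].mean(dim=0))
    return torch.stack(pooled)  # (num_chars, dim)


def align_chars(src: str, tgt: str):
    """Mutual-argmax character alignment links and the similarity matrix."""
    e_src = torch.nn.functional.normalize(char_embeddings(src), dim=-1)
    e_tgt = torch.nn.functional.normalize(char_embeddings(tgt), dim=-1)
    sim = e_src @ e_tgt.T  # cosine similarity matrix (src_chars x tgt_chars)
    fwd = sim.argmax(dim=1)  # best target character for each source character
    bwd = sim.argmax(dim=0)  # best source character for each target character
    links = [(i, j.item()) for i, j in enumerate(fwd) if bwd[j].item() == i]
    return links, sim


if __name__ == "__main__":
    links, sim = align_chars("house", "Haus")
    print(links)
```

The similarity matrix `sim` is the object the abstract refers to: for a close pair such as English and Fake English one would expect much sharper structure than for a distant pair such as English and Greek.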

Related research

Existing pre-trained language models (PLMs) are often computationally expensive at inference time, making them impractical in various resource-limited real-world applications. To address this issue, we propose a dynamic token reduction approach to accelerate PLMs' inference, named TR-BERT, which can flexibly adapt the layer number of each token at inference to avoid redundant computation. Specifically, TR-BERT formulates the token reduction process as a multi-step token selection problem and automatically learns the selection strategy via reinforcement learning. The experimental results on several downstream NLP tasks show that TR-BERT is able to speed up BERT by 2-5 times to satisfy various performance demands. Moreover, TR-BERT can also achieve better performance with less computation in a suite of long-text tasks, since its token-level layer-number adaptation greatly accelerates the self-attention operation in PLMs. The source code and experiment details of this paper can be obtained from https://github.com/thunlp/TR-BERT.
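
As a rough illustration of dynamic token reduction (TR-BERT learns its selection policy with reinforcement learning, which is not reproduced here), the sketch below keeps only the highest-scoring tokens from a given layer onward, with a hidden-state-norm heuristic standing in for the learned policy.

```python
# Illustration of dynamic token reduction. TR-BERT learns its token-selection
# policy with reinforcement learning; here a hidden-state-norm heuristic
# stands in for that learned policy to show the shape of the computation.
import torch
import torch.nn as nn

layers = nn.ModuleList(
    [nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
     for _ in range(4)]
)


def encode_with_reduction(x: torch.Tensor, reduce_at: int = 2, keep: int = 8):
    """Run x (batch, seq, dim) through the layers, keeping only the `keep`
    highest-scoring tokens from layer `reduce_at` onward."""
    for i, layer in enumerate(layers):
        if i == reduce_at:
            scores = x.norm(dim=-1)  # proxy importance per token
            idx = scores.topk(keep, dim=1).indices.sort(dim=1).values
            x = torch.gather(x, 1, idx.unsqueeze(-1).expand(-1, -1, x.size(-1)))
        x = layer(x)  # upper layers attend over fewer tokens
    return x


print(encode_with_reduction(torch.randn(2, 32, 64)).shape)  # (2, 8, 64)
```

Because the upper layers see fewer positions, the quadratic self-attention cost drops accordingly, which is where the reported speed-up on long-text tasks comes from.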
Latent alignment objectives such as CTC and AXE significantly improve non-autoregressive machine translation models. Can they improve autoregressive models as well? We explore the possibility of training autoregressive machine translation models with latent alignment objectives, and observe that, in practice, this approach results in degenerate models. We provide a theoretical explanation for these empirical results, and prove that latent alignment objectives are incompatible with teacher forcing.
The research aims to determine the gender and the features of the characters in children's programs on CN. We chose an analytical descriptive approach that relies on content analysis as a research tool. To achieve that, a content analysis form was applied to a sample of (295) programs. Results were as follows: 1- The most common subjects in CN programs were social subjects and fiction subjects. 2- The most common language was classical Arabic. 3- Most programs on CN were foreign. 4- The most common gender was male (53.81%). 5- The most common features of male characters were: a fashionable look, happiness and joy, and strong or supernatural power. 6- The most common features of female characters were: a fashionable look, happiness and joy, and self-confidence.
In this work, we propose a novel framework, Gradient Aligned Mutual Learning BERT (GAML-BERT), for improving the early exiting of BERT. GAML-BERT's contributions are two-fold. First, we conduct a set of pilot experiments, which show that mutual knowledge distillation between a shallow exit and a deep exit leads to better performance for both. From this observation, we use mutual learning to improve BERT's early exiting performance, that is, we ask each exit of a multi-exit BERT to distill knowledge from the others. Second, we propose GA, a novel training method that aligns the gradients from knowledge distillation to cross-entropy losses. Extensive experiments are conducted on the GLUE benchmark, which show that GAML-BERT significantly outperforms state-of-the-art (SOTA) BERT early exiting methods.
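
The abstract does not spell out how GA aligns the two gradients. One common way to resolve such conflicts is gradient projection, sketched below purely as an illustration of the idea; it is not necessarily the paper's exact GA formulation.

```python
# Generic gradient alignment by projection (PCGrad-style): remove the
# component of the knowledge-distillation gradient that conflicts with the
# cross-entropy gradient. Illustrative only; not necessarily the paper's GA.
import torch


def align_gradient(g_kd: torch.Tensor, g_ce: torch.Tensor) -> torch.Tensor:
    """Project g_kd so it no longer points against g_ce."""
    dot = torch.dot(g_kd.flatten(), g_ce.flatten())
    if dot < 0:  # the two gradients conflict
        g_kd = g_kd - (dot / g_ce.flatten().norm() ** 2) * g_ce
    return g_kd


g_ce = torch.tensor([1.0, 0.0])
g_kd = torch.tensor([-1.0, 1.0])
print(align_gradient(g_kd, g_ce))  # tensor([0., 1.]): conflict removed
```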
Accurately dealing with any type of ambiguity is a major task in Natural Language Processing, with great advances recently achieved thanks to the development of context-dependent language models and the use of word or sentence embeddings. In this context, our work aimed at determining how the popular language representation model BERT handles ambiguity of nouns in grammatical number and gender in different languages. We show that models trained on one specific language achieve better results for the disambiguation process than multilingual models. Also, ambiguity is generally dealt with better in grammatical number than in grammatical gender, reaching greater distance values from one sense to another in direct comparisons of individual senses. The overall results also show that the amount of data needed for training monolingual models, as well as for applying them, should not be underestimated.
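
A sketch of the kind of measurement such a study could use is given below: embed one ambiguous surface form in two disambiguating contexts and compare the contextual vectors. The checkpoint, the example sentences, and the matching helper are assumptions for illustration, not the paper's protocol.

```python
# Compare BERT's contextual embedding of one ambiguous surface form in two
# disambiguating contexts. Checkpoint, sentences and the matching helper are
# illustrative assumptions, not the paper's protocol.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-cased")  # assumed checkpoint
model = AutoModel.from_pretrained("bert-base-cased")
model.eval()


def embed_word(sentence: str, word: str) -> torch.Tensor:
    """Mean-pooled hidden state of the subword pieces spelling `word`."""
    enc = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]
    pieces = tok(word, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    for start in range(len(ids) - len(pieces) + 1):
        if ids[start:start + len(pieces)] == pieces:
            return hidden[start:start + len(pieces)].mean(dim=0)
    raise ValueError(f"{word!r} not found in {sentence!r}")


# "sheep" has the same surface form in singular and plural contexts.
singular = embed_word("The sheep is grazing alone in the field.", "sheep")
plural = embed_word("The sheep are grazing together in the field.", "sheep")
print(torch.nn.functional.cosine_similarity(singular, plural, dim=0).item())
```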
