أوراق بحثية, رسائل ماجستير ودكتوراه منشورة من قبل Genta Indra Winata

Language Models are Few-shot Multilingual Learners

111 - Genta Indra Winata , Andrea Madotto , Zhaojiang Lin 2021

General-purpose language models have demonstrated impressive capabilities, performing on par with state-of-the-art approaches on a range of downstream natural language processing (NLP) tasks and benchmarks when inferring instructions from very few ex amples. Here, we evaluate the multilingual skills of the GPT and T5 models in conducting multi-class classification on non-English languages without any parameter updates. We show that, given a few English examples as context, pre-trained language models can predict not only English test samples but also non-English ones. Finally, we find the in-context few-shot cross-lingual prediction results of language models are significantly better than random prediction, and they are competitive compared to the existing state-of-the-art cross-lingual models.

الحساب واللغة الذكاء الاصطناعي

Greenformer: Factorization Toolkit for Efficient Deep Neural Networks

444 - Samuel Cahyawijaya , Genta Indra Winata , Holy Lovenia 2021

While the recent advances in deep neural networks (DNN) bring remarkable success, the computational cost also increases considerably. In this paper, we introduce Greenformer, a toolkit to accelerate the computation of neural networks through matrix f actorization while maintaining performance. Greenformer can be easily applied with a single line of code to any DNN model. Our experimental results show that Greenformer is effective for a wide range of scenarios. We provide the showcase of Greenformer at https://samuelcahyawijaya.github.io/greenformer-demo/.

التعلم الآلي الذكاء الاصطناعي

X2Parser: Cross-Lingual and Cross-Domain Framework for Task-Oriented Compositional Semantic Parsing

96 - Zihan Liu , Genta Indra Winata , Peng Xu 2021

Task-oriented compositional semantic parsing (TCSP) handles complex nested user queries and serves as an essential component of virtual assistants. Current TCSP models rely on numerous training data to achieve decent performance but fail to generaliz e to low-resource target languages or domains. In this paper, we present X2Parser, a transferable Cross-lingual and Cross-domain Parser for TCSP. Unlike previous models that learn to generate the hierarchical representations for nested intents and slots, we propose to predict flattened intents and slots representations separately and cast both prediction tasks into sequence labeling problems. After that, we further propose a fertility-based slot predictor that first learns to dynamically detect the number of labels for each token, and then predicts the slot types. Experimental results illustrate that our model can significantly outperform existing strong baselines in cross-lingual and cross-domain settings, and our model can also achieve a good generalization ability on target languages of target domains. Furthermore, our model tackles the problem in an efficient non-autoregressive way that reduces the latency by up to 66% compared to the generative model.

الحساب واللغة الذكاء الاصطناعي

CAiRE in DialDoc21: Data Augmentation for Information-Seeking Dialogue System

136 - Etsuko Ishii , Yan Xu , Genta Indra Winata 2021

Information-seeking dialogue systems, including knowledge identification and response generation, aim to respond to users with fluent, coherent, and informative responses based on users needs, which. To tackle this challenge, we utilize data augmenta tion methods and several training techniques with the pre-trained language models to learn a general pattern of the task and thus achieve promising performance. In DialDoc21 competition, our system achieved 74.95 F1 score and 60.74 Exact Match score in subtask 1, and 37.72 SacreBLEU score in subtask 2. Empirical analysis is provided to explain the effectiveness of our approaches.

الحساب واللغة الذكاء الاصطناعي

BiToD: A Bilingual Multi-Domain Dataset For Task-Oriented Dialogue Modeling

107 - Zhaojiang Lin , Andrea Madotto , Genta Indra Winata 2021

Task-oriented dialogue (ToD) benchmarks provide an important avenue to measure progress and develop better conversational agents. However, existing datasets for end-to-end ToD modeling are limited to a single language, hindering the development of ro bust end-to-end ToD systems for multilingual countries and regions. Here we introduce BiToD, the first bilingual multi-domain dataset for end-to-end task-oriented dialogue modeling. BiToD contains over 7k multi-domain dialogues (144k utterances) with a large and realistic bilingual knowledge base. It serves as an effective benchmark for evaluating bilingual ToD systems and cross-lingual transfer learning approaches. We provide state-of-the-art baselines under three evaluation settings (monolingual, bilingual, and cross-lingual). The analysis of our baselines in different settings highlights 1) the effectiveness of training a bilingual ToD system compared to two independent monolingual ToD systems, and 2) the potential of leveraging a bilingual knowledge base and cross-lingual transfer learning to improve the system performance under low resource condition.

الحساب واللغة

ERICA: An Empathetic Android Companion for Covid-19 Quarantine

95 - Etsuko Ishii , Genta Indra Winata , Samuel Cahyawijaya 2021

Over the past year, research in various domains, including Natural Language Processing (NLP), has been accelerated to fight against the COVID-19 pandemic, yet such research has just started on dialogue systems. In this paper, we introduce an end-to-e nd dialogue system which aims to ease the isolation of people under self-quarantine. We conduct a control simulation experiment to assess the effects of the user interface, a web-based virtual agent called Nora vs. the android ERICA via a video call. The experimental results show that the android offers a more valuable user experience by giving the impression of being more empathetic and engaging in the conversation due to its nonverbal information, such as facial expressions and body gestures.

الحساب واللغة تفاعل الإنسان والحاسوب

Nora: The Well-Being Coach

68 - Genta Indra Winata , Holy Lovenia , Etsuko Ishii 2021

The current pandemic has forced people globally to remain in isolation and practice social distancing, which creates the need for a system to combat the resulting loneliness and negative emotions. In this paper we propose Nora, a virtual coaching pla tform designed to utilize natural language understanding in its dialogue system and suggest other recommendations based on user interactions. It is intended to provide assistance and companionship to people undergoing self-quarantine or work-from-home routines. Nora helps users gauge their well-being by detecting and recording the users emotion, sentiment, and stress. Nora also recommends various workout, meditation, or yoga exercises to users in support of developing a healthy daily routine. In addition, we provide a social community inside Nora, where users can connect and share their experiences with others undergoing a similar isolation procedure. Nora can be accessed from anywhere via a web link and has support for both English and Mandarin.

الحساب واللغة تفاعل الإنسان والحاسوب أنظمة الصوت في الحاسوب

Continual Mixed-Language Pre-Training for Extremely Low-Resource Neural Machine Translation

149 - Zihan Liu , Genta Indra Winata , Pascale Fung 2021

The data scarcity in low-resource languages has become a bottleneck to building robust neural machine translation systems. Fine-tuning a multilingual pre-trained model (e.g., mBART (Liu et al., 2020)) on the translation task is a good approach for lo w-resource languages; however, its performance will be greatly limited when there are unseen languages in the translation pairs. In this paper, we present a continual pre-training (CPT) framework on mBART to effectively adapt it to unseen languages. We first construct noisy mixed-language text from the monolingual corpus of the target language in the translation pair to cover both the source and target languages, and then, we continue pre-training mBART to reconstruct the original monolingual text. Results show that our method can consistently improve the fine-tuning performance upon the mBART baseline, as well as other strong baselines, across all tested low-resource translation pairs containing unseen languages. Furthermore, our approach also boosts the performance on translation pairs where both languages are seen in the original mBARTs pre-training. The code is available at https://github.com/zliucr/cpt-nmt.

الحساب واللغة الذكاء الاصطناعي

Are Multilingual Models Effective in Code-Switching?

73 - Genta Indra Winata , Samuel Cahyawijaya , Zihan Liu 2021

Multilingual language models have shown decent performance in multilingual and cross-lingual natural language understanding tasks. However, the power of these multilingual models in code-switching tasks has not been fully explored. In this paper, we study the effectiveness of multilingual language models to understand their capability and adaptability to the mixed-language setting by considering the inference speed, performance, and number of parameters to measure their practicality. We conduct experiments in three language pairs on named entity recognition and part-of-speech tagging and compare them with existing methods, such as using bilingual embeddings and multilingual meta-embeddings. Our findings suggest that pre-trained multilingual models do not necessarily guarantee high-quality representations on code-switching, while using meta-embeddings achieves similar results with significantly fewer parameters.

الحساب واللغة التعلم الآلي

Adapt-and-Adjust: Overcoming the Long-Tail Problem of Multilingual Speech Recognition

101 - Genta Indra Winata , Guangsen Wang , Caiming Xiong 2020

One crucial challenge of real-world multilingual speech recognition is the long-tailed distribution problem, where some resource-rich languages like English have abundant training data, but a long tail of low-resource languages have varying amounts o f limited training data. To overcome the long-tail problem, in this paper, we propose Adapt-and-Adjust (A2), a transformer-based multi-task learning framework for end-to-end multilingual speech recognition. The A2 framework overcomes the long-tail problem via three techniques: (1) exploiting a pretrained multilingual language model (mBERT) to improve the performance of low-resource languages; (2) proposing dual adapters consisting of both language-specific and language-agnostic adaptation with minimal additional parameters; and (3) overcoming the class imbalance, either by imposing class priors in the loss during training or adjusting the logits of the softmax output during inference. Extensive experiments on the CommonVoice corpus show that A2 significantly outperforms conventional approaches.

الحساب واللغة الذكاء الاصطناعي التعلم الآلي

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد