A Scalable Framework for Learning From Implicit User Feedback to Improve Natural Language Understanding in Large-Scale Conversational AI Systems

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Abstract

Natural Language Understanding (NLU) is an established component within a conversational AI or digital assistant system, and it is responsible for producing semantic understanding of a user request. We propose a scalable and automatic approach for improving NLU in a large-scale conversational AI system by leveraging implicit user feedback, with an insight that user interaction data and dialog context have rich information embedded from which user satisfaction and intention can be inferred. In particular, we propose a domain-agnostic framework for curating new supervision data for improving NLU from live production traffic. With an extensive set of experiments, we show the results of applying the framework and improving NLU for a large-scale production system across 10 domains.

References used

https://aclanthology.org/

Industry Scale Semi-Supervised Learning for Natural Language Understanding

Association for Computational Linguistics, 2021 (article)

This paper presents a production Semi-Supervised Learning (SSL) pipeline based on the student-teacher framework, which leverages millions of unlabeled examples to improve Natural Language Understanding (NLU) tasks. We investigate two questions related to the use of unlabeled data in a production SSL context: 1) how to select samples from a huge unlabeled data pool that are beneficial for SSL training, and 2) how the selected data affects the performance of different state-of-the-art SSL techniques. We compare four widely used SSL techniques, Pseudo-label (PL), Knowledge Distillation (KD), Virtual Adversarial Training (VAT) and Cross-View Training (CVT), in conjunction with two data selection methods: committee-based selection and submodular optimization based selection. We further examine the benefits and drawbacks of these techniques when applied to intent classification (IC) and named entity recognition (NER) tasks, and provide guidelines specifying when each of these methods might be beneficial for improving large-scale NLU systems.

Large-Scale Contextualised Language Modelling for Norwegian

Association for Computational Linguistics, 2021 (article)

We present the ongoing NorLM initiative to support the creation and use of very large contextualised language models for Norwegian (and in principle other Nordic languages), including a ready-to-use software environment, as well as an experience report for data preparation and training. This paper introduces the first large-scale monolingual language models for Norwegian, based on both the ELMo and BERT frameworks. In addition to detailing the training process, we present contrastive benchmark results on a suite of NLP tasks for Norwegian. For additional background and access to the data, models, and software, please see: http://norlm.nlpl.eu

contextualised language modelling, modelling for Norwegian, contextualised language models

A Hybrid Approach to Scalable and Robust Spoken Language Understanding in Enterprise Virtual Agents

Association for Computational Linguistics, 2021 (article)

Spoken language understanding (SLU) extracts the intended meaning from a user utterance and is a critical component of conversational virtual agents. In enterprise virtual agents (EVAs), language understanding is substantially challenging. First, the users are infrequent callers who are unfamiliar with the expectations of a pre-designed conversation flow. Second, the users are paying customers of an enterprise who demand a reliable, consistent and efficient user experience when resolving their issues. In this work, we describe a general and robust framework for intent and entity extraction utilizing a hybrid of statistical and rule-based approaches. Our framework includes confidence modeling that incorporates information from all components in the SLU pipeline, a critical addition for EVAs to ensure accuracy. Our focus is on creating accurate and scalable SLU that can be deployed rapidly for a large class of EVA applications with little need for human intervention.

voice assistant, enterprise virtual agents, virtual agents

On User Interfaces for Large-Scale Document-Level Human Evaluation of Machine Translation Outputs

Association for Computational Linguistics, 2021 (article)

Recent studies emphasize the need for document context in human evaluation of machine translations, but little research has been done on the impact of user interfaces on annotator productivity and the reliability of assessments. In this work, we compare human assessment data from the last two WMT evaluation campaigns collected via two different methods for document-level evaluation. Our analysis shows that a document-centric approach to evaluation, where the annotator is presented with the entire document context on a screen, leads to higher-quality segment- and document-level assessments. It improves the correlation between segment and document scores and increases inter-annotator agreement for document scores, but is considerably more time-consuming for annotators.

machine translation outputs, translation outputs, user interfaces

Crowdsourcing Natural Language Data at Scale: A Hands-On Tutorial

Association for Computational Linguistics, 2021 (article)

In this tutorial, we present a portion of unique industry experience in efficient natural language data annotation via crowdsourcing shared by both leading researchers and engineers from Yandex. We will make an introduction to data labeling via public crowdsourcing marketplaces and will present the key components of efficient label collection. This will be followed by a practical session, where participants address a real-world language resource production task, experiment with selecting settings for the labeling process, and launch their label collection project on one of the largest crowdsourcing marketplaces. The projects will be run on real crowds within the tutorial session and we will present useful quality control techniques and provide the attendees with an opportunity to discuss their own annotation ideas.

natural language data, crowdsourcing, natural language
