
Can You Put it All Together: Evaluating Conversational Agents' Ability to Blend Skills

Added by Eric Smith
Publication date: 2020
Language: English





Being engaging, knowledgeable, and empathetic are all desirable general qualities in a conversational agent. Previous work has introduced tasks and datasets that aim to help agents learn those qualities in isolation and gauge how well they can express them. But rather than being specialized in a single quality, a good open-domain conversational agent should be able to seamlessly blend them all into one cohesive conversational flow. In this work, we investigate several ways to combine models trained towards isolated capabilities, ranging from simple model aggregation schemes that require minimal additional training to various forms of multi-task training that encompass several skills at all training stages. We further propose a new dataset, BlendedSkillTalk, to analyze how these capabilities would mesh together in a natural conversation, and compare the performance of different architectures and training schemes. Our experiments show that multi-tasking over several tasks that focus on particular capabilities results in better blended conversation performance compared to models trained on a single skill, and that both unified and two-stage approaches perform well if they are constructed to avoid unwanted bias in skill selection or are fine-tuned on our new task.
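To make the two-stage idea concrete, here is a minimal sketch: a skill selector classifies the incoming utterance, then dispatches it to the single-skill model trained for that capability. All names here are hypothetical placeholders, not the paper's released code, and the keyword heuristic merely stands in for a trained skill classifier.

```python
# Minimal sketch of a two-stage "blended skills" dispatcher. The three
# responders stand in for models trained on single-skill dialogue tasks
# (e.g. engaging / knowledgeable / empathetic); all names are hypothetical.

from typing import Callable, Dict

def engaging_model(utterance: str) -> str:
    return "That's fun! Tell me more about yourself."

def knowledgeable_model(utterance: str) -> str:
    return "Here is a relevant fact I know about that topic."

def empathetic_model(utterance: str) -> str:
    return "I'm sorry to hear that. How are you feeling now?"

SKILLS: Dict[str, Callable[[str], str]] = {
    "engaging": engaging_model,
    "knowledgeable": knowledgeable_model,
    "empathetic": empathetic_model,
}

def select_skill(utterance: str) -> str:
    """Stage 1: pick a skill for the utterance.

    A real system would use a trained utterance classifier; a naive
    keyword rule keeps this sketch self-contained.
    """
    lowered = utterance.lower()
    if any(w in lowered for w in ("sad", "sorry", "upset", "worried")):
        return "empathetic"
    if any(w in lowered for w in ("what", "why", "how", "who")):
        return "knowledgeable"
    return "engaging"

def respond(utterance: str) -> str:
    """Stage 2: dispatch to the model trained for the selected skill."""
    return SKILLS[select_skill(utterance)](utterance)

print(respond("Why do cats purr?"))           # routed to the knowledge model
print(respond("I'm feeling a bit sad today")) # routed to the empathy model
```

The abstract's caveat about "unwanted bias in skill selection" applies to exactly this stage-1 classifier: if it systematically favors one skill, the blended conversation degrades regardless of how good the individual responders are.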


Related Research

Goal-oriented conversational agents are becoming prevalent in our daily lives. For these systems to engage users and achieve their goals, they need to exhibit appropriate social behavior as well as provide informative replies that guide users through tasks. The first component of the research in this paper applies statistical modeling techniques to understand conversations between users and human agents in customer service. Analyses show that social language used by human agents is associated with greater user responsiveness and task completion. The second component of the research is the construction of a conversational agent model capable of injecting social language into an agent's responses while still preserving content. The model uses a sequence-to-sequence deep learning architecture, extended with a social language understanding element. Evaluation in terms of content preservation and social language level, using both human judgment and automatic linguistic measures, shows that the model can generate responses that enable agents to address users' issues in a more socially appropriate way.
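One common way to implement such a "social language understanding element" is to condition the decoder on a target social-language level. The toy PyTorch sketch below prepends a learned style embedding to the encoder input; this is an illustration of the general conditioning technique, not the paper's exact architecture, and all dimensions are arbitrary.

```python
# Sketch of conditioning a seq2seq responder on a desired social-language
# level. The style signal is injected as an extra "token" embedding so the
# encoder state carries both content and target style.

import torch
import torch.nn as nn

class SocialSeq2Seq(nn.Module):
    def __init__(self, vocab_size: int, n_social_levels: int = 3,
                 emb_dim: int = 64, hid_dim: int = 128):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, emb_dim)
        # Embedding for the desired social-language level (e.g. low/med/high).
        self.social_emb = nn.Embedding(n_social_levels, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, src: torch.Tensor, social_level: torch.Tensor,
                tgt: torch.Tensor) -> torch.Tensor:
        src_emb = self.tok_emb(src)                        # (B, S, E)
        style = self.social_emb(social_level).unsqueeze(1) # (B, 1, E)
        # Prepend the style embedding before encoding.
        _, h = self.encoder(torch.cat([style, src_emb], dim=1))
        dec_out, _ = self.decoder(self.tok_emb(tgt), h)
        return self.out(dec_out)                           # (B, T, V)

# Toy forward pass: batch of 2, source length 5, target length 4.
model = SocialSeq2Seq(vocab_size=100)
src = torch.randint(0, 100, (2, 5))
tgt = torch.randint(0, 100, (2, 4))
level = torch.tensor([0, 2])  # request low vs. high social language
print(model(src, level, tgt).shape)  # torch.Size([2, 4, 100])
```

At training time, the social-language level of each reference response would be estimated (e.g. by a linguistic classifier) and fed in as `level`, so that at inference time the agent can be steered toward more or less social phrasing while the content path stays unchanged.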
Since the popularization of the Transformer as a general-purpose feature encoder for NLP, many studies have attempted to decode linguistic structure from its novel multi-head attention mechanism. However, much of this work has focused almost exclusively on English, a language with rigid word order and little inflectional morphology. In this study, we present decoding experiments for multilingual BERT across 18 languages in order to test the generalizability of the claim that dependency syntax is reflected in attention patterns. We show that full trees can be decoded above baseline accuracy from single attention heads, and that individual relations are often tracked by the same heads across languages. Furthermore, in an attempt to address recent debates about the status of attention as an explanatory mechanism, we experiment with fine-tuning mBERT on a supervised parsing objective while freezing different series of parameters. Interestingly, in steering the objective to learn explicit linguistic structure, we find much of the same structure represented in the resulting attention patterns, with notable differences depending on which parameters are frozen.
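The tree-decoding step can be illustrated as follows: treat one head's attention weights as directed arc scores and extract a maximum spanning arborescence over the tokens. This is a hedged sketch of the general technique only; the study's exact scoring, symmetrization, and root handling may differ.

```python
# Sketch of decoding a dependency tree from a single attention head,
# assuming we already have that head's n x n attention matrix for a
# sentence (attn[i, j] = attention from token i to token j).

import numpy as np
import networkx as nx

def decode_tree(attn: np.ndarray) -> list[tuple[int, int]]:
    """Return (head, dependent) arcs of the maximum spanning arborescence."""
    n = attn.shape[0]
    g = nx.DiGraph()
    for head in range(n):
        for dep in range(n):
            if head != dep:
                # Treat the attention weight as the score of arc head -> dep.
                g.add_edge(head, dep, weight=float(attn[head, dep]))
    tree = nx.maximum_spanning_arborescence(g, attr="weight")
    return sorted(tree.edges())

# Toy 4-token example with a synthetic attention matrix.
rng = np.random.default_rng(0)
attn = rng.random((4, 4))
print(decode_tree(attn))
```

Decoding accuracy above a simple baseline (e.g. attaching every token to its neighbor) is then evidence that the head tracks syntactic structure rather than mere positional patterns.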
Chuhan Wu, Fangzhao Wu, Tao Qi (2021)
Transformer is a powerful model for text understanding. However, it is inefficient due to its quadratic complexity with respect to input sequence length. Although there are many methods for Transformer acceleration, they are still either inefficient on long sequences or not effective enough. In this paper, we propose Fastformer, an efficient Transformer model based on additive attention. In Fastformer, instead of modeling the pair-wise interactions between tokens, we first use an additive attention mechanism to model global contexts, and then further transform each token representation based on its interaction with the global context representations. In this way, Fastformer can achieve effective context modeling with linear complexity. Extensive experiments on five datasets show that Fastformer is much more efficient than many existing Transformer models while achieving comparable or even better long-text modeling performance.
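The additive-attention mechanism the abstract describes can be sketched in a few lines: a global query vector is pooled from all queries by additive attention, interacts element-wise with each key to yield a global key, which in turn transforms each value. The single-head PyTorch module below follows that outline; head splitting, parameter sharing, and other details of the actual Fastformer are omitted, and the final residual is our simplification.

```python
# Minimal single-head Fastformer-style additive attention. Every pooling
# step is a weighted sum over the sequence, so the whole layer is O(N)
# in sequence length rather than O(N^2).

import torch
import torch.nn as nn
import torch.nn.functional as F

class AdditiveAttention(nn.Module):
    def __init__(self, dim: int = 64):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        self.q_score = nn.Linear(dim, 1)  # additive attention over queries
        self.k_score = nn.Linear(dim, 1)  # additive attention over keys
        self.out = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, N, D)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        # Global query: attention-weighted sum over all query vectors.
        alpha = F.softmax(self.q_score(q), dim=1)          # (B, N, 1)
        global_q = (alpha * q).sum(dim=1, keepdim=True)    # (B, 1, D)
        # Element-wise interaction of the global query with each key.
        p = global_q * k                                   # (B, N, D)
        beta = F.softmax(self.k_score(p), dim=1)           # (B, N, 1)
        global_k = (beta * p).sum(dim=1, keepdim=True)     # (B, 1, D)
        # Transform each value by its interaction with the global key,
        # with a residual connection back to the queries.
        return self.out(global_k * v) + q                  # (B, N, D)

x = torch.randn(2, 10, 64)           # batch of 2, sequence length 10
print(AdditiveAttention()(x).shape)  # torch.Size([2, 10, 64])
```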
The C/O-ratio as traced with C$_2$H emission in protoplanetary disks is fundamental for constraining the formation mechanisms of exoplanets and our understanding of volatile depletion in disks, but current C$_2$H observations show an apparent bimodal distribution which is not well understood, indicating that the C/O distribution is not described by a simple radial dependence. The transport of icy pebbles has been suggested to alter the local elemental abundances in protoplanetary disks, through settling, drift and trapping in pressure bumps resulting in a depletion of volatiles in the surface and an increase of the elemental C/O. We combine all disks with spatially resolved ALMA C$_2$H observations with high-resolution continuum images and constraints on the CO snowline to determine if the C$_2$H emission is indeed related to the location of the icy pebbles. We report a possible correlation between the presence of a significant CO-icy dust reservoir and high C$_2$H emission, which is only found in disks with dust rings outside the CO snowline. In contrast, compact dust disks (without pressure bumps) and warm transition disks (with their dust ring inside the CO snowline) are not detected in C$_2$H, suggesting that such disks may never have contained a significant CO ice reservoir. This correlation provides evidence for the regulation of the C/O profile by the complex interplay of CO snowline and pressure bump locations in the disk. These results demonstrate the importance of including dust transport in chemical disk models, for a proper interpretation of exoplanet atmospheric compositions, and a better understanding of volatile depletion in disks, in particular the use of CO isotopologues to determine gas surface densities.
A frequent pattern in customer care conversations is agents responding with appropriate webpage URLs that address users' needs. We study the task of predicting the documents that customer care agents can use to facilitate users' needs. We also introduce a new public dataset that supports this problem. Using this dataset and two others, we investigate state-of-the-art deep learning (DL) and information retrieval (IR) models for the task. Additionally, we analyze the practicality of such systems in terms of inference time complexity. Our results show that a hybrid IR+DL approach provides the best of both worlds.
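A minimal sketch of such a hybrid pipeline is shown below: a cheap IR scorer (BM25, via the rank_bm25 package) prunes the document pool, and a slower neural re-ranker scores only the survivors, which is what keeps inference time practical. The neural scorer here is a hypothetical stub; in practice it would be something like a fine-tuned cross-encoder, and the documents and function names are ours, not the paper's.

```python
# Hybrid IR+DL ranking sketch: BM25 retrieves a small candidate set,
# then an (expensive) neural scorer re-ranks only those candidates.

from rank_bm25 import BM25Okapi

docs = [
    "how to reset your account password",
    "update billing and payment details",
    "troubleshoot login and two-factor issues",
]
bm25 = BM25Okapi([d.split() for d in docs])

def neural_score(query: str, doc: str) -> float:
    """Hypothetical stand-in for a DL re-ranker (e.g. a cross-encoder)."""
    overlap = set(query.split()) & set(doc.split())
    return len(overlap) / (len(doc.split()) + 1)

def hybrid_rank(query: str, top_k: int = 2) -> list[str]:
    # Stage 1 (IR): keep only the top_k BM25 candidates. This bounds the
    # number of expensive neural forward passes at inference time.
    scores = bm25.get_scores(query.split())
    candidates = sorted(range(len(docs)), key=lambda i: -scores[i])[:top_k]
    # Stage 2 (DL): re-rank the shortlist with the neural scorer.
    return sorted((docs[i] for i in candidates),
                  key=lambda d: -neural_score(query, d))

print(hybrid_rank("reset password for my account"))
```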