PTeacher: a Computer-Aided Personalized Pronunciation Training System with Exaggerated Audio-Visual Corrective Feedback

373 0 0.0 ( 0 )

Download Cite

Added by Tianyi Ma

Publication date 2021

fields Informatics Engineering

and research's language is English

Authors Yaohua Bu - Tianyi Ma - Weijun Li

Human-Computer Interaction

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Second language (L2) English learners often find it difficult to improve their pronunciations due to the lack of expressive and personalized corrective feedback. In this paper, we present Pronunciation Teacher (PTeacher), a Computer-Aided Pronunciation Training (CAPT) system that provides personalized exaggerated audio-visual corrective feedback for mispronunciations. Though the effectiveness of exaggerated feedback has been demonstrated, it is still unclear how to define the appropriate degrees of exaggeration when interacting with individual learners. To fill in this gap, we interview 100 L2 English learners and 22 professional native teachers to understand their needs and experiences. Three critical metrics are proposed for both learners and teachers to identify the best exaggeration levels in both audio and visual modalities. Additionally, we incorporate the personalized dynamic feedback mechanism given the English proficiency of learners. Based on the obtained insights, a comprehensive interactive pronunciation training course is designed to help L2 learners rectify mispronunciations in a more perceptible, understandable, and discriminative manner. Extensive user studies demonstrate that our system significantly promotes the learners learning efficiency.

rate research

Visual-speech Synthesis of Exaggerated Corrective Feedback

152 - Yaohua Bu , Weijun Li , Tianyi Ma 2020

To provide more discriminative feedback for the second language (L2) learners to better identify their mispronunciation, we propose a method for exaggerated visual-speech feedback in computer-assisted pronunciation training (CAPT). The speech exaggeration is realized by an emphatic speech generation neural network based on Tacotron, while the visual exaggeration is accomplished by ADC Viseme Blending, namely increasing Amplitude of movement, extending the phones Duration and enhancing the color Contrast. User studies show that exaggerated feedback outperforms non-exaggerated version on helping learners with pronunciation identification and pronunciation improvement.

Audio and Speech Processing Artificial Intelligence

Bringing personalized learning into computer-aided question generation

144 - Yi-Ting Huang , Meng Chang Chen , 2018

This paper proposes a novel and statistical method of ability estimation based on acquisition distribution for a personalized computer aided question generation. This method captures the learning outcomes over time and provides a flexible measurement based on the acquisition distributions instead of precalibration. Compared to the previous studies, the proposed method is robust, especially when an ability of a student is unknown. The results from the empirical data show that the estimated abilities match the actual abilities of learners, and the pretest and post-test of the experimental group show significant improvement. These results suggest that this method can serves as the ability estimation for a personalized computer-aided testing environment.

Human-Computer Interaction Artificial Intelligence

A Technology-aided Multi-modal Training Approach to Assist Abdominal Palpation Training and its Assessment in Medical Education

71 - A. Asadipour , K. Debattista , V. Patel 2020

Computer-assisted multimodal training is an effective way of learning complex motor skills in various applications. In particular disciplines (eg. healthcare) incompetency in performing dexterous hands-on examinations (clinical palpation) may result in misdiagnosis of symptoms, serious injuries or even death. Furthermore, a high quality clinical examination can help to exclude significant pathology, and reduce time and cost of diagnosis by eliminating the need for unnecessary medical imaging. Medical palpation is used regularly as an effective preliminary diagnosis method all around the world but years of training are required currently to achieve competency. This paper focuses on a multimodal palpation training system to teach and improve clinical examination skills in relation to the abdomen. It is our aim to shorten significantly the palpation training duration by increasing the frequency of rehearsals as well as providing essential augmented feedback on how to perform various abdominal palpation techniques which has been captured and modelled from medical experts. Twenty three first year medical students divided into a control group (n=8), a semi-visually trained group (n=8), and a fully visually trained group (n=7) were invited to perform three palpation tasks (superficial, deep and liver). The medical students performances were assessed using both computer-based and human-based methods where a positive correlation was shown between the generated scores, r=.62, p(one-tailed)<.05. The visually-trained group significantly outperformed the control group in which abstract visualisation of applied forces and their palmar locations were provided to the students during each palpation examination (p<.05). Moreover, a positive trend was observed between groups when visual feedback was presented, J=132, z=2.62, r=0.55.

Human-Computer Interaction Computer Vision and Pattern Recognition Image and Video Processing

Interactive Rainbow Score: A Visual-centered Multimodal Flute Tutoring System

58 - Daniel Chin , Yian Zhang , Tianyu Zhang 2020

Learning to play an instrument is intrinsically multimodal, and we have seen a trend of applying visual and haptic feedback in music games and computer-aided music tutoring systems. However, most current systems are still designed to master individual pieces of music; it is unclear how well the learned skills can be generalized to new pieces. We aim to explore this question. In this study, we contribute Interactive Rainbow Score, an interactive visual system to boost the learning of sight-playing, the general musical skill to read music and map the visual representations to performance motions. The key design of Interactive Rainbow Score is to associate pitches (and the corresponding motions) with colored notation and further strengthen such association via real-time interactions. Quantitative results show that the interactive feature on average increases the learning efficiency by 31.1%. Further analysis indicates that it is critical to apply the interaction in the early period of learning.

Human-Computer Interaction

Learning Online from Corrective Feedback: A Meta-Algorithm for Robotics

127 - Matthew Schmittle , Sanjiban Choudhury , Siddhartha S. Srinivasa 2021

A key challenge in Imitation Learning (IL) is that optimal state actions demonstrations are difficult for the teacher to provide. For example in robotics, providing kinesthetic demonstrations on a robotic manipulator requires the teacher to control multiple degrees of freedom at once. The difficulty of requiring optimal state action demonstrations limits the space of problems where the teacher can provide quality feedback. As an alternative to state action demonstrations, the teacher can provide corrective feedback such as their preferences or rewards. Prior work has created algorithms designed to learn from specific types of noisy feedback, but across teachers and tasks different forms of feedback may be required. Instead we propose that in order to learn from a diversity of scenarios we need to learn from a variety of feedback. To learn from a variety of feedback we make the following insight: the teachers cost function is latent and we can model a stream of feedback as a stream of loss functions. We then use any online learning algorithm to minimize the sum of these losses. With this insight we can learn from a diversity of feedback that is weakly correlated with the teachers true cost function. We unify prior work into a general corrective feedback meta-algorithm and show that regardless of feedback we can obtain the same regret bounds. We demonstrate our approach by learning to perform a household navigation task on a robotic racecar platform. Our results show that our approach can learn quickly from a variety of noisy feedback.

Robotics Machine Learning