
67 - Lawrence H. Kim 2020
Prior work demonstrated the potential of using Linear Predictive Coding (LPC) to approximate muscle stiffness and damping from computer mouse movement in order to predict users' stress levels. Theoretically, muscle stiffness in the arm can be estimated using a mass-spring-damper (MSD) biomechanical model of the arm. However, the damping frequency and damping ratio values derived using LPC have not yet been compared with those from the theoretical MSD model. In this work, we demonstrate that the damping frequency and damping ratio from LPC are significantly correlated with those from the MSD model, thus confirming the validity of using LPC to infer muscle stiffness and damping. We also compare binary stress-level classification performance using the values from LPC and MSD with each other and with neural network-based baselines. We found comparable performance across all conditions, demonstrating the efficacy of LPC- and MSD-model-based stress prediction.
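A minimal sketch of how second-order LPC coefficients can be mapped to a damping frequency and damping ratio, assuming a one-dimensional movement signal (e.g., mouse speed) sampled at a fixed interval dt; the function names and the synthetic test signal are illustrative, not taken from the paper:

```python
import numpy as np

def lpc_order2(x):
    """Estimate order-2 LPC (AR) coefficients a1, a2 from the Yule-Walker equations."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(3)])  # autocorrelation at lags 0..2
    R = np.array([[r[0], r[1]],
                  [r[1], r[0]]])
    a1, a2 = np.linalg.solve(R, r[1:])  # model: x[n] ~ a1*x[n-1] + a2*x[n-2]
    return a1, a2

def damping_from_lpc(x, dt):
    """Map the AR(2) pole pair to a damped frequency (Hz) and damping ratio."""
    a1, a2 = lpc_order2(x)
    poles = np.roots([1.0, -a1, -a2]).astype(complex)  # roots of z^2 - a1*z - a2 = 0
    p = poles[np.argmax(poles.imag)]                   # pole with positive imaginary part
    s = np.log(p) / dt                                 # map z-plane pole to the s-plane
    wn = np.abs(s)                                     # natural frequency (rad/s)
    zeta = -s.real / wn                                # damping ratio
    f_damped = s.imag / (2 * np.pi)                    # damped frequency (Hz)
    return f_damped, zeta

# Synthetic check: a 5 Hz oscillation with decay rate 2 (1/s), sampled at 100 Hz
dt = 0.01
t = np.arange(0, 2, dt)
signal = np.exp(-2 * t) * np.sin(2 * np.pi * 5 * t)
print(damping_from_lpc(signal, dt))  # roughly (5.0, 0.06)
```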
Machine learning approaches for building task-oriented dialogue systems require large labeled conversational datasets to train on. We are interested in building task-oriented dialogue systems from human-human conversations, which may be available in ample amounts in existing customer care center logs or can be collected from crowd workers. Annotating these datasets can be prohibitively expensive. Recently, multiple annotated task-oriented human-machine dialogue datasets have been released; however, their annotation schemas vary across collections, even for well-defined categories such as dialogue acts (DAs). We propose a Universal DA schema for task-oriented dialogues and align existing annotated datasets with our schema. Our aim is to train a Universal DA tagger (U-DAT) for task-oriented dialogues and use it for tagging human-human conversations. We investigate multiple datasets, propose manual and automated approaches for aligning the different schemas, and present results on a target corpus of human-human dialogues. In unsupervised learning experiments, we achieve an F1 score of 54.1% on system turns in human-human dialogues. In a semi-supervised setup, the F1 score increases to 57.7%, a level that would otherwise require at least 1.7K manually annotated turns. For new domains, we show further improvements when unlabeled or labeled target-domain data is available.
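To make the schema-alignment step concrete, here is a tiny hypothetical sketch of mapping dataset-specific dialogue-act labels onto a shared tag set; the dataset names are corpora mentioned elsewhere in this listing, but the label strings and the mapping itself are invented for illustration and are not the paper's actual Universal DA schema:

```python
# Hypothetical mapping from dataset-specific DA labels to a shared ("universal") tag set.
UNIVERSAL_DA_MAP = {
    "dstc2":    {"inform": "INFORM", "request": "REQUEST", "confirm-domain": "CONFIRM"},
    "multiwoz": {"Inform": "INFORM", "Request": "REQUEST", "Recommend": "OFFER"},
}

def to_universal(dataset, label):
    """Map a dataset-specific dialogue-act label to the shared schema (None if unmapped)."""
    return UNIVERSAL_DA_MAP.get(dataset, {}).get(label)

print(to_universal("multiwoz", "Request"))  # -> "REQUEST"
```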
MultiWOZ 2.0 (Budzianowski et al., 2018) is a recently released multi-domain dialogue dataset spanning 7 distinct domains and containing over 10,000 dialogues. Though immensely useful and one of the largest resources of its kind to date, MultiWOZ 2.0 has a few shortcomings. Firstly, there is substantial noise in the dialogue state annotations and dialogue utterances, which negatively impacts the performance of state-tracking models. Secondly, follow-up work (Lee et al., 2019) has augmented the original dataset with user dialogue acts. This leads to multiple co-existent versions of the same dataset.
Recent works on end-to-end trainable neural network based approaches have demonstrated state-of-the-art results on dialogue state tracking. The best performing approaches estimate a probability distribution over all possible slot values. However, these approaches do not scale to the large value sets commonly present in real-life applications and are not ideal for tracking slot values that were not observed in the training set. To tackle these issues, candidate-generation-based approaches have been proposed. These approaches estimate a set of values that are possible at each turn based on the conversation history and/or language understanding outputs, and hence enable state tracking over unseen values and large value sets; however, they fall short of the first group in terms of performance. In this work, we analyze the performance of these two alternative dialogue state tracking methods, and present a hybrid approach (HyST) which learns the appropriate method for each slot type. To demonstrate the effectiveness of HyST on a rich set of slot types, we experiment with the recently released MultiWOZ 2.0 multi-domain, task-oriented dialogue dataset. Our experiments show that HyST scales to multi-domain applications. Our best performing model yields a relative improvement of 24% and 10% over the previous state of the art and our best baseline, respectively.
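A rough sketch of the hybrid idea, i.e. picking, per slot, between a fixed-vocabulary tracker and an open-vocabulary candidate-list tracker based on held-out accuracy; the slot names and scores below are placeholders, not numbers from the paper:

```python
# Placeholder held-out accuracies per slot for the two tracking strategies.
dev_accuracy = {
    "hotel-pricerange": {"fixed_vocab": 0.95, "candidate_list": 0.90},
    "restaurant-name":  {"fixed_vocab": 0.70, "candidate_list": 0.85},
}

# Choose, per slot, the strategy with the higher held-out accuracy.
slot_strategy = {
    slot: max(scores, key=scores.get)
    for slot, scores in dev_accuracy.items()
}
print(slot_strategy)  # {'hotel-pricerange': 'fixed_vocab', 'restaurant-name': 'candidate_list'}
```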
Goal-oriented dialogue systems typically rely on components specifically developed for a single task or domain. This limits such systems in two different ways: if there is an update in the task domain, the dialogue system usually needs to be updated or completely re-trained, and it is also harder to extend such dialogue systems to different and multiple domains. The dialogue state tracker in conventional dialogue systems is one such component - it is usually designed to fit a well-defined application domain. For example, it is common for a state variable to be a categorical distribution over a manually-predefined set of entities (Henderson et al., 2013), resulting in an inflexible and hard-to-extend dialogue system. In this paper, we propose a new approach for dialogue state tracking that can generalize well over multiple domains without incorporating any domain-specific knowledge. Under this framework, discrete dialogue state variables are learned independently and the information of a predefined set of possible values for dialogue state variables is not required. Furthermore, it enables adding arbitrary dialogue context as features and allows for multiple values to be associated with a single state variable. These characteristics make it much easier to expand the dialogue state space. We evaluate our framework using the widely used Dialogue State Tracking Challenge dataset (DSTC2) and show that it yields results competitive with the state of the art despite incorporating little domain knowledge. We also show that this framework can benefit from widely available external resources such as pre-trained word embeddings.
Deep learning models have become state of the art for natural language processing (NLP) tasks; however, deploying these models in production systems poses significant memory constraints. Existing compression methods are either lossy or introduce significant latency. We propose a compression method that leverages low-rank matrix factorization during training to compress the word embedding layer, which represents the size bottleneck for most NLP models. Our models are trained, compressed, and then further re-trained on the downstream task to recover accuracy while maintaining the reduced size. Empirically, we show that the proposed method can achieve 90% compression with minimal impact on accuracy for sentence classification tasks, and outperforms alternative methods like fixed-point quantization or offline word embedding compression. We also analyze the inference time and storage space for our method through FLOP calculations, showing that we can compress DNN models by a configurable ratio and recover the accuracy loss without introducing additional latency compared to fixed-point quantization. Finally, we introduce a novel learning rate schedule, the Cyclically Annealed Learning Rate (CALR), which we empirically demonstrate to outperform other popular adaptive learning rate algorithms on a sentence classification benchmark.
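A minimal sketch, assuming PyTorch, of the kind of compression the abstract describes: replacing a full V x d embedding table with a low-rank factorization (V x r followed by r x d) that is trained end to end. The vocabulary size, embedding dimension, and rank below are illustrative, not the paper's settings:

```python
import torch
import torch.nn as nn

class LowRankEmbedding(nn.Module):
    """Embedding layer factored as a (V x r) lookup followed by an (r x d) projection."""
    def __init__(self, vocab_size, embed_dim, rank):
        super().__init__()
        self.low = nn.Embedding(vocab_size, rank)            # V x r lookup table
        self.proj = nn.Linear(rank, embed_dim, bias=False)   # r x d projection
    def forward(self, token_ids):
        return self.proj(self.low(token_ids))

# Illustrative sizes: 50k vocabulary, 300-dim embeddings, rank 30
full_params = 50_000 * 300
lowrank_params = 50_000 * 30 + 30 * 300
print(f"parameter reduction: {1 - lowrank_params / full_params:.1%}")  # ~89.9%

emb = LowRankEmbedding(50_000, 300, 30)
print(emb(torch.tensor([[1, 2, 3]])).shape)  # torch.Size([1, 3, 300])
```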
Street imagery is a promising big data source providing current and historical images in more than 100 countries. Previous studies used this data to audit built environment features. Here we explore a novel application, using Google Street View (GSV) to predict travel patterns at the city level. We sampled 34 cities in Great Britain. In each city, we accessed GSV images from 1000 random locations from years overlapping with the 2011 Census and the 2011-2013 Active People Survey (APS). We manually annotated images into seven categories of road users. We developed regression models with the counts of images of road users as predictors. Outcomes included Census-reported commute shares of four modes (walking plus public transport, cycling, motorcycle, and car), and APS-reported past-month participation in walking and cycling. In bivariate analyses, we found high correlations between GSV counts of cyclists (GSV-cyclists) and cycle commute mode share (r=0.92) and past-month cycling (r=0.90). Likewise, GSV-pedestrians was moderately correlated with past-month walking for transport (r=0.46), GSV-motorcycles was moderately correlated with commute share of motorcycles (r=0.44), and GSV-buses was highly correlated with commute share of walking plus public transport (r=0.81). GSV-car was not correlated with car commute mode share (r=-0.12). However, in multivariable regression models, all mode shares were predicted well. Cross-validation analyses showed good prediction performance for all the outcomes except past-month walking. Street imagery is a promising new big data source to predict urban mobility patterns. Further testing across multiple settings is warranted both for cross-sectional and longitudinal assessments.
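A schematic of the bivariate and multivariable setup described above, assuming per-city GSV road-user counts and a Census-reported mode share held in a pandas DataFrame; the column names and the toy numbers are invented for illustration and are not the study's data:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical per-city table: GSV road-user counts and a Census-reported mode share.
df = pd.DataFrame({
    "gsv_cyclists":     [12, 45, 7, 30],
    "gsv_pedestrians":  [80, 150, 60, 110],
    "gsv_buses":        [5, 14, 3, 9],
    "cycle_mode_share": [0.02, 0.08, 0.01, 0.05],
})

# Bivariate correlation between a single count and the outcome (cf. r=0.92 for cyclists).
print(df["gsv_cyclists"].corr(df["cycle_mode_share"]))

# Multivariable regression with several GSV counts as predictors.
X = df[["gsv_cyclists", "gsv_pedestrians", "gsv_buses"]]
y = df["cycle_mode_share"]
model = LinearRegression().fit(X, y)
print(model.score(X, y))  # in-sample R^2 on toy data; the study uses cross-validation
```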
