
Modeling Accurate Human Activity Recognition for Embedded Devices Using Multi-level Distillation

Posted by: Runze Chen
Publication date: 2021
Research field: Informatics Engineering
Paper language: English





Human Activity Recognition (HAR) based on IMU sensors is a crucial area in ubiquitous computing. With the trend of deploying AI on IoT devices and smartphones, more researchers are designing HAR models for embedded devices, since on-device deployment can improve the efficiency of HAR. We propose a multi-level HAR modeling pipeline called Stage-Logits-Memory Distillation (SMLDist) for constructing deep convolutional HAR models with embedded hardware support. SMLDist comprises stage distillation, memory distillation, and logits distillation. Stage distillation constrains the learning direction of the intermediate features. In memory distillation, the teacher model teaches the student models how to explain and store the inner relationships among high-dimensional features based on Hopfield networks. Logits distillation smooths the teacher's logits with a conditional rule to preserve the probability distribution and improve accuracy on the softened targets. We compare the accuracy, macro F1 score, and energy cost on embedded platforms of a MobileNet V3 model built by SMLDist against various state-of-the-art HAR frameworks. The resulting model strikes a good balance between robustness and efficiency. On seven public datasets, SMLDist also compresses models with only minor performance loss at a compression ratio equal to that of other advanced knowledge distillation methods.
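
The logits-distillation step can be read as a temperature-softened distillation loss between teacher and student outputs. Below is a minimal PyTorch sketch of that standard formulation; the temperature T and mixing weight alpha are assumptions, and the snippet does not reproduce the paper's exact smoothed conditional rule:

import torch
import torch.nn.functional as F

def logits_distillation_loss(student_logits, teacher_logits, labels,
                             T=4.0, alpha=0.7):
    # Softened teacher/student distributions; a higher T keeps more of the
    # teacher's inter-class structure ("softer targets").
    soft_teacher = F.softmax(teacher_logits / T, dim=1)
    log_soft_student = F.log_softmax(student_logits / T, dim=1)
    # KL divergence between the softened distributions, scaled by T^2 so its
    # gradient magnitude stays comparable to the hard-label term.
    kd = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * (T * T)
    # Standard cross-entropy on the ground-truth activity labels.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce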




Read also

Human Activity Recognition (HAR) has been considered a valuable research topic over the last few decades. Different types of machine learning models are used for this purpose as part of analyzing human behavior through machines. Analyzing the data from wearable sensors is not a trivial task, since the data are complex and high-dimensional. Nowadays, researchers mostly use smartphones or smart-home sensors to capture these data. In our paper, we analyze these data using machine learning models to recognize human activities, which are now widely used for many purposes such as physical and mental health monitoring. We apply different machine learning models and compare their performance. We use Logistic Regression (LR) as the benchmark model for its simplicity and excellent performance on the dataset, and for comparison we take Decision Tree (DT), Support Vector Machine (SVM), Random Forest (RF), and Artificial Neural Network (ANN). Additionally, we select the best set of parameters for each model by grid search; a sketch of that step follows. We use the HAR dataset from the UCI Machine Learning Repository as a standard dataset to train and test the models. Throughout the analysis, we can see that the Support Vector Machine (average accuracy 96.33%) performed far better than the other methods. We also show that the results are statistically significant by employing statistical significance tests.
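
Since the abstract leans on grid search for hyperparameter selection, here is a minimal scikit-learn sketch of that step; the parameter grid, the synthetic stand-in data, and the pipeline are illustrative assumptions rather than the paper's actual setup:

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the UCI HAR feature vectors (6 activity classes).
X_train, y_train = make_classification(n_samples=500, n_features=20,
                                       n_informative=10, n_classes=6,
                                       random_state=0)

param_grid = {  # hypothetical grid; the paper tunes each model separately
    "svc__C": [0.1, 1, 10, 100],
    "svc__gamma": ["scale", 0.01, 0.001],
    "svc__kernel": ["rbf", "linear"],
}
search = GridSearchCV(make_pipeline(StandardScaler(), SVC()),
                      param_grid, cv=5, scoring="accuracy", n_jobs=-1)
search.fit(X_train, y_train)
print(search.best_params_, search.best_score_)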
This study presents a novel method to recognize human physical activities using a CNN followed by an LSTM. Achieving high accuracy with traditional machine learning algorithms (such as SVM, KNN, and the random forest method) is challenging because the data acquired from wearable sensors like accelerometers and gyroscopes are time series. To achieve high accuracy, we therefore propose a multi-head CNN model comprising three CNNs that extract features from the data acquired from different sensors; all three CNNs are then merged and followed by an LSTM layer and a dense layer. The configuration of all three CNNs is kept the same so that the same number of features is obtained for every input to a CNN. Using the proposed method, we achieve state-of-the-art accuracy, comparable to traditional machine learning algorithms and other deep neural network algorithms.
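
A minimal PyTorch sketch of the described architecture, with three identical CNN heads merged and fed to an LSTM; the channel counts, kernel size, window length, and number of classes are illustrative assumptions:

import torch
import torch.nn as nn

class MultiHeadCNNLSTM(nn.Module):
    def __init__(self, in_channels=3, n_classes=6):
        super().__init__()
        # Three identical 1-D CNN heads, one per sensor stream
        # (e.g. accelerometer, gyroscope, and a third sensor).
        def head():
            return nn.Sequential(
                nn.Conv1d(in_channels, 32, kernel_size=5, padding=2),
                nn.ReLU(),
                nn.MaxPool1d(2),
            )
        self.heads = nn.ModuleList([head() for _ in range(3)])
        # The LSTM consumes the concatenated per-timestep features.
        self.lstm = nn.LSTM(input_size=32 * 3, hidden_size=64, batch_first=True)
        self.fc = nn.Linear(64, n_classes)

    def forward(self, streams):  # streams: list of 3 tensors, each (B, C, T)
        feats = [h(x) for h, x in zip(self.heads, streams)]  # each (B, 32, T/2)
        merged = torch.cat(feats, dim=1).transpose(1, 2)     # (B, T/2, 96)
        out, _ = self.lstm(merged)
        return self.fc(out[:, -1])                           # last step -> dense

# Usage on a hypothetical batch of 128-sample windows:
x = [torch.randn(8, 3, 128) for _ in range(3)]
logits = MultiHeadCNNLSTM()(x)   # (8, 6)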
Ziyu Jia, Youfang Lin, Jing Wang (2021)
The research on human emotion under multimedia stimulation based on physiological signals is an emerging field, and important progress has been achieved for emotion recognition based on multi-modal signals. However, it is challenging to make full use of the complementarity among spatial-spectral-temporal domain features for emotion recognition, as well as model the heterogeneity and correlation among multi-modal signals. In this paper, we propose a novel two-stream heterogeneous graph recurrent neural network, named HetEmotionNet, fusing multi-modal physiological signals for emotion recognition. Specifically, HetEmotionNet consists of the spatial-temporal stream and the spatial-spectral stream, which can fuse spatial-spectral-temporal domain features in a unified framework. Each stream is composed of the graph transformer network for modeling the heterogeneity, the graph convolutional network for modeling the correlation, and the gated recurrent unit for capturing the temporal domain or spectral domain dependency. Extensive experiments on two real-world datasets demonstrate that our proposed model achieves better performance than state-of-the-art baselines.
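
As a rough sketch of how one stream might combine graph convolution with a recurrent unit, the snippet below uses a plain GCN layer feeding a GRU; it deliberately omits the graph transformer network, and all dimensions (number of channels, feature size, classes) are assumptions, not HetEmotionNet's actual design:

import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph convolution: H' = ReLU(A_hat @ H @ W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, h, a_hat):  # h: (B, N, F), a_hat: (N, N) norm. adjacency
        return torch.relu(self.lin(a_hat @ h))

class SpatialTemporalStream(nn.Module):
    def __init__(self, n_nodes=8, feat_dim=5, hidden=16, n_classes=4):
        super().__init__()
        self.gcn = GCNLayer(feat_dim, hidden)
        self.gru = nn.GRU(n_nodes * hidden, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x, a_hat):  # x: (B, T, N, F) per-timestep node features
        B, T, N, F_ = x.shape
        h = self.gcn(x.reshape(B * T, N, F_), a_hat)  # graph conv per timestep
        h = h.reshape(B, T, -1)                       # flatten nodes per step
        out, _ = self.gru(h)                          # temporal dependency
        return self.fc(out[:, -1])

a_hat = torch.eye(8)                # placeholder adjacency over 8 channels
x = torch.randn(2, 30, 8, 5)        # (batch, timesteps, channels, features)
print(SpatialTemporalStream()(x, a_hat).shape)   # torch.Size([2, 4])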
Successful teaching requires an assumption of how the learner learns: how the learner uses experiences from the world to update their internal states. We investigate what expectations people have about a learner when they teach them in an online manner using rewards and punishment. We focus on a common reinforcement learning method, Q-learning, and examine what assumptions people have using a behavioral experiment. To do so, we first establish a normative standard by formulating the problem as a machine teaching optimization problem. To solve the machine teaching optimization problem, we use a deep learning approximation method which simulates learners in the environment and learns to predict how feedback affects the learner's internal states. What do people assume about a learner's learning and discount rates when they teach them an idealized exploration-exploitation task? In a behavioral experiment, we find that people can teach the task to Q-learners in a relatively efficient and effective manner when the learner uses a small value for its discount rate and a large value for its learning rate. However, they are still suboptimal. We also find that providing people with real-time updates of how possible feedback would affect the Q-learner's internal states weakly helps them teach. Our results reveal how people teach using evaluative feedback and provide guidance for how engineers should design machine agents in a manner that is intuitive for people.
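
For reference, here is a minimal tabular sketch of the Q-learning update at the heart of the study; the learning rate alpha and discount rate gamma are exactly the two quantities whose assumed values the experiment probes (the state/action sizes here are hypothetical):

import numpy as np

def q_update(Q, s, a, r, s_next, alpha=0.9, gamma=0.1):
    """One Q-learning step:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).

    The behavioral result suggests teaching works best when the learner
    uses a large learning rate (alpha) and a small discount rate (gamma).
    """
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

# Hypothetical 5-state, 2-action learner receiving a teacher's reward r=+1.
Q = np.zeros((5, 2))
Q = q_update(Q, s=0, a=1, r=1.0, s_next=2)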
Data augmentation is a widely used technique in classification to increase the data used in training. It improves generalization and reduces the amount of annotated human activity data needed for training, which reduces the labour and time needed to build the dataset. Sensor time-series data, unlike images, cannot be augmented by computationally simple transformation algorithms. State-of-the-art models like Recurrent Generative Adversarial Networks (RGAN) are used to generate realistic synthetic data. In this paper, transformer-based generative adversarial networks, which have global attention on the data, are compared with RGAN on the PAMAP2 and Real World Human Activity Recognition datasets. The newer approach provides improvements in time and savings in the computational resources needed for data augmentation compared with the previous approach.
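
A skeletal PyTorch sketch of a transformer-based generator for sensor windows, using nn.TransformerEncoder for the global attention the abstract mentions; the dimensions, layer counts, and the omitted discriminator and training loop are all illustrative assumptions:

import torch
import torch.nn as nn

class TSTransformerGenerator(nn.Module):
    """Maps a noise sequence to a synthetic sensor window via self-attention."""
    def __init__(self, noise_dim=16, channels=6, d_model=64):
        super().__init__()
        self.inp = nn.Linear(noise_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.out = nn.Linear(d_model, channels)

    def forward(self, z):              # z: (B, seq_len, noise_dim)
        h = self.encoder(self.inp(z))  # global attention across all timesteps
        return self.out(h)             # (B, seq_len, channels), e.g. IMU axes

z = torch.randn(4, 128, 16)
fake_windows = TSTransformerGenerator()(z)  # feed to a discriminator when training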

