The electronic nose has proven effective for classifying alternative herbal medicines, but because previous research relies on supervised learning, it depends heavily on labelled training data, which are time-consuming and labor-intensive to collect. To reduce this dependency in real-world applications, this study aims to improve classification accuracy through data augmentation strategies. The effectiveness of five data augmentation strategies under different degrees of training data inadequacy is investigated in two scenarios: a noise-free scenario, in which different availabilities of unlabelled data are considered, and a noisy scenario, in which different levels of Gaussian noise and translational shift are added to represent sensor drift. The five augmentation strategies, namely noise-adding data augmentation, semi-supervised learning, classifier-based online learning, Inductive Conformal Prediction (ICP) online learning, and the novel ensemble ICP online learning proposed in this study, are evaluated and compared against a supervised learning baseline, with Linear Discriminant Analysis (LDA) and Support Vector Machine (SVM) as the classifiers. Our novel strategy, ensemble ICP online learning, outperforms the others, showing non-decreasing classification accuracy on all tasks and a significant improvement on most simulated tasks (25 out of 36 tasks, p <= 0.05). Furthermore, this study provides a systematic analysis of the different augmentation strategies. It shows that, in each task, at least one strategy significantly improved the classification accuracy with LDA (p <= 0.05) and yielded non-decreasing classification accuracy with SVM. In particular, our proposed strategy demonstrated both effectiveness and robustness in improving the generalizability of the classification model, and it can be employed in other machine learning applications.
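To make three of the ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It illustrates noise-adding augmentation, a translational shift used to imitate sensor drift, and the standard ICP p-value computed from nonconformity scores. The function names, the `sigma` and `offset` values, and the toy data are all hypothetical; the abstract does not state the noise levels, shift sizes, or nonconformity measure actually used.

```python
import random


def add_gaussian_noise(samples, sigma, copies, rng):
    """Return `copies` noisy replicas of every (features, label) pair.

    Assumption: zero-mean Gaussian noise with standard deviation `sigma`
    is added independently to each feature; labels are kept unchanged.
    """
    augmented = []
    for features, label in samples:
        for _ in range(copies):
            noisy = [x + rng.gauss(0.0, sigma) for x in features]
            augmented.append((noisy, label))
    return augmented


def add_translational_shift(samples, offset):
    """Add the same constant `offset` to every feature (a crude drift model)."""
    return [([x + offset for x in features], label) for features, label in samples]


def icp_p_value(calibration_scores, test_score):
    """Standard ICP p-value for one candidate label.

    It is the smoothed fraction of calibration nonconformity scores that
    are at least as large as the test sample's score:
        p = (#{a_i >= a_test} + 1) / (n + 1)
    """
    at_least = sum(1 for score in calibration_scores if score >= test_score)
    return (at_least + 1) / (len(calibration_scores) + 1)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    train = [([1.0, 2.0], "A"), ([3.0, 4.0], "B")]  # hypothetical toy data

    augmented = add_gaussian_noise(train, sigma=0.1, copies=3, rng=rng)
    drifted = add_translational_shift(train, offset=0.5)
    print(len(augmented), "augmented samples;", len(drifted), "drifted samples")

    # One of three calibration scores is >= 0.5, so p = (1 + 1) / (3 + 1).
    print(icp_p_value([0.1, 0.4, 0.9], 0.5))
```

In an ICP-based online learning loop, a p-value of this kind could serve as a confidence filter: an unlabelled sample would be added to the training set only when its predicted label is sufficiently credible. The threshold and the exact acceptance rule are design choices that the abstract does not specify.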
In machine learning applications, the reliability of predictions is significant for assisted decision and risk control. As an effective framework to quantify the prediction reliability, conformal prediction (CP) was developed with the CPKNN (CP with
The origins of herbal medicines are important for their treatment effect, which could be potentially distinguished by electronic nose system. As the odor fingerprint of herbal medicines from different origins can be tiny, the discrimination of origin
Data augmentation by mixing samples, such as Mixup, has widely been used typically for classification tasks. However, this strategy is not always effective due to the gap between augmented samples for training and original samples for testing. This g
Medical imaging is a domain which suffers from a paucity of manually annotated data for the training of learning algorithms. Manually delineating pathological regions at a pixel level is a time consuming process, especially in 3D images, and often re
The Graph Convolutional Networks (GCNs) proposed by Kipf and Welling are effective models for semi-supervised learning, but they face the obstacle of over-smoothing, which weakens the representation ability of GCNs. Recently some works are proposed