Cardiovascular diseases are the leading cause of death and pose a severe threat to human health in daily life. On the one hand, demand for monitoring the heart status of subjects with chronic cardiovascular diseases has been increasing dramatically in both clinical practice and smart home applications. On the other hand, there is still a shortage of experienced physicians who can perform efficient auscultation. Automatic heart sound classification, leveraging the power of advanced signal processing and machine learning technologies, has shown encouraging results. Nevertheless, hand-crafted features are expensive and time-consuming to design. To this end, we propose a novel deep representation learning method with an attention mechanism for heart sound classification. In this paradigm, high-level representations are learnt automatically from the recorded heart sound data. In particular, a global attention pooling layer improves the performance of the learnt representations by estimating the contribution of each unit in the feature maps. The Heart Sounds Shenzhen (HSS) corpus (involving 170 subjects) is used to validate the proposed method. Experimental results show that our approach achieves an unweighted average recall of 51.2% for classifying three categories of heart sounds, i.e., normal, mild, and moderate/severe, as annotated by cardiologists with the help of echocardiography.
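As a rough illustration of how such a global attention pooling layer can estimate the contribution of each unit in a feature map, the following PyTorch sketch weights per-unit class evidence by a learned, normalised attention score. This is not the authors' exact model: the channel count, the 1x1-convolution heads, and the sigmoid/softmax split are all assumptions.

```python
import torch
import torch.nn as nn

class GlobalAttentionPooling(nn.Module):
    """Pools a CNN feature map into a clip-level prediction by weighting
    each time-frequency unit with a learned attention score.
    (Hypothetical sketch; layer choices are assumptions, not the paper's.)"""
    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        # 1x1 convs produce per-unit class scores and attention logits
        self.cla = nn.Conv2d(in_channels, num_classes, kernel_size=1)
        self.att = nn.Conv2d(in_channels, num_classes, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, freq, time) feature maps from a CNN backbone
        scores = torch.sigmoid(self.cla(x))        # per-unit class evidence
        weights = torch.softmax(
            self.att(x).flatten(2), dim=-1         # normalise over all units
        ).view_as(scores)
        # attention-weighted sum over the spatial units
        return (scores * weights).flatten(2).sum(dim=-1)  # (batch, num_classes)

# usage: pool 128-channel feature maps into the three heart sound categories
pool = GlobalAttentionPooling(in_channels=128, num_classes=3)
logits = pool(torch.randn(4, 128, 8, 16))
print(logits.shape)  # torch.Size([4, 3])
```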
This thesis focuses on the task of acoustic scene classification (ASC) and then applies the techniques developed for ASC to a real-life application: detecting respiratory disease. To deal with ASC challenges, this thesis addresses thr…
In psychology, actions are paramount for humans to identify sound events. In Machine Learning (ML), action recognition achieves high accuracy; however, it has not been asked whether identifying actions can benefit Sound Event Classification (SEC), as…
This paper proposes a deep learning framework for the classification of BBC television programmes using audio. The audio is first transformed into spectrograms, which are fed into a pre-trained convolutional neural network (CNN), obtaining predicted pr…
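A minimal sketch of such a spectrogram-plus-pre-trained-CNN pipeline might look as follows. The backbone (an ImageNet ResNet-18), the mel-spectrogram settings, and the number of programme classes are assumptions, since the abstract does not specify them.

```python
import torch
import torchaudio
import torchvision

# Hypothetical pipeline: waveform -> log-mel spectrogram -> pre-trained CNN.
NUM_CLASSES = 5  # assumed number of programme genres

melspec = torchaudio.transforms.MelSpectrogram(
    sample_rate=16_000, n_fft=1024, hop_length=512, n_mels=64
)
backbone = torchvision.models.resnet18(weights="IMAGENET1K_V1")
backbone.fc = torch.nn.Linear(backbone.fc.in_features, NUM_CLASSES)

def classify(waveform: torch.Tensor) -> torch.Tensor:
    """waveform: (1, samples) mono audio at 16 kHz."""
    spec = torch.log1p(melspec(waveform))        # (1, 64, frames)
    spec = spec.unsqueeze(0).repeat(1, 3, 1, 1)  # fake RGB for the ImageNet CNN
    with torch.no_grad():
        return backbone(spec).softmax(dim=-1)    # predicted class probabilities

probs = classify(torch.randn(1, 16_000 * 4))     # 4 s of dummy audio
```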
This paper proposes a noise-type classification aided attention-based neural network approach for monaural speech enhancement. The network is constructed based on a previous work by introducing a noise classification subnetwork into the structure an…
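One plausible reading of "introducing a noise classification subnetwork" is a shared encoder whose noise-type posterior conditions the enhancement mask. The sketch below illustrates that idea only; all layer sizes and the conditioning scheme are assumed rather than taken from the paper.

```python
import torch
import torch.nn as nn

class NoiseAwareEnhancer(nn.Module):
    """Sketch of a mask-based enhancer whose mask branch is conditioned on a
    noise-type posterior (architecture details assumed, not the paper's)."""
    def __init__(self, n_bins: int = 257, n_noise_types: int = 10, hidden: int = 256):
        super().__init__()
        self.encoder = nn.LSTM(n_bins, hidden, batch_first=True)
        self.noise_cls = nn.Linear(hidden, n_noise_types)        # subnetwork head
        self.mask = nn.Linear(hidden + n_noise_types, n_bins)

    def forward(self, mag: torch.Tensor):
        # mag: (batch, frames, n_bins) noisy magnitude spectrogram
        h, _ = self.encoder(mag)
        noise_post = self.noise_cls(h.mean(dim=1)).softmax(-1)   # (batch, types)
        cond = noise_post.unsqueeze(1).expand(-1, h.size(1), -1)
        mask = torch.sigmoid(self.mask(torch.cat([h, cond], dim=-1)))
        return mask * mag, noise_post   # enhanced magnitude + noise class

net = NoiseAwareEnhancer()
enhanced, noise_post = net(torch.rand(2, 100, 257))
```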
In this paper, we present deep learning frameworks for audio-visual scene classification (SC) and indicate how individual visual and audio features as well as their combination affect SC performance. Our extensive experiments, which are conducted on…
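A common way to study individual features alongside their combination is to train per-modality heads next to a concatenation-fusion head. The sketch below shows that generic pattern only; the embedding sizes and scene count are purely hypothetical.

```python
import torch
import torch.nn as nn

class LateFusionSC(nn.Module):
    """Illustrative late-fusion scene classifier: the audio and visual
    embeddings are assumed to come from separate pre-trained backbones."""
    def __init__(self, a_dim: int = 512, v_dim: int = 2048, n_scenes: int = 10):
        super().__init__()
        self.audio_head = nn.Linear(a_dim, n_scenes)
        self.visual_head = nn.Linear(v_dim, n_scenes)
        self.joint_head = nn.Linear(a_dim + v_dim, n_scenes)

    def forward(self, a_emb: torch.Tensor, v_emb: torch.Tensor):
        # per-modality predictions plus a joint prediction on fused features
        return (self.audio_head(a_emb),
                self.visual_head(v_emb),
                self.joint_head(torch.cat([a_emb, v_emb], dim=-1)))

model = LateFusionSC()
a_logits, v_logits, av_logits = model(torch.randn(4, 512), torch.randn(4, 2048))
```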