Deep Neural Network Based Respiratory Pathology Classification Using Cough Sounds

106 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Dorien Herremans

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Balamurali B T - Hwan Ing Hee - Saumitra Kapoor

التعلم الآلي الوسائط المتعددة أنظمة الصوت في الحاسوب

قم بزيارة صفحتنا على فيسبوك

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Intelligent systems are transforming the world, as well as our healthcare system. We propose a deep learning-based cough sound classification model that can distinguish between children with healthy versus pathological coughs such as asthma, upper respiratory tract infection (URTI), and lower respiratory tract infection (LRTI). In order to train a deep neural network model, we collected a new dataset of cough sounds, labelled with clinicians diagnosis. The chosen model is a bidirectional long-short term memory network (BiLSTM) based on Mel Frequency Cepstral Coefficients (MFCCs) features. The resulting trained model when trained for classifying two classes of coughs -- healthy or pathology (in general or belonging to a specific respiratory pathology), reaches accuracy exceeding 84% when classifying cough to the label provided by the physicians diagnosis. In order to classify subjects respiratory pathology condition, results of multiple cough epochs per subject were combined. The resulting prediction accuracy exceeds 91% for all three respiratory pathologies. However, when the model is trained to classify and discriminate among the four classes of coughs, overall accuracy dropped: one class of pathological coughs are often misclassified as other. However, if one consider the healthy cough classified as healthy and pathological cough classified to have some kind of pathologies, then the overall accuracy of four class model is above 84%. A longitudinal study of MFCC feature space when comparing pathological and recovered coughs collected from the same subjects revealed the fact that pathological cough irrespective of the underlying conditions occupy the same feature space making it harder to differentiate only using MFCC features.

قيم البحث

262 - Madhurananda Pahar , Igor Miranda , Andreas Diacon 2021

We have performed cough detection based on measurements from an accelerometer attached to the patients bed. This form of monitoring is less intrusive than body-attached accelerometer sensors, and sidesteps privacy concerns encountered when using audi o for cough detection. For our experiments, we have compiled a manually-annotated dataset containing the acceleration signals of approximately 6000 cough and 68000 non-cough events from 14 adult male patients in a tuberculosis clinic. As classifiers, we have considered convolutional neural networks (CNN), long-short-term-memory (LSTM) networks, and a residual neural network (Resnet50). We find that all classifiers are able to distinguish between the acceleration signals due to coughing and those due to other activities including sneezing, throat-clearing and movement in the bed with high accuracy. The Resnet50 performs the best, achieving an area under the ROC curve (AUC) exceeding 0.98 in cross-validation experiments. We conclude that high-accuracy cough monitoring based only on measurements from the accelerometer in a consumer smartphone is possible. Since the need to gather audio is avoided and therefore privacy is inherently protected, and since the accelerometer is attached to the bed and not worn, this form of monitoring may represent a more convenient and readily accepted method of long-term patient cough monitoring.

التعلم الآلي أنظمة الصوت في الحاسوب معالجة الصوت والكلام

Deep Learning Framework Applied for Predicting Anomaly of Respiratory Sounds

163 - Dat Ngo , Lam Pham , Anh Nguyen 2020

This paper proposes a robust deep learning framework used for classifying anomaly of respiratory cycles. Initially, our framework starts with front-end feature extraction step. This step aims to transform the respiratory input sound into a two-dimens ional spectrogram where both spectral and temporal features are well presented. Next, an ensemble of C- DNN and Autoencoder networks is then applied to classify into four categories of respiratory anomaly cycles. In this work, we conducted experiments over 2017 Internal Conference on Biomedical Health Informatics (ICBHI) benchmark dataset. As a result, we achieve competitive performances with ICBHI average score of 0.49, ICBHI harmonic score of 0.42.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط أنظمة الصوت في الحاسوب

Co-Separating Sounds of Visual Objects

79 - Ruohan Gao , Kristen Grauman 2019

Learning how objects sound from video is challenging, since they often heavily overlap in a single audio channel. Current methods for visually-guided audio source separation sidestep the issue by training with artificially mixed video clips, but this puts unwieldy restrictions on training data collection and may even prevent learning the properties of true mixed sounds. We introduce a co-separation training paradigm that permits learning object-level sounds from unlabeled multi-source videos. Our novel training objective requires that the deep neural networks separated audio for similar-looking objects be consistently identifiable, while simultaneously reproducing accurate video-level audio tracks for each source training pair. Our approach disentangles sounds in realistic test videos, even in cases where an object was not observed individually during training. We obtain state-of-the-art results on visually-guided audio source separation and audio denoising for the MUSIC, AudioSet, and AV-Bench datasets.

الرؤية الحاسوبية وتمييز الأنماط الوسائط المتعددة أنظمة الصوت في الحاسوب

Using Deep Learning Techniques and Inferential Speech Statistics for AI Synthesised Speech Recognition

76 - Arun Kumar Singh n Indian Institute of Technology Jammu 2021

The recent developments in technology have re-warded us with amazing audio synthesis models like TACOTRON and WAVENETS. On the other side, it poses greater threats such as speech clones and deep fakes, that may go undetected. To tackle these alarming situations, there is an urgent need to propose models that can help discriminate a synthesized speech from an actual human speech and also identify the source of such a synthesis. Here, we propose a model based on Convolutional Neural Network (CNN) and Bidirectional Recurrent Neural Network (BiRNN) that helps to achieve both the aforementioned objectives. The temporal dependencies present in AI synthesized speech are exploited using Bidirectional RNN and CNN. The model outperforms the state-of-the-art approaches by classifying the AI synthesized audio from real human speech with an error rate of 1.9% and detecting the underlying architecture with an accuracy of 97%.

التعلم الآلي الوسائط المتعددة أنظمة الصوت في الحاسوب

QUCoughScope: An Artificially Intelligent Mobile Application to Detect Asymptomatic COVID-19 Patients using Cough and Breathing Sounds

80 - Muhammad E. H. Chowdhury , Nabil Ibtehaz , Tawsifur Rahman 2021

In the break of COVID-19 pandemic, mass testing has become essential to reduce the spread of the virus. Several recent studies suggest that a significant number of COVID-19 patients display no physical symptoms whatsoever. Therefore, it is unlikely t hat these patients will undergo COVID-19 test, which increases their chances of unintentionally spreading the virus. Currently, the primary diagnostic tool to detect COVID-19 is RT-PCR test on collected respiratory specimens from the suspected case. This requires patients to travel to a laboratory facility to be tested, thereby potentially infecting others along the way.It is evident from recent researches that asymptomatic COVID-19 patients cough and breath in a different way than the healthy people. Several research groups have created mobile and web-platform for crowdsourcing the symptoms, cough and breathing sounds from healthy, COVID-19 and Non-COVID patients. Some of these data repositories were made public. We have received such a repository from Cambridge University team under data-sharing agreement, where we have cough and breathing sound samples for 582 and 141 healthy and COVID-19 patients, respectively. 87 COVID-19 patients were asymptomatic, while rest of them have cough. We have developed an Android application to automatically screen COVID-19 from the comfort of people homes. Test subjects can simply download a mobile application, enter their symptoms, record an audio clip of their cough and breath, and upload the data anonymously to our servers. Our backend server converts the audio clip to spectrogram and then apply our state-of-the-art machine learning model to classify between cough sounds produced by COVID-19 patients, as opposed to healthy subjects or those with other respiratory conditions. The system can detect asymptomatic COVID-19 patients with a sensitivity more than 91%.

معالجة الصوت والكلام تفاعل الإنسان والحاسوب أنظمة الصوت في الحاسوب