
Sounds of COVID-19: exploring realistic performance of audio-based digital testing

Posted by Tong Xia
Publication date: 2021
Research field: Informatics Engineering
Research language: English





Researchers have been battling with the question of how we can identify Coronavirus disease (COVID-19) cases efficiently, affordably and at scale. Recent work has shown how audio-based approaches, which collect respiratory audio data (cough, breathing and voice), can be used for testing; however, there has been little exploration of how biases and methodological decisions impact these tools' performance in practice. In this paper, we explore the realistic performance of audio-based digital testing of COVID-19. To investigate this, we collected a large crowdsourced respiratory audio dataset through a mobile app, alongside recent COVID-19 test results and symptoms intended as ground truth. Within the collected dataset, we selected 5,240 samples from 2,478 participants and split them into different participant-independent sets for model development and validation. Among these, we controlled for potential confounding factors (such as demographics and language). The unbiased model takes features extracted from breathing, coughs, and voice signals as predictors and yields an AUC-ROC of 0.71 (95% CI: 0.65-0.77). We further explore different unbalanced distributions to show how biases and participant splits affect performance. Finally, we discuss how the realistic model presented could be integrated in clinical practice to realize continuous, ubiquitous, sustainable and affordable testing at population scale.
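Two methodological points from this abstract, the participant-independent split and the bootstrapped confidence interval around the AUC-ROC, can be illustrated with a minimal Python sketch. The data, classifier, and split ratio below are placeholders, not the authors' actual pipeline.

```python
# Hypothetical sketch: participant-independent splitting and AUC-ROC with a
# bootstrap 95% CI. Feature extraction from breathing/cough/voice is replaced
# by random placeholder arrays purely for illustration.
import numpy as np
from sklearn.model_selection import GroupShuffleSplit
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Placeholder data: one feature vector per audio sample, a label, and the
# participant who produced the sample (participants contribute several samples).
X = rng.normal(size=(5240, 64))
y = rng.integers(0, 2, size=5240)
participants = rng.integers(0, 2478, size=5240)

# Participant-independent split: no participant appears in both sets.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.3, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=participants))

model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
scores = model.predict_proba(X[test_idx])[:, 1]

# Bootstrap the test set to get a 95% CI around the AUC-ROC.
aucs = []
for _ in range(1000):
    idx = rng.integers(0, len(test_idx), size=len(test_idx))
    if len(np.unique(y[test_idx][idx])) < 2:
        continue  # a resample must contain both classes
    aucs.append(roc_auc_score(y[test_idx][idx], scores[idx]))

print(f"AUC-ROC {roc_auc_score(y[test_idx], scores):.2f} "
      f"(95% CI: {np.percentile(aucs, 2.5):.2f}-{np.percentile(aucs, 97.5):.2f})")
```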




Read also

Testing capacity for COVID-19 remains a challenge globally due to the lack of adequate supplies, trained personnel, and sample-processing equipment. These problems are even more acute in rural and underdeveloped regions. We demonstrate that solicited cough sounds collected over a phone, when analysed by our AI model, have a statistically significant signal indicative of COVID-19 status (AUC 0.72, t-test, p < 0.01, 95% CI 0.61-0.83). This holds true for asymptomatic patients as well. Towards this, we collect the largest known (to date) dataset of microbiologically confirmed COVID-19 cough sounds from 3,621 individuals. When used in a triaging step within an overall testing protocol, by enabling risk-stratification of individuals before confirmatory tests, our tool can increase the testing capacity of a healthcare system by 43% at a disease prevalence of 5%, without additional supplies, trained personnel, or physical infrastructure.
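The capacity argument can be made concrete with a back-of-the-envelope sketch. The triage sensitivity and specificity below are illustrative assumptions, and the simplified protocol (only flagged individuals receive a confirmatory test) will not reproduce the 43% figure reported above, which depends on the tool's actual operating point and the protocol modelled in the paper.

```python
# Hedged illustration of how a triage step stretches a fixed confirmatory-test
# budget. All numbers are assumptions, not the paper's operating point.
prevalence = 0.05     # fraction of screened people who are infected
sensitivity = 0.90    # assumed triage sensitivity
specificity = 0.70    # assumed triage specificity

# Fraction of screened individuals that the triage model flags for a lab test.
flagged = prevalence * sensitivity + (1 - prevalence) * (1 - specificity)

# Without triage, one lab test covers one screened person. With triage, one
# lab test covers 1 / flagged screened people under this simplified protocol.
capacity_gain = 1 / flagged - 1
print(f"Flagged fraction: {flagged:.1%}, relative capacity gain: {capacity_gain:.0%}")
```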
Audio classification using breath and cough samples has recently emerged as a low-cost, non-invasive, and accessible COVID-19 screening method. However, no application has been approved for official use at the time of writing due to the stringent reliability and accuracy requirements of the critical healthcare setting. To support the development of the Machine Learning classification models, we performed an extensive comparative investigation and ranking of 15 audio features, including less well-known ones. The results were verified on two independent COVID-19 sound datasets. By using the identified top-performing features, we have increased the COVID-19 classification accuracy by up to 17% on the Cambridge dataset, and up to 10% on the Coswara dataset, compared to the original baseline accuracy without our feature ranking.
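A minimal sketch of the feature-ranking idea follows. It assumes a generic feature set (a few common librosa descriptors) and mutual information as the ranking criterion; these are stand-ins, not the 15 features or the method used in the paper.

```python
# Hedged sketch: summarise each recording with a few audio descriptors, then
# rank the descriptors by mutual information with the COVID-19 label.
import numpy as np
import librosa
from sklearn.feature_selection import mutual_info_classif

def summarise(recording: np.ndarray, sr: int) -> dict:
    """Collapse frame-level audio features into per-recording means."""
    return {
        "mfcc_mean": librosa.feature.mfcc(y=recording, sr=sr, n_mfcc=13).mean(),
        "zero_crossing_rate": librosa.feature.zero_crossing_rate(recording).mean(),
        "spectral_centroid": librosa.feature.spectral_centroid(y=recording, sr=sr).mean(),
        "rms_energy": librosa.feature.rms(y=recording).mean(),
    }

def rank_features(recordings, labels, sr=16000):
    rows = [summarise(r, sr) for r in recordings]
    names = list(rows[0].keys())
    X = np.array([[row[n] for n in names] for row in rows])
    scores = mutual_info_classif(X, labels, random_state=0)
    return sorted(zip(names, scores), key=lambda p: -p[1])

# Usage with synthetic stand-in audio (1 s of noise per "recording").
rng = np.random.default_rng(0)
recordings = [rng.normal(size=16000).astype(np.float32) for _ in range(40)]
labels = rng.integers(0, 2, size=40)
for name, score in rank_features(recordings, labels):
    print(f"{name}: {score:.3f}")
```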
Audio signals generated by the human body (e.g., sighs, breathing, heart, digestion, vibration sounds) have routinely been used by clinicians as indicators to diagnose disease or assess disease progression. Until recently, such signals were usually collected through manual auscultation at scheduled visits. Research has now started to use digital technology to gather bodily sounds (e.g., from digital stethoscopes) for cardiovascular or respiratory examination, which could then be used for automatic analysis. Some initial work shows promise in detecting diagnostic signals of COVID-19 from voice and coughs. In this paper we describe our data analysis over a large-scale crowdsourced dataset of respiratory sounds collected to aid diagnosis of COVID-19. We use coughs and breathing to understand how discernible COVID-19 sounds are from those in asthma or healthy controls. Our results show that even a simple binary machine learning classifier is able to correctly classify healthy and COVID-19 sounds. We also show how we distinguish a user who tested positive for COVID-19 and has a cough from a healthy user with a cough, and users who tested positive for COVID-19 and have a cough from users with asthma and a cough. Our models achieve an AUC above 80% across all tasks. These results are preliminary and only scratch the surface of the potential of this type of data and audio-based machine learning. This work opens the door to further investigation of how automatically analysed respiratory patterns could be used as pre-screening signals to aid COVID-19 diagnosis.
Recently, sound-based COVID-19 detection studies have shown great promise to achieve scalable and prompt digital pre-screening. However, there are still two unsolved issues hindering the practice. First, collected datasets for model training are often imbalanced, with a considerably smaller proportion of users tested positive, making it harder to learn representative and robust features. Second, deep learning models are generally overconfident in their predictions. Clinically, false predictions aggravate healthcare costs. Estimation of the uncertainty of screening would aid this. To handle these issues, we propose an ensemble framework where multiple deep learning models for sound-based COVID-19 detection are developed from different but balanced subsets of the original data. As such, data are utilized more effectively compared to traditional up-sampling and down-sampling approaches: an AUC of 0.74 with a sensitivity of 0.68 and a specificity of 0.69 is achieved. Simultaneously, we estimate uncertainty from the disagreement across multiple models. It is shown that false predictions often yield higher uncertainty, enabling us to suggest that users with uncertainty higher than a threshold repeat the audio test on their phones, or take clinical tests if the digital diagnosis still fails. This study paves the way for a more robust sound-based COVID-19 automated screening system.
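A minimal sketch of the balanced-ensemble idea, assuming a generic base model: each member is trained on all positives plus a different random sample of negatives, and disagreement across members serves as the uncertainty estimate. This is a simplified illustration, not the paper's deep learning framework.

```python
# Hedged sketch: balanced-subset ensemble with disagreement-based uncertainty.
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_balanced_ensemble(X, y, n_models=5, seed=0):
    rng = np.random.default_rng(seed)
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    models = []
    for _ in range(n_models):
        # Balanced subset: all positives plus an equally sized negative sample.
        neg_sample = rng.choice(neg, size=len(pos), replace=False)
        idx = np.concatenate([pos, neg_sample])
        models.append(LogisticRegression(max_iter=1000).fit(X[idx], y[idx]))
    return models

def predict_with_uncertainty(models, X):
    probs = np.stack([m.predict_proba(X)[:, 1] for m in models])
    return probs.mean(axis=0), probs.std(axis=0)  # prediction, disagreement

# Usage on synthetic imbalanced data (roughly 10% positives).
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 16))
y = (rng.random(1000) < 0.1).astype(int)
models = train_balanced_ensemble(X, y)
mean_prob, uncertainty = predict_with_uncertainty(models, X[:5])
print(mean_prob.round(2), uncertainty.round(3))
```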
Rapidly scaling screening, testing and quarantine has been shown to be an effective strategy to combat the COVID-19 pandemic. We consider the application of deep learning techniques to distinguish individuals with COVID from non-COVID using data acquirable from a phone. Using cough and context (symptoms and meta-data) represents such a promising approach. Several independent works in this direction have shown promising results. However, none of them report performance across clinically relevant data splits, specifically where the development and test sets are split in time (retrospective validation) and across sites (broad validation). Although there is meaningful generalization across these splits, the performance varies significantly (by up to 0.1 AUC). In addition, we study the performance of symptomatic and asymptomatic individuals across these three splits. Finally, we show that our model focuses on meaningful features of the input: cough bouts for cough and relevant symptoms for context. The code and checkpoints are available at https://github.com/WadhwaniAI/cough-against-covid
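The two clinically relevant splits mentioned above can be sketched as follows; the column names, cutoff date, and held-out sites are hypothetical, not taken from the released code.

```python
# Hedged sketch: retrospective (time-based) and broad (site-based) validation splits.
import pandas as pd

def temporal_split(df: pd.DataFrame, cutoff: str):
    """Train on data collected before the cutoff date, test on data after it."""
    train = df[df["collection_date"] < cutoff]
    test = df[df["collection_date"] >= cutoff]
    return train, test

def cross_site_split(df: pd.DataFrame, held_out_sites):
    """Hold out entire collection sites for testing."""
    mask = df["site"].isin(held_out_sites)
    return df[~mask], df[mask]

# Usage on a toy table of cough recordings.
df = pd.DataFrame({
    "collection_date": pd.to_datetime(
        ["2020-06-01", "2020-07-15", "2020-09-10", "2020-10-02"]),
    "site": ["A", "A", "B", "C"],
    "label": [0, 1, 0, 1],
})
retro_train, retro_test = temporal_split(df, "2020-09-01")
broad_train, broad_test = cross_site_split(df, held_out_sites={"C"})
print(len(retro_train), len(retro_test), len(broad_train), len(broad_test))
```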
