
Detection of COVID-19 through the analysis of vocal fold oscillations

Published by: Mahmoud Al Ismail
Publication date: 2020
Research language: English





Phonation, or the vibration of the vocal folds, is the primary source of vocalization in the production of voiced sounds by humans. It is a complex bio-mechanical process that is highly sensitive to changes in the speaker's respiratory parameters. Since most symptomatic cases of COVID-19 present with moderate to severe impairment of respiratory functions, we hypothesize that signatures of COVID-19 may be observable by examining the vibrations of the vocal folds. Our goal is to validate this hypothesis, and to quantitatively characterize the changes observed to enable the detection of COVID-19 from voice. For this, we use a dynamical system model for the oscillation of the vocal folds, and solve it using our recently developed ADLES algorithm to yield vocal fold oscillation patterns directly from recorded speech. Experimental results on a clinically curated dataset of COVID-19 positive and negative subjects reveal characteristic patterns of vocal fold oscillations that are correlated with COVID-19. We show that these are prominent and discriminative enough that even simple classifiers such as logistic regression yield high detection accuracies using just the recordings of isolated extended vowels.
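To make the modeling step concrete, the sketch below simulates a standard asymmetric coupled-oscillator vocal fold model (coupling alpha, damping beta, left-right asymmetry delta) and feeds simple oscillation descriptors to a logistic regression. It is a minimal illustration only: scipy's generic ODE solver stands in for the ADLES algorithm, which estimates such parameters directly from recorded speech, and all parameter values and labels are placeholders.

```python
import numpy as np
from scipy.integrate import solve_ivp
from sklearn.linear_model import LogisticRegression

def vocal_fold_odes(t, y, alpha, beta, delta):
    """Right/left fold displacements (xr, xl) and velocities (vr, vl)."""
    xr, vr, xl, vl = y
    coupling = alpha * (vr + vl)              # glottal pressure coupling
    ar = coupling - beta * (1 + xr**2) * vr - (1 - delta / 2) * xr
    al = coupling - beta * (1 + xl**2) * vl - (1 + delta / 2) * xl
    return [vr, ar, vl, al]

def oscillation_features(alpha, beta, delta, t_max=50.0, n=5000):
    """Simulate both folds and summarize their oscillation pattern."""
    t = np.linspace(0.0, t_max, n)
    sol = solve_ivp(vocal_fold_odes, (0.0, t_max), [0.1, 0.0, 0.1, 0.0],
                    t_eval=t, args=(alpha, beta, delta))
    xr, xl = sol.y[0], sol.y[2]
    # Crude descriptors of amplitude and left-right synchrony.
    return [np.std(xr), np.std(xl), np.corrcoef(xr, xl)[0, 1]]

# Hypothetical usage: in the paper, (alpha, beta, delta) per speaker come
# from fitting the model to recorded vowels; here both the parameter grid
# and the COVID-19 labels are synthetic placeholders for illustration.
X = [oscillation_features(0.5, 0.32, d) for d in np.linspace(0.0, 1.2, 40)]
y = [0] * 20 + [1] * 20
clf = LogisticRegression().fit(X, y)
```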




Read also

Background: The inability to test at scale has become humanity's Achilles heel in the ongoing war against the COVID-19 pandemic. A scalable screening tool would be a game changer. Building on prior work on cough-based diagnosis of respiratory diseases, we propose, develop and test an Artificial Intelligence (AI)-powered screening solution for COVID-19 infection that is deployable via a smartphone app. The app, named AI4COVID-19, records and sends three 3-second cough sounds to an AI engine running in the cloud, and returns a result within two minutes. Methods: Cough is a symptom of over thirty non-COVID-19 related medical conditions. This makes the diagnosis of a COVID-19 infection by cough alone an extremely challenging multidisciplinary problem. We address this problem by investigating the distinctness of pathomorphological alterations in the respiratory system induced by COVID-19 infection when compared to other respiratory infections. To overcome the shortage of COVID-19 cough training data, we exploit transfer learning. To reduce the misdiagnosis risk stemming from the complex dimensionality of the problem, we leverage a multi-pronged, mediator-centered, risk-averse AI architecture. Results: Results show AI4COVID-19 can distinguish COVID-19 coughs from several types of non-COVID-19 coughs. The accuracy is promising enough to encourage a large-scale collection of labeled cough data to gauge the generalization capability of AI4COVID-19. AI4COVID-19 is not a clinical-grade testing tool. Instead, it offers a screening tool deployable anytime, anywhere, by anyone. It can also serve as a clinical decision assistance tool used to channel clinical testing and treatment to those who need it most, thereby saving more lives.
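The abstract does not disclose the internals of AI4COVID-19's risk-averse architecture, so the sketch below shows only the generic transfer-learning recipe it alludes to: an ImageNet-pretrained CNN with frozen features and a new two-class head, fine-tuned on mel-spectrograms of the 3-second cough recordings. All names and hyperparameters here are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

model = resnet18(weights="IMAGENET1K_V1")        # pretrained backbone
for p in model.parameters():                     # freeze transferred features
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 2)    # new COVID / non-COVID head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def train_step(spectrograms, labels):
    """spectrograms: (batch, 3, H, W) mel-spectrograms of 3-second coughs."""
    optimizer.zero_grad()
    loss = loss_fn(model(spectrograms), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```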
In the pathogenesis of COVID-19, impairment of respiratory functions is often one of the key symptoms. Studies show that in these cases, voice production is also adversely affected -- vocal fold oscillations are asynchronous, asymmetrical and more restricted during phonation. This paper proposes a method that analyzes the differential dynamics of the glottal flow waveform (GFW) during voice production to identify the features in it that are most significant for the detection of COVID-19 from voice. Since the GFW is hard to measure directly in COVID-19 patients, we infer it from recorded speech signals and compare it to the GFW computed from a physical model of phonation. For normal voices, the difference between the two should be minimal, since physical models are constructed to explain phonation under assumptions of normalcy. Greater differences implicate anomalies in the bio-physical factors that contribute to the correctness of the physical model, revealing their significance indirectly. Our proposed method uses a CNN-based 2-step attention model that locates anomalies in time-feature space in the difference of the two GFWs, allowing us to infer their potential as discriminative features for classification. The viability of this method is demonstrated using a clinically curated dataset of COVID-19 positive and negative subjects.
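A rough sketch of the core idea, under stated assumptions: given a GFW inferred from speech and one produced by a physical model, a small CNN classifies their difference while attention weights indicate where in time-feature space the two disagree. The layer sizes and the single attention stage below are placeholders, not the paper's 2-step attention model.

```python
import torch
import torch.nn as nn

class GFWResidualNet(nn.Module):
    def __init__(self, n_feats=40):
        super().__init__()
        self.conv = nn.Conv1d(n_feats, 32, kernel_size=5, padding=2)
        self.attn = nn.Linear(32, 1)          # attention over time steps
        self.head = nn.Linear(32, 2)          # COVID-positive vs negative

    def forward(self, gfw_estimated, gfw_model):
        # Inputs: (batch, n_feats, time). Classify the residual between the
        # inferred GFW and the physical-model GFW.
        diff = gfw_estimated - gfw_model
        h = torch.relu(self.conv(diff)).transpose(1, 2)   # (batch, time, 32)
        w = torch.softmax(self.attn(h), dim=1)            # where they disagree
        pooled = (w * h).sum(dim=1)                       # attention pooling
        return self.head(pooled), w.squeeze(-1)           # logits + weights

net = GFWResidualNet()
logits, attention = net(torch.randn(8, 40, 100), torch.randn(8, 40, 100))
```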
Jing Han, Kun Qian, Meishu Song (2020)
The COVID-19 outbreak was announced as a global pandemic by the World Health Organisation in March 2020 and has affected a growing number of people in the past few weeks. In this context, advanced artificial intelligence techniques are brought to the fore in responding to fight against and reduce the impact of this global health crisis. In this study, we focus on developing some potential use-cases of intelligent speech analysis for COVID-19 diagnosed patients. In particular, by analysing speech recordings from these patients, we construct audio-only-based models to automatically categorise the health state of patients from four aspects: the severity of illness, sleep quality, fatigue, and anxiety. For this purpose, two established acoustic feature sets and support vector machines are utilised. Our experiments show that an average accuracy of .69 is obtained in estimating the severity of illness, which is derived from the number of days in hospitalisation. We hope that this study can foster an extremely fast, low-cost, and convenient way to automatically detect the COVID-19 disease.
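A minimal sketch of this recipe, assuming the opensmile Python package for one of the established acoustic feature sets (eGeMAPS functionals here) and a linear support vector machine; the study's exact feature sets, labels, and evaluation protocol are not reproduced.

```python
import opensmile
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.eGeMAPSv02,
    feature_level=opensmile.FeatureLevel.Functionals,
)

def featurize(wav_paths):
    # One fixed-length vector of acoustic functionals per recording.
    return [smile.process_file(p).values.ravel() for p in wav_paths]

# Hypothetical usage: labels would encode e.g. illness severity derived
# from days of hospitalisation, as in the study.
clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
# clf.fit(featurize(train_wavs), train_labels)
```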
We propose a flexible framework that deals with both singer conversion and singers' vocal technique conversion. The proposed model is trained on non-parallel corpora, accommodates many-to-many conversion, and leverages recent advances in variational autoencoders. It employs separate encoders to learn disentangled latent representations of singer identity and vocal technique, with a joint decoder for reconstruction. Conversion is carried out by simple vector arithmetic in the learned latent spaces. Both a quantitative analysis and a visualization of the converted spectrograms show that our model is able to disentangle singer identity and vocal technique and successfully perform conversion of these attributes. To the best of our knowledge, this is the first work to jointly tackle conversion of singer identity and vocal technique based on a deep learning approach.
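An illustrative sketch of conversion by latent vector arithmetic, following the abstract's description of separate encoders and a joint decoder; the linear encoder/decoder internals are placeholders for whatever architecture the paper actually uses.

```python
import torch
import torch.nn as nn

class DisentangledConverter(nn.Module):
    def __init__(self, spec_dim=80, z_dim=16):
        super().__init__()
        self.enc_singer = nn.Linear(spec_dim, z_dim)     # singer identity
        self.enc_technique = nn.Linear(spec_dim, z_dim)  # vocal technique
        self.decoder = nn.Linear(2 * z_dim, spec_dim)    # joint decoder

    def convert(self, spec, src_ref, tgt_ref):
        """Shift the singer latent from a source to a target singer."""
        z_singer = self.enc_singer(spec)
        z_technique = self.enc_technique(spec)
        # Vector arithmetic in latent space: remove source, add target.
        z_singer = z_singer + self.enc_singer(tgt_ref) - self.enc_singer(src_ref)
        return self.decoder(torch.cat([z_singer, z_technique], dim=-1))

model = DisentangledConverter()
converted = model.convert(torch.randn(1, 80),   # spectrogram frame to convert
                          torch.randn(1, 80),   # source-singer reference
                          torch.randn(1, 80))   # target-singer reference
```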
Disease-modifying treatments are currently being assessed in neurodegenerative diseases. Huntington's Disease represents a unique opportunity to design automatic sub-clinical markers, even in premanifest gene carriers. We investigated phonatory impairments as potential clinical markers and propose them for both diagnosis and the follow-up of gene carriers. We used two sets of features: phonatory features and Modulation Power Spectrum features. We found that phonation alone is not sufficient for the identification of sub-clinical disorders in premanifest gene carriers. According to our regression results, phonatory features are suitable for predicting clinical performance in Huntington's Disease.
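As a pointer to what such phonatory features look like in practice, here is a hedged sketch using the parselmouth bindings to Praat to extract jitter, shimmer, and harmonics-to-noise ratio; the study's exact feature set and its Modulation Power Spectrum features are not reproduced.

```python
import parselmouth
from parselmouth.praat import call

def phonatory_features(wav_path):
    """Standard phonatory measures from a sustained-vowel recording."""
    snd = parselmouth.Sound(wav_path)
    # Glottal pulse sequence for a 75-500 Hz pitch range.
    point_process = call(snd, "To PointProcess (periodic, cc)", 75, 500)
    jitter = call(point_process, "Get jitter (local)",
                  0, 0, 0.0001, 0.02, 1.3)
    shimmer = call([snd, point_process], "Get shimmer (local)",
                   0, 0, 0.0001, 0.02, 1.3, 1.6)
    hnr = call(snd.to_harmonicity_cc(), "Get mean", 0, 0)
    return [jitter, shimmer, hnr]
```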

