Automatic Cough Classification for Tuberculosis Screening in a Real-World Environment


Abstract in English

We present first results showing that it is possible to automatically discriminate between the coughing sounds produced by patients with tuberculosis (TB) and those produced by patients with other lung ailments in a real-world noisy environment. Our experiments are based on a dataset of cough recordings obtained in a real-world clinic setting from 16 patients confirmed to be suffering from TB and 33 patients that are suffering from respiratory conditions, confirmed as other than TB. We have trained and evaluated several machine learning classifiers, including logistic regression (LR), support vector machines (SVM), k-nearest neighbour (KNN), multilayer perceptrons (MLP) and convolutional neural networks (CNN) inside a nested k-fold cross-validation and find that, although classification is possible in all cases, the best performance is achieved using the LR classifier. In combination with feature selection by sequential forward search (SFS), our best LR system achieves an area under the ROC curve (AUC) of 0.94 using 23 features selected from a set of 78 high-resolution mel-frequency cepstral coefficients (MFCCs). This system achieves a sensitivity of 93% at a specificity of 95% and thus exceeds the 90% sensitivity at 70% specificity specification considered by the WHO as minimal requirements for community-based TB triage test. We conclude that automatic classification of cough audio sounds is promising as a viable means of low-cost easily-deployable front-line screening for TB, which will greatly benefit developing countries with a heavy TB burden.

Download