No Arabic abstract
Thoracic diseases are very serious health problems that plague a large number of people. Chest X-ray is currently one of the most popular methods to diagnose thoracic diseases, playing an important role in the healthcare workflow. However, reading the chest X-ray images and giving an accurate diagnosis remain challenging tasks for expert radiologists. With the success of deep learning in computer vision, a growing number of deep neural network architectures were applied to chest X-ray image classification. However, most of the previous deep neural network classifiers were based on deterministic architectures which are usually very noise-sensitive and are likely to aggravate the overfitting issue. In this paper, to make a deep architecture more robust to noise and to reduce overfitting, we propose using deep generative classifiers to automatically diagnose thorax diseases from the chest X-ray images. Unlike the traditional deterministic classifier, a deep generative classifier has a distribution middle layer in the deep neural network. A sampling layer then draws a random sample from the distribution layer and input it to the following layer for classification. The classifier is generative because the class label is generated from samples of a related distribution. Through training the model with a certain amount of randomness, the deep generative classifiers are expected to be robust to noise and can reduce overfitting and then achieve good performances. We implemented our deep generative classifiers based on a number of well-known deterministic neural network architectures, and tested our models on the chest X-ray14 dataset. The results demonstrated the superiority of deep generative classifiers compared with the corresponding deep deterministic classifiers.
Chest X-rays (CXRs) are among the most commonly used medical image modalities. They are mostly used for screening, and an indication of disease typically results in subsequent tests. As this is mostly a screening test used to rule out chest abnormalities, the requesting clinicians are often interested in whether a CXR is normal or not. A machine learning algorithm that can accurately screen out even a small proportion of the real normal exams out of all requested CXRs would be highly beneficial in reducing the workload for radiologists. In this work, we report a deep neural network trained for classifying CXRs with the goal of identifying a large number of normal (disease-free) images without risking the discharge of sick patients. We use an ImageNet-pretrained Inception-ResNet-v2 model to provide the image features, which are further used to train a model on CXRs labelled by expert radiologists. The probability threshold for classification is optimized for 100% precision for the normal class, ensuring no sick patients are released. At this threshold we report an average recall of 50%. This means that the proposed solution has the potential to cut in half the number of disease-free CXRs examined by radiologists, without risking the discharge of sick patients.
Vision-and-language(V&L) models take image and text as input and learn to capture the associations between them. Prior studies show that pre-trained V&L models can significantly improve the model performance for downstream tasks such as Visual Question Answering (VQA). However, V&L models are less effective when applied in the medical domain (e.g., on X-ray images and clinical notes) due to the domain gap. In this paper, we investigate the challenges of applying pre-trained V&L models in medical applications. In particular, we identify that the visual representation in general V&L models is not suitable for processing medical data. To overcome this limitation, we propose BERTHop, a transformer-based model based on PixelHop++ and VisualBERT, for better capturing the associations between the two modalities. Experiments on the OpenI dataset, a commonly used thoracic disease diagnosis benchmark, show that BERTHop achieves an average Area Under the Curve (AUC) of 98.12% which is 1.62% higher than state-of-the-art (SOTA) while it is trained on a 9 times smaller dataset.
We systematically evaluate the performance of deep learning models in the presence of diseases not labeled for or present during training. First, we evaluate whether deep learning models trained on a subset of diseases (seen diseases) can detect the presence of any one of a larger set of diseases. We find that models tend to falsely classify diseases outside of the subset (unseen diseases) as no disease. Second, we evaluate whether models trained on seen diseases can detect seen diseases when co-occurring with diseases outside the subset (unseen diseases). We find that models are still able to detect seen diseases even when co-occurring with unseen diseases. Third, we evaluate whether feature representations learned by models may be used to detect the presence of unseen diseases given a small labeled set of unseen diseases. We find that the penultimate layer of the deep neural network provides useful features for unseen disease detection. Our results can inform the safe clinical deployment of deep learning models trained on a non-exhaustive set of disease classes.
Deep learning approaches have demonstrated remarkable progress in automatic Chest X-ray analysis. The data-driven feature of deep models requires training data to cover a large distribution. Therefore, it is substantial to integrate knowledge from multiple datasets, especially for medical images. However, learning a disease classification model with extra Chest X-ray (CXR) data is yet challenging. Recent researches have demonstrated that performance bottleneck exists in joint training on different CXR datasets, and few made efforts to address the obstacle. In this paper, we argue that incorporating an external CXR dataset leads to imperfect training data, which raises the challenges. Specifically, the imperfect data is in two folds: domain discrepancy, as the image appearances vary across datasets; and label discrepancy, as different datasets are partially labeled. To this end, we formulate the multi-label thoracic disease classification problem as weighted independent binary tasks according to the categories. For common categories shared across domains, we adopt task-specific adversarial training to alleviate the feature differences. For categories existing in a single dataset, we present uncertainty-aware temporal ensembling of model predictions to mine the information from the missing labels further. In this way, our framework simultaneously models and tackles the domain and label discrepancies, enabling superior knowledge mining ability. We conduct extensive experiments on three datasets with more than 360,000 Chest X-ray images. Our method outperforms other competing models and sets state-of-the-art performance on the official NIH test set with 0.8349 AUC, demonstrating its effectiveness of utilizing the external dataset to improve the internal classification.
With a Coronavirus disease (COVID-19) case count exceeding 10 million worldwide, there is an increased need for a diagnostic capability. The main variables in increasing diagnostic capability are reduced cost, turnaround or diagnosis time, and upfront equipment cost and accessibility. Two candidates for machine learning COVID-19 diagnosis are Computed Tomography (CT) scans and plain chest X-rays. While CT scans score higher in sensitivity, they have a higher cost, maintenance requirement, and turnaround time as compared to plain chest X-rays. The use of portable chest X-radiograph (CXR) is recommended by the American College of Radiology (ACR) since using CT places a massive burden on radiology services. Therefore, X-ray imagery paired with machine learning techniques is proposed a first-line triage tool for COVID-19 diagnostics. In this paper we propose a computer-aided diagnosis (CAD) to accurately classify chest X-ray scans of COVID-19 and normal subjects by fine-tuning several neural networks (ResNet18, ResNet50, DenseNet201) pre-trained on the ImageNet dataset. These neural networks are fused in a parallel architecture and the voting criteria are applied in the final classification decision between the candidate object classes where the output of each neural network is representing a single vote. Several experiments are conducted on the weakly labeled COVID-19-CT-CXR dataset consisting of 263 COVID-19 CXR images extracted from PubMed Central Open Access subsets combined with 25 normal classification CXR images. These experiments show an optimistic result and a capability of the proposed model to outperforming many state-of-the-art algorithms on several measures. Using k-fold cross-validation and a bagging classifier ensemble, we achieve an accuracy of 99.7% and a sensitivity of 100%.