ترغب بنشر مسار تعليمي؟ اضغط هنا

Medical SANSformers: Training self-supervised transformers without attention for Electronic Medical Records

132   0   0.0 ( 0 )
 نشر من قبل Yogesh Kumar
 تاريخ النشر 2021
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

We leverage deep sequential models to tackle the problem of predicting healthcare utilization for patients, which could help governments to better allocate resources for future healthcare use. Specifically, we study the problem of textit{divergent subgroups}, wherein the outcome distribution in a smaller subset of the population considerably deviates from that of the general population. The traditional approach for building specialized models for divergent subgroups could be problematic if the size of the subgroup is very small (for example, rare diseases). To address this challenge, we first develop a novel attention-free sequential model, SANSformers, instilled with inductive biases suited for modeling clinical codes in electronic medical records. We then design a task-specific self-supervision objective and demonstrate its effectiveness, particularly in scarce data settings, by pre-training each model on the entire health registry (with close to one million patients) before fine-tuning for downstream tasks on the divergent subgroups. We compare the novel SANSformer architecture with the LSTM and Transformer models using two data sources and a multi-task learning objective that aids healthcare utilization prediction. Empirically, the attention-free SANSformer models perform consistently well across experiments, outperforming the baselines in most cases by at least $sim 10$%. Furthermore, the self-supervised pre-training boosts performance significantly throughout, for example by over $sim 50$% (and as high as $800$%) on $R^2$ score when predicting the number of hospital visits.



قيم البحث

اقرأ أيضاً

60 - You Jin Kim 2017
Predicting highrisk vascular diseases is a significant issue in the medical domain. Most predicting methods predict the prognosis of patients from pathological and radiological measurements, which are expensive and require much time to be analyzed. H ere we propose deep attention models that predict the onset of the high risky vascular disease from symbolic medical histories sequence of hypertension patients such as ICD-10 and pharmacy codes only, Medical History-based Prediction using Attention Network (MeHPAN). We demonstrate two types of attention models based on 1) bidirectional gated recurrent unit (R-MeHPAN) and 2) 1D convolutional multilayer model (C-MeHPAN). Two MeHPAN models are evaluated on approximately 50,000 hypertension patients with respect to precision, recall, f1-measure and area under the curve (AUC). Experimental results show that our MeHPAN methods outperform standard classification models. Comparing two MeHPANs, R-MeHPAN provides more better discriminative capability with respect to all metrics while C-MeHPAN presents much shorter training time with competitive accuracy.
This paper reports our preliminary work on medical incident prediction in general, and fall risk prediction in specific, using machine learning. Data for the machine learning are generated only from the particular subset of the electronic medical rec ords (EMR) at Osaka Medical and Pharmaceutical University Hospital. As a result of conducting three experiments such as (1) machine learning algorithm comparison, (2) handling imbalance, and (3) investigation of explanatory variable contribution to the fall incident prediction, we find the investigation of explanatory variables the most effective.
Abbreviation disambiguation is important for automated clinical note processing due to the frequent use of abbreviations in clinical settings. Current models for automated abbreviation disambiguation are restricted by the scarcity and imbalance of la beled training data, decreasing their generalizability to orthogonal sources. In this work we propose a novel data augmentation technique that utilizes information from related medical concepts, which improves our models ability to generalize. Furthermore, we show that incorporating the global context information within the whole medical note (in addition to the traditional local context window), can significantly improve the models representation for abbreviations. We train our model on a public dataset (MIMIC III) and test its performance on datasets from different sources (CASI, i2b2). Together, these two techniques boost the accuracy of abbreviation disambiguation by almost 14% on the CASI dataset and 4% on i2b2.
In longitudinal electronic health records (EHRs), the event records of a patient are distributed over a long period of time and the temporal relations between the events reflect sufficient domain knowledge to benefit prediction tasks such as the rate of inpatient mortality. Medical concept embedding as a feature extraction method that transforms a set of medical concepts with a specific time stamp into a vector, which will be fed into a supervised learning algorithm. The quality of the embedding significantly determines the learning performance over the medical data. In this paper, we propose a medical concept embedding method based on applying a self-attention mechanism to represent each medical concept. We propose a novel attention mechanism which captures the contextual information and temporal relationships between medical concepts. A light-weight neural net, Temporal Self-Attention Network (TeSAN), is then proposed to learn medical concept embedding based solely on the proposed attention mechanism. To test the effectiveness of our proposed methods, we have conducted clustering and prediction tasks on two public EHRs datasets comparing TeSAN against five state-of-the-art embedding methods. The experimental results demonstrate that the proposed TeSAN model is superior to all the compared methods. To the best of our knowledge, this work is the first to exploit temporal self-attentive relations between medical events.
Few-shot semantic segmentation (FSS) has great potential for medical imaging applications. Most of the existing FSS techniques require abundant annotated semantic classes for training. However, these methods may not be applicable for medical images d ue to the lack of annotations. To address this problem we make several contributions: (1) A novel self-supervised FSS framework for medical images in order to eliminate the requirement for annotations during training. Additionally, superpixel-based pseudo-labels are generated to provide supervision; (2) An adaptive local prototype pooling module plugged into prototypical networks, to solve the common challenging foreground-background imbalance problem in medical image segmentation; (3) We demonstrate the general applicability of the proposed approach for medical images using three different tasks: abdominal organ segmentation for CT and MRI, as well as cardiac segmentation for MRI. Our results show that, for medical image segmentation, the proposed method outperforms conventional FSS methods which require manual annotations for training.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا