Characterizing the Value of Information in Medical Notes

71 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Chao-Chun Hsu

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Chao-Chun Hsu - Shantanu Karnwal - Sendhil Mullainathan

الحساب واللغة الذكاء الاصطناعي أجهزة الكمبيوتر والمجتمع

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Machine learning models depend on the quality of input data. As electronic health records are widely adopted, the amount of data in health care is growing, along with complaints about the quality of medical notes. We use two prediction tasks, readmission prediction and in-hospital mortality prediction, to characterize the value of information in medical notes. We show that as a whole, medical notes only provide additional predictive power over structured information in readmission prediction. We further propose a probing framework to select parts of notes that enable more accurate predictions than using all notes, despite that the selected information leads to a distribution shift from the training data (all notes). Finally, we demonstrate that models trained on the selected valuable information achieve even better predictive performance, with only 6.8% of all the tokens for readmission prediction.

قيم البحث

143 - Samuel Carton , Anirudh Rathore , Chenhao Tan 2020

Two main approaches for evaluating the quality of machine-generated rationales are: 1) using human rationales as a gold standard; and 2) automated metrics based on how rationales affect model behavior. An open question, however, is how human rational es fare with these automatic metrics. Analyzing a variety of datasets and models, we find that human rationales do not necessarily perform well on these metrics. To unpack this finding, we propose improved metrics to account for model-dependent baseline performance. We then propose two methods to further characterize rationale quality, one based on model retraining and one on using fidelity curves to reveal properties such as irrelevance and redundancy. Our work leads to actionable suggestions for evaluating and characterizing rationales.

الحساب واللغة الذكاء الاصطناعي أجهزة الكمبيوتر والمجتمع

Convolutional Neural Networks for Medical Diagnosis from Admission Notes

128 - Christy Li , Dimitris Konomis , Graham Neubig 2017

$textbf{Objective}$ Develop an automatic diagnostic system which only uses textual admission information from Electronic Health Records (EHRs) and assist clinicians with a timely and statistically proved decision tool. The hope is that the tool can b e used to reduce mis-diagnosis. $textbf{Materials and Methods}$ We use the real-world clinical notes from MIMIC-III, a freely available dataset consisting of clinical data of more than forty thousand patients who stayed in intensive care units of the Beth Israel Deaconess Medical Center between 2001 and 2012. We proposed a Convolutional Neural Network model to learn semantic features from unstructured textual input and automatically predict primary discharge diagnosis. $textbf{Results}$ The proposed model achieved an overall 96.11% accuracy and 80.48% weighted F1 score values on 10 most frequent disease classes, significantly outperforming four strong baseline models by at least 12.7% in weighted F1 score. $textbf{Discussion}$ Experimental results imply that the CNN model is suitable for supporting diagnosis decision making in the presence of complex, noisy and unstructured clinical data while at the same time using fewer layers and parameters that other traditional Deep Network models. $textbf{Conclusion}$ Our model demonstrated capability of representing complex medical meaningful features from unstructured clinical notes and prediction power for commonly misdiagnosed frequent diseases. It can use easily adopted in clinical setting to provide timely and statistically proved decision support. $textbf{Keywords}$ Convolutional neural network, text classification, discharge diagnosis prediction, admission information from EHRs.

الحساب واللغة

On the Diversity and Limits of Human Explanations

128 - Chenhao Tan 2021

A growing effort in NLP aims to build datasets of human explanations. However, the term explanation encompasses a broad range of notions, each with different properties and ramifications. Our goal is to provide an overview of diverse types of explana tions and human limitations, and discuss implications for collecting and using explanations in NLP. Inspired by prior work in psychology and cognitive sciences, we group existing human explanations in NLP into three categories: proximal mechanism, evidence, and procedure. These three types differ in nature and have implications for the resultant explanations. For instance, procedure is not considered explanations in psychology and connects with a rich body of work on learning from instructions. The diversity of explanations is further evidenced by proxy questions that are needed for annotators to interpret and answer open-ended why questions. Finally, explanations may require different, often deeper, understandings than predictions, which casts doubt on whether humans can provide useful explanations in some tasks.

الحساب واللغة الذكاء الاصطناعي أجهزة الكمبيوتر والمجتمع

MedDG: A Large-scale Medical Consultation Dataset for Building Medical Dialogue System

126 - Wenge Liu , Jianheng Tang , Jinghui Qin 2020

Developing conversational agents to interact with patients and provide primary clinical advice has attracted increasing attention due to its huge application potential, especially in the time of COVID-19 Pandemic. However, the training of end-to-end neural-based medical dialogue system is restricted by an insufficient quantity of medical dialogue corpus. In this work, we make the first attempt to build and release a large-scale high-quality Medical Dialogue dataset related to 12 types of common Gastrointestinal diseases named MedDG, with more than 17K conversations collected from the online health consultation community. Five different categories of entities, including diseases, symptoms, attributes, tests, and medicines, are annotated in each conversation of MedDG as additional labels. To push forward the future research on building expert-sensitive medical dialogue system, we proposes two kinds of medical dialogue tasks based on MedDG dataset. One is the next entity prediction and the other is the doctor response generation. To acquire a clear comprehension on these two medical dialogue tasks, we implement several state-of-the-art benchmarks, as well as design two dialogue models with a further consideration on the predicted entities. Experimental results show that the pre-train language models and other baselines struggle on both tasks with poor performance in our dataset, and the response quality can be enhanced with the help of auxiliary entity information. From human evaluation, the simple retrieval model outperforms several state-of-the-art generative models, indicating that there still remains a large room for improvement on generating medically meaningful responses.

الحساب واللغة الذكاء الاصطناعي

Release Strategies and the Social Impacts of Language Models

197 - Irene Solaiman , Miles Brundage , Jack Clark 2019

Large language models have a range of beneficial uses: they can assist in prose, poetry, and programming; analyze dataset biases; and more. However, their flexibility and generative capabilities also raise misuse concerns. This report discusses OpenA Is work related to the release of its GPT-2 language model. It discusses staged release, which allows time between model releases to conduct risk and benefit analyses as model sizes increased. It also discusses ongoing partnership-based research and provides recommendations for better coordination and responsible publication in AI.

الحساب واللغة الذكاء الاصطناعي أجهزة الكمبيوتر والمجتمع