Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

On Sharing Models Instead of Data using Mimic learning for Smart Health Applications

72 0 0.0 ( 0 )

Download Cite

Added by Mohamed Baza

Publication date 2019

fields Informatics Engineering Mathematical Statistics

and research's language is English

Authors Mohamed Baza - Andrew Salazar - Mohamed Mahmoud

Machine Learning Machine Learning

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Electronic health records (EHR) systems contain vast amounts of medical information about patients. These data can be used to train machine learning models that can predict health status, as well as to help prevent future diseases or disabilities. However, getting patients medical data to obtain well-trained machine learning models is a challenging task. This is because sharing the patients medical records is prohibited by law in most countries due to patients privacy concerns. In this paper, we tackle this problem by sharing the models instead of the original sensitive data by using the mimic learning approach. The idea is first to train a model on the original sensitive data, called the teacher model. Then, using this model, we can transfer its knowledge to another model, called the student model, without the need to learn the original data used in training the teacher model. The student model is then shared to the public and can be used to make accurate predictions. To assess the mimic learning approach, we have evaluated our scheme using different medical datasets. The results indicate that the student model mimics the teacher model performance in terms of prediction accuracy without the need to access to the patients original data records.

rate research

Split learning for health: Distributed deep learning without sharing raw patient data

144 - Praneeth Vepakomma , Otkrist Gupta , Tristan Swedish 2018

Can health entities collaboratively train deep learning models without sharing sensitive raw data? This paper proposes several configurations of a distributed deep learning method called SplitNN to facilitate such collaborations. SplitNN does not share raw data or model details with collaborating institutions. The proposed configurations of splitNN cater to practical settings of i) entities holding different modalities of patient data, ii) centralized and local health entities collaborating on multiple tasks and iii) learning without sharing labels. We compare performance and resource efficiency trade-offs of splitNN and other distributed deep learning methods like federated learning, large batch synchronous stochastic gradient descent and show highly encouraging results for splitNN.

Machine Learning Machine Learning

MIMIC-Extract: A Data Extraction, Preprocessing, and Representation Pipeline for MIMIC-III

143 - Shirly Wang , Matthew B. A. McDermott , Geeticka Chauhan 2019

Robust machine learning relies on access to data that can be used with standardized frameworks in important tasks and the ability to develop models whose performance can be reasonably reproduced. In machine learning for healthcare, the community faces reproducibility challenges due to a lack of publicly accessible data and a lack of standardized data processing frameworks. We present MIMIC-Extract, an open-source pipeline for transforming raw electronic health record (EHR) data for critical care patients contained in the publicly-available MIMIC-III database into dataframes that are directly usable in common machine learning pipelines. MIMIC-Extract addresses three primary challenges in making complex health records data accessible to the broader machine learning community. First, it provides standardized data processing functions, including unit conversion, outlier detection, and aggregating semantically equivalent features, thus accounting for duplication and reducing missingness. Second, it preserves the time series nature of clinical data and can be easily integrated into clinically actionable prediction tasks in machine learning for health. Finally, it is highly extensible so that other researchers with related questions can easily use the same pipeline. We demonstrate the utility of this pipeline by showcasing several benchmark tasks and baseline results.

Machine Learning Machine Learning

Enforceable Data Sharing Agreements Using Smart Contracts

70 - Kevin Liu , Harsh Desai , Lalana Kagal 2018

As more and more data is collected for various reasons, the sharing of such data becomes paramount to increasing its value. Many applications ranging from smart cities to personalized health care require individuals and organizations to share data at an unprecedented scale. Data sharing is crucial in todays world, but due to privacy reasons, security concerns and regulation issues, the conditions under which the sharing occurs needs to be carefully specified. Currently, this process is done by lawyers and requires the costly signing of legal agreements. In many cases, these data sharing agreements are hard to track, manage or enforce. In this work, we propose a novel alternative for tracking, managing and especially enforcing such data sharing agreements using smart contracts and blockchain technology. We design a framework that generates smart contracts from parameters based on legal data sharing agreements. The terms in these agreements are automatically enforced by the system. Monetary punishment can be employed using secure voting by external auditors to hold the violators accountable. Our experimental evaluation shows that our proposed framework is efficient and low-cost.

Computers and Society Cryptography and Security

On risk-based active learning for structural health monitoring

103 - A.J. Hughes , L.A. Bull , P. Gardner 2021

A primary motivation for the development and implementation of structural health monitoring systems, is the prospect of gaining the ability to make informed decisions regarding the operation and maintenance of structures and infrastructure. Unfortunately, descriptive labels for measured data corresponding to health-state information for the structure of interest are seldom available prior to the implementation of a monitoring system. This issue limits the applicability of the traditional supervised and unsupervised approaches to machine learning in the development of statistical classifiers for decision-supporting SHM systems. The current paper presents a risk-based formulation of active learning, in which the querying of class-label information is guided by the expected value of said information for each incipient data point. When applied to structural health monitoring, the querying of class labels can be mapped onto the inspection of a structure of interest in order to determine its health state. In the current paper, the risk-based active learning process is explained and visualised via a representative numerical example and subsequently applied to the Z24 Bridge benchmark. The results of the case studies indicate that a decision-makers performance can be improved via the risk-based active learning of a statistical classifier, such that the decision process itself is taken into account.

Machine Learning Machine Learning

Crowding Prediction of In-Situ Metro Passengers Using Smart Card Data

117 - Xiancai Tian , Chen Zhang , Baihua Zheng 2020

The metro system is playing an increasingly important role in the urban public transit network, transferring a massive human flow across space everyday in the city. In recent years, extensive research studies have been conducted to improve the service quality of metro systems. Among them, crowd management has been a critical issue for both public transport agencies and train operators. In this paper, by utilizing accumulated smart card data, we propose a statistical model to predict in-situ passenger density, i.e., number of on-board passengers between any two neighbouring stations, inside a closed metro system. The proposed model performs two main tasks: i) forecasting time-dependent Origin-Destination (OD) matrix by applying mature statistical models; and ii) estimating the travel time cost required by different parts of the metro network via truncated normal mixture distributions with Expectation-Maximization (EM) algorithm. Based on the prediction results, we are able to provide accurate prediction of in-situ passenger density for a future time point. A case study using real smart card data in Singapore Mass Rapid Transit (MRT) system demonstrate the efficacy and efficiency of our proposed method.

Machine Learning Machine Learning

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

On Sharing Models Instead of Data using Mimic learning for Smart Health Applications

Ask ChatGPT about the research

No Arabic abstract

Read More

suggested questions