Making Classical Machine Learning Pipelines Differentiable: A Neural Translation Approach

Published by Gyeong-In Yu

Publication date: 2019

Research field: Informatics Engineering, Mathematical Statistics

Paper language: English

Authors: Gyeong-In Yu - Saeed Amizadeh - Sehoon Kim

Machine Learning

Abstract

Classical Machine Learning (ML) pipelines often comprise multiple ML models, where the models within a pipeline are trained in isolation. Conversely, when training neural network models, the layers composing the neural models are trained simultaneously using backpropagation. We argue that the isolated training scheme of ML pipelines is sub-optimal, since it cannot jointly optimize multiple components. To this end, we propose a framework that translates a pre-trained ML pipeline into a neural network and fine-tunes the ML models within the pipeline jointly using backpropagation. Our experiments show that fine-tuning the translated pipelines is a promising technique that can increase the final accuracy.
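The idea in the abstract can be illustrated with a toy sketch. This is not the authors' framework: it assumes a hypothetical two-stage pipeline (a standard scaler followed by a logistic regression), first fits the stages in isolation, then treats both stages as differentiable layers and fine-tunes all parameters jointly with hand-derived gradients. The data, the step sizes, and every name below are illustrative assumptions.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def forward(params, X):
    # Stage 1 (scaler) and stage 2 (logistic regression) as one differentiable function.
    mu, sigma, w, b = params
    Z = (X - mu) / sigma
    return Z, sigmoid(Z @ w + b)

def log_loss(params, X, y):
    _, p = forward(params, X)
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def gradients(params, X, y):
    # Backpropagation through both stages, written out by hand.
    mu, sigma, w, b = params
    Z, p = forward(params, X)
    d_logit = (p - y) / len(y)                # dL/d(logit), shape (n,)
    d_Z = np.outer(d_logit, w)                # dL/dZ, shape (n, d)
    g_mu = (-d_Z / sigma).sum(axis=0)         # dZ/dmu = -1/sigma
    g_sigma = (-d_Z * Z / sigma).sum(axis=0)  # dZ/dsigma = -(X - mu)/sigma^2 = -Z/sigma
    g_w = Z.T @ d_logit
    g_b = d_logit.sum()
    return g_mu, g_sigma, g_w, g_b

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(loc=[5.0, -3.0], scale=[2.0, 1.0], size=(200, 2))
noise = rng.normal(scale=0.5, size=200)
y = ((X[:, 0] - 5.0) / 2.0 + (X[:, 1] + 3.0) + noise > 0).astype(float)

# Isolated training: fit the scaler, freeze it, then train only the classifier.
mu, sigma = X.mean(axis=0), X.std(axis=0)
w, b = np.zeros(2), 0.0
for _ in range(20):
    _, _, g_w, g_b = gradients((mu, sigma, w, b), X, y)
    w, b = w - 0.5 * g_w, b - 0.5 * g_b
loss_before = log_loss((mu, sigma, w, b), X, y)

# Joint fine-tuning: every parameter of both stages receives gradient updates.
params = (mu, sigma, w, b)
for _ in range(300):
    grads = gradients(params, X, y)
    params = tuple(p - 0.05 * g for p, g in zip(params, grads))
loss_after = log_loss(params, X, y)

print("training loss before joint fine-tuning:", loss_before)
print("training loss after joint fine-tuning: ", loss_after)
```

The sketch only compares training loss on toy data, so it says nothing about the accuracy gains reported in the paper. It shows the mechanics of letting gradients flow through a preprocessing stage that would otherwise stay frozen.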

321 - Weijia Xu, Xing Niu, Marine Carpuat, 2019

Despite some empirical success at correcting exposure bias in machine translation, scheduled sampling algorithms suffer from a major drawback: they incorrectly assume that words in the reference translations and in sampled sequences are aligned at each time step. Our new differentiable sampling algorithm addresses this issue by optimizing the probability that the reference can be aligned with the sampled output, based on a soft alignment predicted by the model itself. As a result, the output distribution at each time step is evaluated with respect to the whole predicted sequence. Experiments on IWSLT translation tasks show that our approach improves BLEU compared to maximum likelihood and scheduled sampling baselines. In addition, our approach is simpler to train, with no need for a sampling schedule, and yields models that achieve larger improvements with smaller beam sizes.

Computation and Language, Machine Learning

A hybrid machine learning framework for analyzing human decision making through learning preferences

322 - Mengzhuo Guo, Qingpeng Zhang, Xiuwu Liao, 2019

Machine learning has recently been widely adopted to address managerial decision making problems, in which the decision maker needs to be able to interpret the contributions of individual attributes in an explicit form. However, there is a trade-off between performance and interpretability. Full-complexity models are non-traceable black boxes, whereas classic interpretable models are usually simplified and have lower accuracy. This trade-off limits the application of state-of-the-art machine learning models in management problems, which require high prediction performance as well as an understanding of individual attributes' contributions to the model outcome. Multiple criteria decision aiding (MCDA) is a family of analytic approaches to depicting the rationale of human decisions. It is also limited by strong assumptions. To meet the decision makers' demand for more interpretable machine learning models, we propose a novel hybrid method, namely Neural Network-based Multiple Criteria Decision Aiding (NN-MCDA), which combines an additive value model and a fully-connected multilayer perceptron (MLP) to achieve good performance while capturing the explicit relationships between individual attributes and the prediction. NN-MCDA has a linear component that characterizes such relationships by providing explicit marginal value functions, and a nonlinear component that captures the implicit high-order interactions between attributes and their complex nonlinear transformations. We demonstrate the effectiveness of NN-MCDA with extensive simulation studies and three real-world datasets. To the best of our knowledge, this research is the first to enhance the interpretability of machine learning models with MCDA techniques. The proposed framework also sheds light on how to use machine learning techniques to free MCDA from strong assumptions.

Machine Learning
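The two-component structure described above can be sketched as a forward pass. The abstract does not specify how the components are combined or what form the marginal value functions take, so the polynomial marginal functions, the mixing weight `alpha`, and all parameter names below are assumptions made for illustration, not the paper's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_attrs, degree, hidden = 5, 3, 2, 4

# Toy attribute matrix with values in [0, 1] (illustrative data).
X = rng.uniform(0.0, 1.0, size=(n_samples, n_attrs))

# Additive component: one polynomial marginal value function per attribute
# (assumed form), with coefficients for powers x^1 .. x^degree.
coef = rng.normal(size=(n_attrs, degree))

def marginal_values(X, coef):
    # powers has shape (n_samples, n_attrs, degree)
    powers = np.stack([X ** k for k in range(1, coef.shape[1] + 1)], axis=-1)
    # Each column of the result is the explicit contribution of one attribute.
    return (powers * coef).sum(axis=-1)

# Nonlinear component: a small fully-connected MLP over all attributes.
W1 = rng.normal(size=(n_attrs, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(size=hidden)
b2 = 0.0

def mlp(X):
    return np.tanh(X @ W1 + b1) @ W2 + b2

def hybrid_score(X, alpha):
    # alpha is a hypothetical mixing weight between the two components.
    u = marginal_values(X, coef)
    return alpha * u.sum(axis=1) + (1.0 - alpha) * mlp(X), u

score, per_attribute = hybrid_score(X, alpha=0.5)
print(score.shape, per_attribute.shape)
```

The point of the design is that `per_attribute` stays inspectable: each column is the explicit marginal value of one attribute, while the MLP term absorbs interactions that the additive part cannot express.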

Making AI Forget You: Data Deletion in Machine Learning

133 - Antonio Ginart, Melody Y. Guan, Gregory Valiant, 2019

Intense recent discussions have focused on how to provide individuals with control over when their data can and cannot be used; the EU's Right To Be Forgotten regulation is an example of this effort. In this paper we initiate a framework studying what to do when it is no longer permissible to deploy models derived from specific user data. In particular, we formulate the problem of efficiently deleting individual data points from trained machine learning models. For many standard ML models, the only way to completely remove an individual's data is to retrain the whole model from scratch on the remaining data, which is often not computationally practical. We investigate algorithmic principles that enable efficient data deletion in ML. For the specific setting of k-means clustering, we propose two provably efficient deletion algorithms which achieve an average of over 100X improvement in deletion efficiency across 6 datasets, while producing clusters of comparable statistical quality to a canonical k-means++ baseline.

Machine Learning

Prediction of GNSS Phase Scintillations: A Machine Learning Approach

171 - Kara Lamb, Garima Malhotra, Athanasios Vlontzos, 2019

A Global Navigation Satellite System (GNSS) uses a constellation of satellites around the Earth for accurate navigation, timing, and positioning. Natural phenomena like space weather introduce irregularities in the Earth's ionosphere, disrupting the propagation of the radio signals that GNSS relies upon. Such disruptions affect both the amplitude and the phase of the propagated waves. No physics-based model currently exists to predict the time and location of these disruptions with sufficient accuracy and at relevant scales. In this paper, we focus on predicting the phase fluctuations of GNSS radio waves, known as phase scintillations. We propose a novel architecture and loss function to predict, 1 hour in advance, the magnitude of phase scintillations within a time window of plus or minus 5 minutes with state-of-the-art performance.

Machine Learning

Public Health Informatics: Proposing Causal Sequence of Death Using Neural Machine Translation

80 - Yuanda Zhu, Ying Sha, Hang Wu, 2020

Each year there are nearly 57 million deaths around the world, with over 2.7 million in the United States. Timely, accurate and complete death reporting is critical in public health, as institutions and government agencies rely on death reports to analyze vital statistics and to formulate responses to communicable diseases. Inaccurate death reporting may result in misdirection of public health policies. Determining the causes of death is, nevertheless, challenging even for experienced physicians. To help physicians report causes of death accurately, we present an advanced AI approach to determine a chronologically ordered sequence of clinical conditions that lead to death, based on the decedent's last hospital discharge record. The sequence of clinical codes on the death report is called the causal chain of death and is coded in the tenth revision of the International Statistical Classification of Diseases (ICD-10); in line with the ICD-9-CM Official Guidelines for Coding and Reporting, the priority-ordered clinical conditions on the discharge record are coded in ICD-9. We identify three challenges in proposing the causal chain of death: t

Machine Learning