Receiver Operating Characteristic Curves and Confidence Bands for Support Vector Machines

121 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Daniel Luckett

تاريخ النشر 2018

مجال البحث الاحصاء الرياضي الهندسة المعلوماتية

والبحث باللغة English

تأليف Daniel J. Luckett - Eric B. Laber - Samer S. El-Kamary

التعلم الالي التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Many problems that appear in biomedical decision making, such as diagnosing disease and predicting response to treatment, can be expressed as binary classification problems. The costs of false positives and false negatives vary across application domains and receiver operating characteristic (ROC) curves provide a visual representation of this trade-off. Nonparametric estimators for the ROC curve, such as a weighted support vector machine (SVM), are desirable because they are robust to model misspecification. While weighted SVMs have great potential for estimating ROC curves, their theoretical properties were heretofore underdeveloped. We propose a method for constructing confidence bands for the SVM ROC curve and provide the theoretical justification for the SVM ROC curve by showing that the risk function of the estimated decision rule is uniformly consistent across the weight parameter. We demonstrate the proposed confidence band method and the superior sensitivity and specificity of the weighted SVM compared to commonly used methods in diagnostic medicine using simulation studies. We present two illustrative examples: diagnosis of hepatitis C and a predictive model for treatment response in breast cancer.

قيم البحث

403 - Kohei Ogawa , Yoshiki Suzuki , Shinya Suzumura 2014

Sparse classifiers such as the support vector machines (SVM) are efficient in test-phases because the classifier is characterized only by a subset of the samples called support vectors (SVs), and the rest of the samples (non SVs) have no influence on the classification result. However, the advantage of the sparsity has not been fully exploited in training phases because it is generally difficult to know which sample turns out to be SV beforehand. In this paper, we introduce a new approach called safe sample screening that enables us to identify a subset of the non-SVs and screen them out prior to the training phase. Our approach is different from existing heuristic approaches in the sense that the screened samples are guaranteed to be non-SVs at the optimal solution. We investigate the advantage of the safe sample screening approach through intensive numerical experiments, and demonstrate that it can substantially decrease the computational cost of the state-of-the-art SVM solvers such as LIBSVM. In the current big data era, we believe that safe sample screening would be of great practical importance since the data size can be reduced without sacrificing the optimality of the final solution.

التعلم الالي

Support vector machines for learning reactive islands

394 - Shibabrat Naik , Vladimir Krajv{n}ak , Stephen Wiggins 2021

We develop a machine learning framework that can be applied to data sets derived from the trajectories of Hamiltons equations. The goal is to learn the phase space structures that play the governing role for phase space transport relevant to particul ar applications. Our focus is on learning reactive islands in two degrees-of-freedom Hamiltonian systems. Reactive islands are constructed from the stable and unstable manifolds of unstable periodic orbits and play the role of quantifying transition dynamics. We show that support vector machines (SVM) is an appropriate machine learning framework for this purpose as it provides an approach for finding the boundaries between qualitatively distinct dynamical behaviors, which is in the spirit of the phase space transport framework. We show how our method allows us to find reactive islands directly in the sense that we do not have to first compute unstable periodic orbits and their stable and unstable manifolds. We apply our approach to the Henon-Heiles Hamiltonian system, which is a benchmark system in the dynamical systems community. We discuss different sampling and learning approaches and their advantages and disadvantages.

النظم الديناميكية التعلم الآلي الفيزياء الكيميائية

Support Vector Machines and Generalisation in HEP

145 - A. Bethani , A. J. Bevan , J. Hays 2016

We review the concept of support vector machines (SVMs) and discuss examples of their use. One of the benefits of SVM algorithms, compared with neural networks and decision trees is that they can be less susceptible to over fitting than those other a lgorithms are to over training. This issue is related to the generalisation of a multivariate algorithm (MVA); a problem that has often been overlooked in particle physics. We discuss cross validation and how this can be used to improve the generalisation of a MVA in the context of High Energy Physics analyses. The examples presented use the Toolkit for Multivariate Analysis (TMVA) based on ROOT and describe our improvements to the SVM functionality and new tools introduced for cross validation within this framework.

تحليل البيانات والإحصاءات والاحتمال التعلم الآلي فيزياء الطاقة العالية - التجربة

A Rapid Pattern-Recognition Method for Driving Types Using Clustering-Based Support Vector Machines

83 - Wenshuo Wang , Junqiang Xi 2016

A rapid pattern-recognition approach to characterize drivers curve-negotiating behavior is proposed. To shorten the recognition time and improve the recognition of driving styles, a k-means clustering-based support vector machine ( kMC-SVM) method is developed and used for classifying drivers into two types: aggressive and moderate. First, vehicle speed and throttle opening are treated as the feature parameters to reflect the driving styles. Second, to discriminate driver curve-negotiating behaviors and reduce the number of support vectors, the k-means clustering method is used to extract and gather the two types of driving data and shorten the recognition time. Then, based on the clustering results, a support vector machine approach is utilized to generate the hyperplane for judging and predicting to which types the human driver are subject. Lastly, to verify the validity of the kMC-SVM method, a cross-validation experiment is designed and conducted. The research results show that the $ k $MC-SVM is an effective method to classify driving styles with a short time, compared with SVM method.

التعلم الالي الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي

Accurate Streaming Support Vector Machines

356 - Vikram Nathan , Sharath Raghvendra 2014

A widely-used tool for binary classification is the Support Vector Machine (SVM), a supervised learning technique that finds the maximum margin linear separator between the two classes. While SVMs have been well studied in the batch (offline) setting , there is considerably less work on the streaming (online) setting, which requires only a single pass over the data using sub-linear space. Existing streaming algorithms are not yet competitive with the batch implementation. In this paper, we use the formulation of the SVM as a minimum enclosing ball (MEB) problem to provide a streaming SVM algorithm based off of the blurred ball cover originally proposed by Agarwal and Sharathkumar. Our implementation consistently outperforms existing streaming SVM approaches and provides higher accuracies than libSVM on several datasets, thus making it competitive with the standard SVM batch implementation.

التعلم الآلي