Topological Machine Learning with Persistence Indicator Functions

75 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Bastian Rieck

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Bastian Rieck - Filip Sadlo - Heike Leitte

الطوبولوجيا الجبرية التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Techniques from computational topology, in particular persistent homology, are becoming increasingly relevant for data analysis. Their stable metrics permit the use of many distance-based data analysis methods, such as multidimensional scaling, while providing a firm theoretical ground. Many modern machine learning algorithms, however, are based on kernels. This paper presents persistence indicator functions (PIFs), which summarize persistence diagrams, i.e., feature descriptors in topological data analysis. PIFs can be calculated and compared in linear time and have many beneficial properties, such as the availability of a kernel-based similarity measure. We demonstrate their usage in common data analysis scenarios, such as confidence set estimation and classification of complex structured data.

قيم البحث

69 - Chengyuan Wu , Carol Anne Hargreaves 2019

We develop a framework for analyzing multivariate time series using topological data analysis (TDA) methods. The proposed methodology involves converting the multivariate time series to point cloud data, calculating Wasserstein distances between the persistence diagrams and using the $k$-nearest neighbors algorithm ($k$-NN) for supervised machine learning. Two methods (symmetry-breaking and anchor points) are also introduced to enable TDA to better analyze data with heterogeneous features that are sensitive to translation, rotation, or choice of coordinates. We apply our methods to room occupancy detection based on 5 time-dependent variables (temperature, humidity, light, CO2 and humidity ratio). Experimental results show that topological methods are effective in predicting room occupancy during a time window. We also apply our methods to an Activity Recognition dataset and obtained good results.

الطوبولوجيا الجبرية التعلم الآلي معالجة الإشارات

Topological Machine Learning for Mixed Numeric and Categorical Data

65 - Chengyuan Wu , Carol Anne Hargreaves 2020

Topological data analysis is a relatively new branch of machine learning that excels in studying high dimensional data, and is theoretically known to be robust against noise. Meanwhile, data objects with mixed numeric and categorical attributes are u biquitous in real-world applications. However, topological methods are usually applied to point cloud data, and to the best of our knowledge there is no available framework for the classification of mixed data using topological methods. In this paper, we propose a novel topological machine learning method for mixed data classification. In the proposed method, we use theory from topological data analysis such as persistent homology, persistence diagrams and Wasserstein distance to study mixed data. The performance of the proposed method is demonstrated by experiments on a real-world heart disease dataset. Experimental results show that our topological method outperforms several state-of-the-art algorithms in the prediction of heart disease.

الطوبولوجيا الجبرية التعلم الآلي

Topological Persistence in Geometry and Analysis

52 - Leonid Polterovich , Daniel Rosen , Karina Samvelyan 2019

The theory of persistence modules is an emerging field of algebraic topology which originated in topological data analysis. In these notes we provide a concise introduction into this field and give an account on some of its interactions with geometry and analysis. In particular, we present applications of persistence to symplectic topology, including the geometry of symplectomorphism groups and embedding problems. Furthermore, we discuss topological function theory which provides a new insight on oscillation of functions. The material should be accessible to readers with a basic background in algebraic and differential topology.

الطوبولوجيا الجبرية التحليل الكلاسيكي و ODEs علم الهندسة

Topological spaces of persistence modules and their properties

150 - Peter Bubenik , Tane Vergili 2018

Persistence modules are a central algebraic object arising in topological data analysis. The notion of interleaving provides a natural way to measure distances between persistence modules. We consider various classes of persistence modules, including many of those that have been previously studied, and describe the relationships between them. In the cases where these classes are sets, interleaving distance induces a topology. We undertake a systematic study the resulting topological spaces and their basic topological properties.

الطوبولوجيا الجبرية طوبولوجيا عامة

Graded persistence diagrams and persistence landscapes

132 - Leo Betthauser , Peter Bubenik , Parker B. Edwards 2019

We introduce a refinement of the persistence diagram, the graded persistence diagram. It is the Mobius inversion of the graded rank function, which is obtained from the rank function using the unary numeral system. Both persistence diagrams and grade d persistence diagrams are integer-valued functions on the Cartesian plane. Whereas the persistence diagram takes non-negative values, the graded persistence diagram takes values of 0, 1, or -1. The sum of the graded persistence diagrams is the persistence diagram. We show that the positive and negative points in the k-th graded persistence diagram correspond to the local maxima and minima, respectively, of the k-th persistence landscape. We prove a stability theorem for graded persistence diagrams: the 1-Wasserstein distance between k-th graded persistence diagrams is bounded by twice the 1-Wasserstein distance between the corresponding persistence diagrams, and this bound is attained. In the other direction, the 1-Wasserstein distance is a lower bound for the sum of the 1-Wasserstein distances between the k-th graded persistence diagrams. In fact, the 1-Wasserstein distance for graded persistence diagrams is more discriminative than the 1-Wasserstein distance for the corresponding persistence diagrams.

الطوبولوجيا الجبرية