
Topological Machine Learning with Persistence Indicator Functions

Published by Bastian Rieck
Publication date: 2019
Research field: Informatics Engineering
Research language: English





Techniques from computational topology, in particular persistent homology, are becoming increasingly relevant for data analysis. Their stable metrics permit the use of many distance-based data analysis methods, such as multidimensional scaling, while providing a firm theoretical ground. Many modern machine learning algorithms, however, are based on kernels. This paper presents persistence indicator functions (PIFs), which summarize persistence diagrams, i.e., feature descriptors in topological data analysis. PIFs can be calculated and compared in linear time and have many beneficial properties, such as the availability of a kernel-based similarity measure. We demonstrate their usage in common data analysis scenarios, such as confidence set estimation and classification of complex structured data.
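The abstract does not spell out how a PIF is built, so the following is only a minimal sketch under a stated assumption: the PIF of a persistence diagram is taken to be the step function that counts, for each threshold ε, how many intervals (birth, death) of the diagram contain ε. The function names (`persistence_indicator_function`, `evaluate`, `l1_distance`) are hypothetical, intervals are treated as half-open [birth, death), only finite intervals are supported, and the sketch sorts the endpoints, so it runs in O(n log n) rather than the linear time mentioned above.

```python
import numpy as np


def persistence_indicator_function(diagram):
    """Build a step function from (birth, death) pairs.

    Returns (breakpoints, values), where values[i] is the number of
    active intervals on [breakpoints[i], breakpoints[i + 1]).
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    births, deaths = diagram[:, 0], diagram[:, 1]

    # Each birth raises the count by one, each death lowers it by one.
    points = np.concatenate([births, deaths])
    deltas = np.concatenate([np.ones_like(births), -np.ones_like(deaths)])

    # Merge coinciding endpoints and accumulate their net change.
    breakpoints, inverse = np.unique(points, return_inverse=True)
    step = np.zeros_like(breakpoints)
    np.add.at(step, inverse, deltas)
    return breakpoints, np.cumsum(step)


def evaluate(breakpoints, values, eps):
    """Evaluate the step function at a single threshold eps."""
    idx = np.searchsorted(breakpoints, eps, side="right") - 1
    return 0.0 if idx < 0 else float(values[idx])


def l1_distance(pif_a, pif_b):
    """Integrate |f - g| exactly over the merged breakpoints."""
    grid = np.union1d(pif_a[0], pif_b[0])
    if grid.size < 2:
        return 0.0
    left = grid[:-1]  # both functions are constant on each [left, right)
    fa = np.array([evaluate(*pif_a, x) for x in left])
    fb = np.array([evaluate(*pif_b, x) for x in left])
    return float(np.sum(np.abs(fa - fb) * np.diff(grid)))


# Two toy diagrams (made-up values, for illustration only).
pif_1 = persistence_indicator_function([(0.0, 2.0), (1.0, 3.0)])
pif_2 = persistence_indicator_function([(0.0, 3.0)])
print(evaluate(*pif_1, 1.5))      # two intervals are active at 1.5
print(l1_distance(pif_1, pif_2))  # the functions differ only on [1, 2)
```

Because each PIF is an ordinary one-dimensional function, distances between PIFs reduce to integrals of piecewise-constant functions, which is what makes the comparison cheap in this sketch.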




Read also

We develop a framework for analyzing multivariate time series using topological data analysis (TDA) methods. The proposed methodology involves converting the multivariate time series to point cloud data, calculating Wasserstein distances between the persistence diagrams and using the $k$-nearest neighbors algorithm ($k$-NN) for supervised machine learning. Two methods (symmetry-breaking and anchor points) are also introduced to enable TDA to better analyze data with heterogeneous features that are sensitive to translation, rotation, or choice of coordinates. We apply our methods to room occupancy detection based on 5 time-dependent variables (temperature, humidity, light, CO2 and humidity ratio). Experimental results show that topological methods are effective in predicting room occupancy during a time window. We also apply our methods to an Activity Recognition dataset and obtain good results.
Topological data analysis is a relatively new branch of machine learning that excels in studying high dimensional data, and is theoretically known to be robust against noise. Meanwhile, data objects with mixed numeric and categorical attributes are ubiquitous in real-world applications. However, topological methods are usually applied to point cloud data, and to the best of our knowledge there is no available framework for the classification of mixed data using topological methods. In this paper, we propose a novel topological machine learning method for mixed data classification. In the proposed method, we use theory from topological data analysis such as persistent homology, persistence diagrams and Wasserstein distance to study mixed data. The performance of the proposed method is demonstrated by experiments on a real-world heart disease dataset. Experimental results show that our topological method outperforms several state-of-the-art algorithms in the prediction of heart disease.
The theory of persistence modules is an emerging field of algebraic topology which originated in topological data analysis. In these notes we provide a concise introduction to this field and give an account of some of its interactions with geometry and analysis. In particular, we present applications of persistence to symplectic topology, including the geometry of symplectomorphism groups and embedding problems. Furthermore, we discuss topological function theory, which provides new insight into the oscillation of functions. The material should be accessible to readers with a basic background in algebraic and differential topology.
Persistence modules are a central algebraic object arising in topological data analysis. The notion of interleaving provides a natural way to measure distances between persistence modules. We consider various classes of persistence modules, including many of those that have been previously studied, and describe the relationships between them. In the cases where these classes are sets, interleaving distance induces a topology. We undertake a systematic study of the resulting topological spaces and their basic topological properties.
We introduce a refinement of the persistence diagram, the graded persistence diagram. It is the Möbius inversion of the graded rank function, which is obtained from the rank function using the unary numeral system. Both persistence diagrams and graded persistence diagrams are integer-valued functions on the Cartesian plane. Whereas the persistence diagram takes non-negative values, the graded persistence diagram takes values of 0, 1, or -1. The sum of the graded persistence diagrams is the persistence diagram. We show that the positive and negative points in the k-th graded persistence diagram correspond to the local maxima and minima, respectively, of the k-th persistence landscape. We prove a stability theorem for graded persistence diagrams: the 1-Wasserstein distance between k-th graded persistence diagrams is bounded by twice the 1-Wasserstein distance between the corresponding persistence diagrams, and this bound is attained. In the other direction, the 1-Wasserstein distance is a lower bound for the sum of the 1-Wasserstein distances between the k-th graded persistence diagrams. In fact, the 1-Wasserstein distance for graded persistence diagrams is more discriminative than the 1-Wasserstein distance for the corresponding persistence diagrams.
