Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

An introduction to Topological Data Analysis: fundamental and practical aspects for data scientists

105 0 0.0 ( 0 )

Download Cite

Added by Bertrand Michel

Publication date 2017

fields Informatics Engineering

and research's language is English

Authors Frederic Chazal

Statistics Theory Machine Learning Algebraic Topology

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Topological Data Analysis is a recent and fast growing field providing a set of new topological and geometric tools to infer relevant features for possibly complex data. This paper is a brief introduction, through a few selected topics, to basic fundamental and practical aspects of tda for non experts.

rate research

A Gradient Sampling Algorithm for Stratified Maps with Applications to Topological Data Analysis

132 - Jacob Leygonie , Mathieu Carri`ere 2021

We introduce a novel gradient descent algorithm extending the well-known Gradient Sampling methodology to the class of stratifiably smooth objective functions, which are defined as locally Lipschitz functions that are smooth on some regular pieces-called the strata-of the ambient Euclidean space. For this class of functions, our algorithm achieves a sub-linear convergence rate. We then apply our method to objective functions based on the (extended) persistent homology map computed over lower-star filters, which is a central tool of Topological Data Analysis. For this, we propose an efficient exploration of the corresponding stratification by using the Cayley graph of the permutation group. Finally, we provide benchmark and novel topological optimization problems, in order to demonstrate the utility and applicability of our framework.

Computational Geometry Machine Learning Algebraic Topology

Hypothesis Testing for Topological Data Analysis

303 - Andrew Robinson , Katharine Turner 2013

Persistent homology is a vital tool for topological data analysis. Previous work has developed some statistical estimators for characteristics of collections of persistence diagrams. However, tools that provide statistical inference for observations that are persistence diagrams are limited. Specifically, there is a need for tests that can assess the strength of evidence against a claim that two samples arise from the same population or process. We propose the use of randomization-style null hypothesis significance tests (NHST) for these situations. The test is based on a loss function that comprises pairwise distances between the elements of each sample and all the elements in the other sample. We use this method to analyze a range of simulated and experimental data. Through these examples we experimentally explore the power of the p-values. Our results show that the randomization-style NHST based on pairwise distances can distinguish between samples from different processes, which suggests that its use for hypothesis tests upon persistence diagrams is reasonable. We demonstrate its application on a real dataset of fMRI data of patients with ADHD.

Applications Computational Geometry Algebraic Topology

PersGNN: Applying Topological Data Analysis and Geometric Deep Learning to Structure-Based Protein Function Prediction

280 - Nicolas Swenson , Aditi S. Krishnapriyan , Aydin Buluc 2020

Understanding protein structure-function relationships is a key challenge in computational biology, with applications across the biotechnology and pharmaceutical industries. While it is known that protein structure directly impacts protein function, many functional prediction tasks use only protein sequence. In this work, we isolate protein structure to make functional annotations for proteins in the Protein Data Bank in order to study the expressiveness of different structure-based prediction schemes. We present PersGNN - an end-to-end trainable deep learning model that combines graph representation learning with topological data analysis to capture a complex set of both local and global structural features. While variations of these techniques have been successfully applied to proteins before, we demonstrate that our hybridized approach, PersGNN, outperforms either method on its own as well as a baseline neural network that learns from the same information. PersGNN achieves a 9.3% boost in area under the precision recall curve (AUPR) compared to the best individual model, as well as high F1 scores across different gene ontology categories, indicating the transferability of this approach.

Biomolecules Machine Learning Algebraic Topology

Augmenting Molecular Deep Generative Models with Topological Data Analysis Representations

528 - Yair Schiff , Vijil Chenthamarakshan , Samuel Hoffman 2021

Deep generative models have emerged as a powerful tool for learning informative molecular representations and designing novel molecules with desired properties, with applications in drug discovery and material design. Deep generative auto-encoders defined over molecular SMILES strings have been a popular choice for that purpose. However, capturing salient molecular properties like quantum-chemical energies remains challenging and requires sophisticated neural net models of molecular graphs or geometry-based information. As a simpler and more efficient alternative, we present a SMILES Variational Auto-Encoder (VAE) augmented with topological data analysis (TDA) representations of molecules, known as persistence images. Our experiments show that this TDA augmentation enables a SMILES VAE to capture the complex relation between 3D geometry and electronic properties, and allows generation of novel, diverse, and valid molecules with geometric features consistent with the training data, which exhibit a varying range of global electronic structural properties, such as a small HOMO-LUMO gap - a critical property for designing organic solar cells. We demonstrate that our TDA augmentation yields better success in downstream tasks compared to models trained without these representations and can assist in targeted molecule discovery.

Chemical Physics Machine Learning Algebraic Topology

Topological Data Analysis of Spatial Systems

402 - Michelle Feng , Abigail Hickok , 2021

In this chapter, we discuss applications of topological data analysis (TDA) to spatial systems. We briefly review the recently proposed level-set construction of filtered simplicial complexes, and we then examine persistent homology in two cases studies: street networks in Shanghai and hotspots of COVID-19 infections. We then summarize our results and provide an outlook on TDA in spatial systems.

Social and Information Networks Computational Geometry Algebraic Topology

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

An introduction to Topological Data Analysis: fundamental and practical aspects for data scientists

Ask ChatGPT about the research

No Arabic abstract

Read More

suggested questions