In this chapter, we discuss applications of topological data analysis (TDA) to spatial systems. We briefly review the recently proposed level-set construction of filtered simplicial complexes, and we then examine persistent homology in two case studies: street networks in Shanghai and hotspots of COVID-19 infections. We then summarize our results and provide an outlook on TDA in spatial systems.
Persistent homology is a vital tool for topological data analysis. Previous work has developed some statistical estimators for characteristics of collections of persistence diagrams. However, tools that provide statistical inference for observations that are persistence diagrams are limited. Specifically, there is a need for tests that can assess the strength of evidence against a claim that two samples arise from the same population or process. We propose the use of randomization-style null hypothesis significance tests (NHST) for these situations. The test is based on a loss function that comprises pairwise distances between the elements of each sample and all the elements in the other sample. We use this method to analyze a range of simulated and experimental data. Through these examples, we experimentally explore the power of the test. Our results show that the randomization-style NHST based on pairwise distances can distinguish between samples from different processes, which suggests that its use for hypothesis tests on persistence diagrams is reasonable. We demonstrate its application on real fMRI data from patients with ADHD.
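A minimal sketch of a randomization-style test of this kind is given here, under stated assumptions. It takes a precomputed matrix of pairwise distances between observations, so any diagram distance (for example bottleneck or Wasserstein) could be supplied. The function names, the cross-sample sum used as the statistic, and the scalar stand-in data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def cross_sample_loss(dist, labels):
    """Sum of distances between every element of sample 0 and every element of sample 1."""
    a = np.flatnonzero(labels == 0)
    b = np.flatnonzero(labels == 1)
    return dist[np.ix_(a, b)].sum()

def randomization_test(dist, labels, n_perm=999, seed=0):
    """Estimate a p-value by shuffling the sample labels.

    Under the null hypothesis the labels are exchangeable, so the observed
    cross-sample loss should not be unusually large compared with the
    losses obtained from shuffled labels.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    observed = cross_sample_loss(dist, labels)
    count = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(labels)
        if cross_sample_loss(dist, shuffled) >= observed:
            count += 1
    # The +1 terms count the observed labelling as one of the arrangements.
    return (count + 1) / (n_perm + 1)

# Stand-in data (assumption): scalars replace persistence diagrams and the
# absolute difference replaces a diagram distance.
values = np.array([0.0, 0.1, 0.2, 0.3, 0.4, 0.5,
                   10.0, 10.1, 10.2, 10.3, 10.4, 10.5])
labels = np.array([0] * 6 + [1] * 6)
dist = np.abs(values[:, None] - values[None, :])
p_value = randomization_test(dist, labels)
```

Because the sum of all pairwise distances is fixed under relabelling, a large cross-sample loss corresponds to a small within-sample loss, so either quantity could serve as the statistic in this sketch.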
Jeongganbo is a unique music representation invented by Sejong the Great. Unlike Western music notation, Jeongganbo encodes the pitch of each note and visualizes its length directly in a matrix form. We use topological data analysis (TDA) to analyze Korean music written in Jeongganbo for Suyeonjang, Songuyeo, and Taryong, well-known pieces played at the palace and among the nobility. We are particularly interested in the cycle structure. We first define and determine the node elements of each piece, each characterized uniquely by its pitch and length. Then we transform the music into a graph and define the distance between the nodes as their adjacent occurrence rate. The graph is used as a point cloud whose homological structure is investigated by measuring the hole structure in each dimension. We identify the cycles of each piece, match them to the Jeongganbo score, and show how those cycles are interconnected. The main discovery of this work is that the cycles of Suyeonjang and Songuyeo, categorized as a special type of cyclic music known as Dodeuri, frequently overlap each other when appearing in the music, while the cycles found in Taryong, which does not belong to the Dodeuri class, appear individually.
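The graph construction described above can be sketched roughly as follows. The abstract does not state how the adjacent occurrence rate is turned into a distance, so this sketch assumes the reciprocal of the adjacency count for directly adjacent nodes and shortest-path lengths for all other pairs. The note sequence, the Western pitch letters, and the function name are hypothetical placeholders.

```python
import math
from collections import Counter

def build_music_graph(notes):
    """Return the unique (pitch, length) nodes and a matrix of distances between them."""
    nodes = sorted(set(notes))
    index = {node: i for i, node in enumerate(nodes)}

    # Count how often two distinct nodes occur next to each other in the piece.
    counts = Counter()
    for a, b in zip(notes, notes[1:]):
        if a != b:
            counts[frozenset((a, b))] += 1

    n = len(nodes)
    dist = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for pair, c in counts.items():
        a, b = tuple(pair)
        i, j = index[a], index[b]
        # Assumption: frequently adjacent nodes are close (distance = 1 / count).
        dist[i][j] = dist[j][i] = 1.0 / c

    # Floyd-Warshall: fill in distances between nodes that are never adjacent.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return nodes, dist

# Hypothetical toy sequence of (pitch, length) pairs.
notes = [("C", 1), ("D", 1), ("C", 1), ("D", 1), ("E", 2), ("C", 1)]
nodes, dist = build_music_graph(notes)
```

The resulting distance matrix is the kind of input that a persistent homology library accepts in place of point coordinates, which is one way to treat the graph as a point cloud.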
Topological data analysis (TDA) is a recent and fast-growing field that provides a set of new topological and geometric tools for inferring relevant features of possibly complex data. This paper is a brief introduction, through a few selected topics, to fundamental and practical aspects of TDA for non-experts.
Multivector fields provide an avenue for studying continuous dynamical systems in a combinatorial framework. There are currently two approaches in the literature that use persistent homology to capture changes in combinatorial dynamical systems. The first captures changes in the Conley index, while the second captures changes in the Morse decomposition. However, such approaches have limitations. The former approach only describes how the Conley index changes across a selected isolated invariant set, though the dynamics can be much more complicated than the behavior of a single isolated invariant set. Likewise, considering a Morse decomposition omits much information about the individual Morse sets. In this paper, we propose a method to summarize changes in combinatorial dynamical systems by capturing changes in the so-called Conley-Morse graphs. A Conley-Morse graph contains information about both the structure of a selected Morse decomposition and the Conley index at each Morse set in the decomposition. Hence, our method summarizes the changing structure of a sequence of dynamical systems at a finer granularity than previous approaches.
Deep generative models have emerged as a powerful tool for learning informative molecular representations and designing novel molecules with desired properties, with applications in drug discovery and material design. Deep generative auto-encoders defined over molecular SMILES strings have been a popular choice for that purpose. However, capturing salient molecular properties like quantum-chemical energies remains challenging and requires sophisticated neural network models of molecular graphs or geometry-based information. As a simpler and more efficient alternative, we present a SMILES Variational Auto-Encoder (VAE) augmented with topological data analysis (TDA) representations of molecules, known as persistence images. Our experiments show that this TDA augmentation enables a SMILES VAE to capture the complex relation between 3D geometry and electronic properties, and allows generation of novel, diverse, and valid molecules with geometric features consistent with the training data. The generated molecules exhibit a varying range of global electronic structural properties, such as a small HOMO-LUMO gap, a critical property for designing organic solar cells. We demonstrate that our TDA augmentation yields better success in downstream tasks compared to models trained without these representations and can assist in targeted molecule discovery.
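For readers unfamiliar with persistence images, the following is a minimal sketch of how a persistence diagram can be turned into a fixed-size array. It assumes a linear weighting by persistence and evaluates the Gaussian surface at grid points rather than integrating over pixels, which is a simplification. The function name, parameters, and example diagram are hypothetical, and the abstract does not specify how the authors compute the images or feed them to the VAE.

```python
import numpy as np

def persistence_image(diagram, resolution=11, sigma=0.1,
                      extent=(0.0, 1.0, 0.0, 1.0)):
    """Rasterize (birth, death) pairs into a resolution x resolution array.

    Rows index persistence and columns index birth.
    """
    diagram = np.asarray(diagram, dtype=float)
    births = diagram[:, 0]
    persistence = diagram[:, 1] - diagram[:, 0]

    b_min, b_max, p_min, p_max = extent
    grid_b, grid_p = np.meshgrid(np.linspace(b_min, b_max, resolution),
                                 np.linspace(p_min, p_max, resolution))

    image = np.zeros_like(grid_b)
    for b, p in zip(births, persistence):
        # Gaussian bump centred at (birth, persistence).
        bump = np.exp(-((grid_b - b) ** 2 + (grid_p - p) ** 2) / (2 * sigma ** 2))
        # Linear weight: points with zero persistence contribute nothing.
        image += p * bump
    return image

# Hypothetical diagram: one prominent feature and one on the diagonal.
image = persistence_image([(0.2, 0.7), (0.4, 0.4)])
feature_vector = image.ravel()  # fixed-length vector, independent of diagram size
```

Flattening the image gives a vector whose length depends only on the chosen resolution, which is what makes this representation convenient as an additional input to a neural model.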