We develop a method for analyzing spatiotemporal anomalies in geospatial data using topological data analysis (TDA). To do this, we use persistent homology (PH), a tool from TDA that allows one to algorithmically detect geometric voids in a data set and quantify the persistence of these voids. We construct an efficient filtered simplicial complex (FSC) such that the voids in our FSC are in one-to-one correspondence with the anomalies. Our approach goes beyond simply identifying anomalies; it also encodes information about the relationships between anomalies. We use vineyards, which one can interpret as time-varying persistence diagrams (an approach for visualizing PH), to track how the locations of the anomalies change over time. We conduct two case studies using spatially heterogeneous COVID-19 data. First, we examine vaccination rates in New York City by zip code. Second, we study a year-long data set of COVID-19 case rates in neighborhoods in the city of Los Angeles.
We propose a general technique for extracting a larger set of stable information from persistent homology computations than is currently done. The persistent homology algorithm is usually viewed as a procedure which starts with a filtered complex and ends with a persistence diagram. This procedure is stable (at least to certain types of perturbations of the input). This justifies the use of the diagram as a signature of the input, and the use of features derived from it in statistics and machine learning. However, these computations also produce other information of great interest to practitioners that is unfortunately unstable. For example, each point in the diagram corresponds to a simplex whose addition in the filtration results in the birth of the corresponding persistent homology class, but this correspondence is unstable. In addition, the persistence diagram is not stable with respect to other procedures that are employed in practice, such as thresholding a point cloud by density. We recast these problems as real-valued functions which are discontinuous but measurable, and then observe that convolving such a function with a suitable function produces a Lipschitz function. The resulting stable function can be estimated by perturbing the input and averaging the output. We illustrate this approach with a number of examples, including a stable localization of a persistent homology generator from brain imaging data.
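The "perturb the input and average the output" idea can be illustrated with a minimal sketch. It assumes a toy one-dimensional step function in place of a persistent homology computation, and Gaussian perturbations as the smoothing kernel. Both choices are illustrative assumptions rather than the construction used in the paper, and the names `step` and `smoothed` are hypothetical.

```python
import random


def step(x):
    """A discontinuous but measurable function: jumps from 0 to 1 at x = 0."""
    return 1.0 if x >= 0.0 else 0.0


def smoothed(f, x, sigma=0.5, n_samples=20000, seed=0):
    """Monte Carlo estimate of the convolution of f with a Gaussian kernel.

    Each sample perturbs the input x by Gaussian noise (standard deviation
    sigma, an assumed kernel width) and the outputs of f are averaged.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += f(x + rng.gauss(0.0, sigma))
    return total / n_samples


if __name__ == "__main__":
    # The raw function jumps by 1 across x = 0; the smoothed estimate
    # changes only slightly over the same small shift of the input.
    print(step(-0.001), step(0.001))
    print(smoothed(step, 0.0), smoothed(step, 0.05))
```

Reusing the same seed for both evaluations means the two estimates share their perturbations, so the difference between them reflects the shift in `x` rather than sampling noise.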
Timely estimation of the current value of the COVID-19 reproduction factor $R$ has become a key aim of efforts to inform management strategies. $R$ is an important metric used by policy-makers in setting mitigation levels and is also important for accurate modelling of epidemic progression. This brief paper introduces a method for estimating $R$ from biased case testing data. Using testing data, rather than hospitalisation or death data, provides a much earlier metric along the symptomatic progression scale. This can be hugely important when fighting the exponential nature of an epidemic. We develop a practical estimator and apply it to Scottish case testing data to infer a current (20 May 2020) $R$ value of $0.74$ with $95\%$ confidence interval $[0.48, 0.86]$.
Comparison between multidimensional persistent Betti numbers is often based on the multidimensional matching distance. While this metric is rather simple to define and compute by considering a suitable family of filtering functions associated with lines having a positive slope, it has two main drawbacks. First, it forgets the natural link between the homological properties of filtrations associated with lines that are close to each other. As a consequence, part of the interesting homological information is lost. Second, its intrinsically discontinuous definition makes it difficult to study its properties. In this paper we introduce a new matching distance for 2D persistent Betti numbers, called the coherent matching distance, which is based on matchings that change coherently with the filtrations we take into account. Its definition is not trivial, as it must face the presence of monodromy in multidimensional persistence, i.e., the fact that different paths in the space parameterizing the above filtrations can induce different matchings between the associated persistence diagrams. We prove that the coherent 2D matching distance is well-defined and stable.
We propose a method, based on persistent homology, to uncover topological properties of a priori unknown covariates of neuron activity. Our input data consist of spike train measurements of a set of neurons of interest, a candidate list of the known stimuli that govern neuron activity, and the corresponding state of the animal throughout the experiment performed. Using a generalized linear model for neuron activity and simple assumptions on the effects of the external stimuli, we infer away any contribution to the observed spike trains by the candidate stimuli. Persistent homology then reveals useful information about any further, unknown, covariates.
Detecting the dimension of a hidden manifold from a point sample has become an important problem in the current data-driven era. Indeed, estimating the shape dimension is often the first step in studying the processes or phenomena associated with the data. Among the many dimension detection algorithms proposed in various fields, only a few can provide theoretical guarantees on the correctness of the estimated dimension. However, correctness usually requires a certain regularity of the input: the input points are either uniformly randomly sampled in a statistical setting, or they form a so-called $(\varepsilon,\delta)$-sample, which can be neither too dense nor too sparse. Here, we propose a purely topological technique to detect dimensions. Our algorithm is provably correct and works under a more relaxed sampling condition: we do not require uniformity, and we also allow Hausdorff noise. Our approach detects dimension by determining local homology. The computation of this topological structure is much less sensitive to the local distribution of points, which leads to the relaxation of the sampling conditions. Furthermore, by leveraging various developments in computational topology, we show that this local homology at a point $z$ can be computed \emph{exactly} for manifolds using Vietoris-Rips complexes whose vertices are confined within a local neighborhood of $z$. We implement our algorithm and demonstrate the accuracy and robustness of our method using both synthetic and real data sets.