Multidimensional data distributions can have complex topologies and variable local dimensions. To approximate complex data, we propose a new type of low-dimensional "principal object": a principal cubic complex. This complex is a generalization of linear and non-linear principal manifolds and includes them as particular cases. To construct such an object, we combine the method of topological grammars with the minimization of an elastic energy defined for its embedding into multidimensional data space. The whole complex is presented as a system of nodes and springs and as a product of one-dimensional continua (represented by graphs), and the grammars describe how these continua transform during the process of optimal complex construction. The simplest case of a topological grammar ("add a node", "bisect an edge") is equivalent to the construction of "principal trees", an object useful in many practical applications. We demonstrate how it can be applied to the analysis of bacterial genomes and to the visualization of cDNA microarray data using the "metro map" representation. The preprint is supplemented by an animation: "How the topological grammar constructs branching principal components" (AnimatedBranchingPCA.gif).
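The two rules of the simplest grammar, "add a node" and "bisect an edge", can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the preprint: the function names, the `stretch` coefficient, the simplified energy (mean squared distance from each data point to its nearest node plus a spring term over edges, with no bending term), and the greedy choice of placing new leaves at data points. A full implementation would also re-optimize node positions after each growth step, which this sketch omits.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def add_node(nodes, edges, parent, position):
    """Grammar rule 'add a node': attach a new leaf at `position` to `parent`."""
    new_id = max(nodes) + 1
    new_nodes = dict(nodes)
    new_nodes[new_id] = position
    return new_nodes, edges | {(parent, new_id)}


def bisect_edge(nodes, edges, edge):
    """Grammar rule 'bisect an edge': replace an edge by two edges through its midpoint."""
    a, b = edge
    new_id = max(nodes) + 1
    new_nodes = dict(nodes)
    new_nodes[new_id] = tuple((x + y) / 2 for x, y in zip(nodes[a], nodes[b]))
    return new_nodes, (edges - {edge}) | {(a, new_id), (new_id, b)}


def elastic_energy(nodes, edges, data, stretch=0.01):
    """Simplified energy (assumed form): approximation error plus edge stretching."""
    approximation = sum(
        min(sq_dist(x, y) for y in nodes.values()) for x in data
    ) / len(data)
    stretching = stretch * sum(sq_dist(nodes[a], nodes[b]) for a, b in edges)
    return approximation + stretching


def grow_once(nodes, edges, data, stretch=0.01):
    """Try every rule application and keep the candidate with the lowest energy."""
    candidates = [bisect_edge(nodes, edges, e) for e in edges]
    for x in data:
        # Hypothetical placement heuristic: new leaf at a data point,
        # attached to the node currently nearest to that point.
        nearest = min(nodes, key=lambda i: sq_dist(nodes[i], x))
        candidates.append(add_node(nodes, edges, nearest, x))
    return min(candidates, key=lambda c: elastic_energy(c[0], c[1], data, stretch))


# Usage: start from a single edge and grow the tree by one step.
data = [(0.0, 0.0), (2.0, 0.0), (1.0, 1.0)]
nodes = {0: (0.0, 0.0), 1: (2.0, 0.0)}
edges = frozenset({(0, 1)})
grown_nodes, grown_edges = grow_once(nodes, edges, data)
```

Repeating `grow_once` until a node budget is reached yields a branching tree; because both rules keep the graph connected and acyclic, the result stays a tree at every step.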
Principal manifolds are defined as lines or surfaces passing through "the middle" of the data distribution. Linear principal manifolds (Principal Components Analysis) are routinely used for dimension reduction, noise filtering and data visualization. Rec
We provide a short introduction to the field of topological data analysis and discuss its possible relevance for the study of complex systems. Topological data analysis provides a set of tools to characterise the shape of data, in terms of the presen
We analyze the connectivity structure of weighted brain networks extracted from spontaneous magnetoencephalographic (MEG) signals of healthy subjects and epileptic patients (suffering from absence seizures) recorded at rest. We find that, for the act
Life and language are discrete combinatorial systems (DCSs) in which the basic building blocks are finite sets of elementary units: nucleotides or codons in a DNA sequence and letters or words in a language. Different combinations of these finite uni
An algorithm for optimization of signal significance or any other classification figure of merit suited for analysis of high energy physics (HEP) data is described. This algorithm trains decision trees on many bootstrap replicas of training data with