Deep Recursive Embedding for High-Dimensional Data


Abstract in English

t-distributed stochastic neighbor embedding (t-SNE) is a well-established visualization method for complex high-dimensional data. However, the original t-SNE method is nonparametric, stochastic, and often cannot well prevserve the global structure of data as it emphasizes local neighborhood. With t-SNE as a reference, we propose to combine the deep neural network (DNN) with the mathematical-grounded embedding rules for high-dimensional data embedding. We first introduce a deep embedding network (DEN) framework, which can learn a parametric mapping from high-dimensional space to low-dimensional embedding. DEN has a flexible architecture that can accommodate different input data (vector, image, or tensor) and loss functions. To improve the embedding performance, a recursive training strategy is proposed to make use of the latent representations extracted by DEN. Finally, we propose a two-stage loss function combining the advantages of two popular embedding methods, namely, t-SNE and uniform manifold approximation and projection (UMAP), for optimal visualization effect. We name the proposed method Deep Recursive Embedding (DRE), which optimizes DEN with a recursive training strategy and two-stage losse. Our experiments demonstrated the excellent performance of the proposed DRE method on high-dimensional data embedding, across a variety of public databases. Remarkably, our comparative results suggested that our proposed DRE could lead to improved global structure preservation.

Download