
Introducing user-prescribed constraints in Markov chains for nonlinear dimensionality reduction

Published by Purushottam Dixit
Publication date: 2018
Language: English





Stochastic kernel-based dimensionality reduction approaches have become popular in the last decade. The central component of many of these methods is a symmetric kernel that quantifies the proximity between pairs of data points, together with a kernel-induced Markov chain on the data. Typically, the Markov chain is fully specified by the kernel through row normalization. However, in many cases, it is desirable to impose user-specified stationary-state and dynamical constraints on the Markov chain. Unfortunately, no systematic framework exists to impose such user-defined constraints. Here, we introduce an approach based on path entropy maximization to derive the transition probabilities of Markov chains using a kernel and additional user-specified constraints. We illustrate the usefulness of these Markov chains with examples.
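As a rough illustration of the standard construction the abstract refers to (not the constrained method the paper proposes), the sketch below builds a symmetric Gaussian kernel and row-normalizes it into a Markov transition matrix. The function names `gaussian_kernel` and `row_normalized_chain`, the bandwidth parameter `eps`, and the random test data are assumptions made for this example. It also shows why row normalization leaves no freedom: for a symmetric kernel, the stationary distribution is fixed by the kernel's row sums.

```python
import numpy as np


def gaussian_kernel(X, eps):
    """Symmetric Gaussian kernel K_ij = exp(-||x_i - x_j||^2 / eps).

    `eps` is an assumed bandwidth parameter chosen for illustration.
    """
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / eps)


def row_normalized_chain(K):
    """Return P = D^{-1} K and its stationary distribution.

    D is the diagonal matrix of row sums of K. For a symmetric K,
    the stationary distribution is proportional to those row sums,
    so it is determined entirely by the kernel.
    """
    degrees = K.sum(axis=1)
    P = K / degrees[:, None]
    pi = degrees / degrees.sum()
    return P, pi


# Hypothetical data: 50 points in 3 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))

K = gaussian_kernel(X, eps=1.0)
P, pi = row_normalized_chain(K)
```

Here every row of `P` sums to one and `pi @ P` equals `pi`, so the user cannot choose the stationary distribution independently of the kernel. That is the limitation the abstract describes.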


Read also

Manifold learning-based encoders have been playing important roles in nonlinear dimensionality reduction (NLDR) for data exploration. However, existing methods can often fail to preserve geometric, topological and/or distributional structures of data. In this paper, we propose a deep manifold learning framework, called deep manifold transformation (DMT), for unsupervised NLDR and embedding learning. DMT enhances deep neural networks by using cross-layer local geometry-preserving (LGP) constraints. The LGP constraints constitute the loss for deep manifold learning and serve as geometric regularizers for NLDR network training. Extensive experiments on synthetic and real-world data demonstrate that DMT networks outperform existing leading manifold-based NLDR methods in terms of preserving the structures of data.
The movement of large quantities of data during the training of a deep neural network presents immense challenges for machine learning workloads. To minimize this overhead, especially on the movement and calculation of gradient information, we introduce streaming batch principal component analysis as an update algorithm. Streaming batch principal component analysis uses stochastic power iterations to generate a stochastic rank-k approximation of the network gradient. We demonstrate that the low-rank updates produced by streaming batch principal component analysis can effectively train convolutional neural networks on a variety of common datasets, with performance comparable to standard mini-batch gradient descent. These results can lead to improvements both in the design of application-specific integrated circuits for deep learning and in the speed of synchronization of machine learning models trained with data parallelism.
Locality preserving projections (LPP) are a classical dimensionality reduction method based on data graph information. However, LPP is still sensitive to extreme outliers. Because LPP targets vectorial data, it may undermine structural information when applied to multidimensional data. Besides, it assumes the dimension of the data to be smaller than the number of instances, which is not suitable for high-dimensional data. For high-dimensional data analysis, the tensor-train decomposition has been shown to capture spatial relations efficiently and effectively. Thus, we propose a tensor-train parameterization for ultra dimensionality reduction (TTPUDR) in which the traditional LPP mapping is tensorized in terms of tensor-trains and the LPP objective is replaced with the Frobenius norm to increase the robustness of the model. A manifold optimization technique is used to solve the new model. The performance of TTPUDR is assessed on classification problems, where TTPUDR significantly outperforms earlier methods and several state-of-the-art methods.
Recently, a novel family of biologically plausible online algorithms for reducing the dimensionality of streaming data has been derived from the similarity matching principle. In these algorithms, the number of output dimensions can be determined adaptively by thresholding the singular values of the input data matrix. However, setting such a threshold requires knowing the magnitude of the desired singular values in advance. Here we propose online algorithms where the threshold is self-calibrating based on the singular values computed from the existing observations. To derive these algorithms from the similarity matching cost function, we propose novel regularizers. As before, these online algorithms can be implemented by Hebbian/anti-Hebbian neural networks in which the learning rule depends on the chosen regularizer. We demonstrate both mathematically and via simulation the effectiveness of these online algorithms in various settings.
Yanjun Li, Bihan Wen, Hao Cheng (2021)
Low-dimensional embeddings for data from disparate sources play critical roles in multi-modal machine learning, multimedia information retrieval, and bioinformatics. In this paper, we propose a supervised dimensionality reduction method that learns linear embeddings jointly for two feature vectors representing data of different modalities or data from distinct types of entities. We also propose an efficient feature selection method that complements, and can be applied prior to, our joint dimensionality reduction method. Assuming that there exist true linear embeddings for these features, our analysis of the error in the learned linear embeddings provides theoretical guarantees that the dimensionality reduction method accurately estimates the true embeddings when certain technical conditions are satisfied and the number of samples is sufficiently large. The derived sample complexity results are echoed by numerical experiments. We apply the proposed dimensionality reduction method to gene-disease association, and predict unknown associations using kernel regression on the dimension-reduced feature vectors. Our approach compares favorably against other dimensionality reduction methods, and against a state-of-the-art method of bilinear regression for predicting gene-disease associations.
