No Arabic abstract
Representation learning is currently a very hot topic in modern machine learning, mostly due to the great success of the deep learning methods. In particular low-dimensional representation which discriminates classes can not only enhance the classification procedure, but also make it faster, while contrary to the high-dimensional embeddings can be efficiently used for visual based exploratory data analysis. In this paper we propose Maximum Entropy Linear Manifold (MELM), a multidimensional generalization of Multithreshold Entropy Linear Classifier model which is able to find a low-dimensional linear data projection maximizing discriminativeness of projected classes. As a result we obtain a linear embedding which can be used for classification, class aware dimensionality reduction and data visualization. MELM provides highly discriminative 2D projections of the data which can be used as a method for constructing robust classifiers. We provide both empirical evaluation as well as some interesting theoretical properties of our objective function such us scale and affine transformation invariance, connections with PCA and bounding of the expected balanced accuracy error.
As an effective tool for two-dimensional data analysis, two-dimensional canonical correlation analysis (2DCCA) is not only capable of preserving the intrinsic structural information of original two-dimensional (2D) data, but also reduces the computational complexity effectively. However, due to the unsupervised nature, 2DCCA is incapable of extracting sufficient discriminatory representations, resulting in an unsatisfying performance. In this letter, we propose a complete discriminative tensor representation learning (CDTRL) method based on linear correlation analysis for analyzing 2D signals (e.g. images). This letter shows that the introduction of the complete discriminatory tensor representation strategy provides an effective vehicle for revealing, and extracting the discriminant representations across the 2D data sets, leading to improved results. Experimental results show that the proposed CDTRL outperforms state-of-the-art methods on the evaluated data sets.
Graph embedding is a central problem in social network analysis and many other applications, aiming to learn the vector representation for each node. While most existing approaches need to specify the neighborhood and the dependence form to the neighborhood, which may significantly degrades the flexibility of representation, we propose a novel graph node embedding method (namely GESF) via the set function technique. Our method can 1) learn an arbitrary form of representation function from neighborhood, 2) automatically decide the significance of neighbors at different distances, and 3) be applied to heterogeneous graph embedding, which may contain multiple types of nodes. Theoretical guarantee for the representation capability of our method has been proved for general homogeneous and heterogeneous graphs and evaluation results on benchmark data sets show that the proposed GESF outperforms the state-of-the-art approaches on producing node vectors for classification tasks.
Clustering is one of the fundamental problems in unsupervised learning. Recent deep learning based methods focus on learning clustering oriented representations. Among those methods, Variational Deep Embedding achieves great success in various clustering tasks by specifying a Gaussian Mixture prior to the latent space. However, VaDE suffers from two problems: 1) it is fragile to the input noise; 2) it ignores the locality information between the neighboring data points. In this paper, we propose a joint learning framework that improves VaDE with a robust embedding discriminator and a local structure constraint, which are both helpful to improve the robustness of our model. Experiment results on various vision and textual datasets demonstrate that our method outperforms the state-of-the-art baseline models in all metrics. Further detailed analysis shows that our proposed model is very robust to the adversarial inputs, which is a desirable property for practical applications.
In this paper, we present a new class of Markov decision processes (MDPs), called Tsallis MDPs, with Tsallis entropy maximization, which generalizes existing maximum entropy reinforcement learning (RL). A Tsallis MDP provides a unified framework for the original RL problem and RL with various types of entropy, including the well-known standard Shannon-Gibbs (SG) entropy, using an additional real-valued parameter, called an entropic index. By controlling the entropic index, we can generate various types of entropy, including the SG entropy, and a different entropy results in a different class of the optimal policy in Tsallis MDPs. We also provide a full mathematical analysis of Tsallis MDPs, including the optimality condition, performance error bounds, and convergence. Our theoretical result enables us to use any positive entropic index in RL. To handle complex and large-scale problems, we propose a model-free actor-critic RL method using Tsallis entropy maximization. We evaluate the regularization effect of the Tsallis entropy with various values of entropic indices and show that the entropic index controls the exploration tendency of the proposed method. For a different type of RL problems, we find that a different value of the entropic index is desirable. The proposed method is evaluated using the MuJoCo simulator and achieves the state-of-the-art performance.
Mixture models are an expressive hypothesis class that can approximate a rich set of policies. However, using mixture policies in the Maximum Entropy (MaxEnt) framework is not straightforward. The entropy of a mixture model is not equal to the sum of its components, nor does it have a closed-form expression in most cases. Using such policies in MaxEnt algorithms, therefore, requires constructing a tractable approximation of the mixture entropy. In this paper, we derive a simple, low-variance mixture-entropy estimator. We show that it is closely related to the sum of marginal entropies. Equipped with our entropy estimator, we derive an algorithmic variant of Soft Actor-Critic (SAC) to the mixture policy case and evaluate it on a series of continuous control tasks.