ﻻ يوجد ملخص باللغة العربية
Mixture of Experts (MoE) is a popular framework for modeling heterogeneity in data for regression, classification, and clustering. For regression and cluster analyses of continuous data, MoE usually use normal experts following the Gaussian distribution. However, for a set of data containing a group or groups of observations with heavy tails or atypical observations, the use of normal experts is unsuitable and can unduly affect the fit of the MoE model. We introduce a robust MoE modeling using the $t$ distribution. The proposed $t$ MoE (TMoE) deals with these issues regarding heavy-tailed and noisy data. We develop a dedicated expectation-maximization (EM) algorithm to estimate the parameters of the proposed model by monotonically maximizing the observed data log-likelihood. We describe how the presented model can be used in prediction and in model-based clustering of regression data. The proposed model is validated on numerical experiments carried out on simulated data, which show the effectiveness and the robustness of the proposed model in terms of modeling non-linear regression functions as well as in model-based clustering. Then, it is applied to the real-world data of tone perception for musical data analysis, and the one of temperature anomalies for the analysis of climate change data. The obtained results show the usefulness of the TMoE model for practical applications.
Mixture of Experts (MoE) is a popular framework in the fields of statistics and machine learning for modeling heterogeneity in data for regression, classification and clustering. MoE for continuous data are usually based on the normal distribution. H
Models based on multivariate t distributions are widely applied to analyze data with heavy tails. However, all the marginal distributions of the multivariate t distributions are restricted to have the same degrees of freedom, making these models unab
Sparsely-gated Mixture of Experts networks (MoEs) have demonstrated excellent scalability in Natural Language Processing. In Computer Vision, however, almost all performant networks are dense, that is, every input is processed by every parameter. We
Mixture of Experts (MoE) is a popular framework for modeling heterogeneity in data for regression, classification and clustering. For continuous data which we consider here in the context of regression and cluster analysis, MoE usually use normal exp
In many industrial applications like online advertising and recommendation systems, diverse and accurate user profiles can greatly help improve personalization. For building user profiles, deep learning is widely used to mine expressive tags to descr