ﻻ يوجد ملخص باللغة العربية
Regression mixture models are widely studied in statistics, machine learning and data analysis. Fitting regression mixtures is challenging and is usually performed by maximum likelihood by using the expectation-maximization (EM) algorithm. However, it is well-known that the initialization is crucial for EM. If the initialization is inappropriately performed, the EM algorithm may lead to unsatisfactory results. The EM algorithm also requires the number of clusters to be given a priori; the problem of selecting the number of mixture components requires using model selection criteria to choose one from a set of pre-estimated candidate models. We propose a new fully unsupervised algorithm to learn regression mixture models with unknown number of components. The developed unsupervised learning approach consists in a penalized maximum likelihood estimation carried out by a robust expectation-maximization (EM) algorithm for fitting polynomial, spline and B-spline regressions mixtures. The proposed learning approach is fully unsupervised: 1) it simultaneously infers the model parameters and the optimal number of the regression mixture components from the data as the learning proceeds, rather than in a two-fold scheme as in standard model-based clustering using afterward model selection criteria, and 2) it does not require accurate initialization unlike the standard EM for regression mixtures. The developed approach is applied to curve clustering problems. Numerical experiments on simulated data show that the proposed robust EM algorithm performs well and provides accurate results in terms of robustness with regard initialization and retrieving the optimal partition with the actual number of clusters. An application to real data in the framework of functional data clustering, confirms the benefit of the proposed approach for practical applications.
We consider an equivariant approach imposing data-driven bounds for the variances to avoid singular and spurious solutions in maximum likelihood (ML) estimation of clusterwise linear regression models. We investigate its use in the choice of the numb
In this paper we extend the Weibull power series (WPS) class of distributions and named this new class as extended Weibull power series (EWPS) class of distributions. The EWPS distributions are related to series and parallel systems with a random num
We introduce a new approach to a linear-circular regression problem that relates multiple linear predictors to a circular response. We follow a modeling approach of a wrapped normal distribution that describes angular variables and angular distributi
The aim of this paper is to present a mixture composite regression model for claim severity modelling. Claim severity modelling poses several challenges such as multimodality, heavy-tailedness and systematic effects in data. We tackle this modelling
We consider the modeling of data generated by a latent continuous-time Markov jump process with a state space of finite but unknown dimensions. Typically in such models, the number of states has to be pre-specified, and Bayesian inference for a fixed