Fast DD-classification of functional data

Added by Pavlo Mozharovskyi

Publication date 2014

fields Mathematical Statistics

Language English

Authors Karl Mosler - Pavlo Mozharovskyi

Methodology

A fast nonparametric procedure for classifying functional data is introduced. It consists of a two-step transformation of the original data plus a classifier operating on a low-dimensional hypercube. The functional data are first mapped into a finite-dimensional location-slope space and then transformed by a multivariate depth function into the $DD$-plot, which is a subset of the unit hypercube. This transformation yields a new notion of depth for functional data. Three alternative depth functions are employed for this, as well as two rules for the final classification on $[0,1]^q$. The resulting classifier needs to be cross-validated only over a small range of parameters, which is restricted by a Vapnik-Chervonenkis bound. The entire methodology does not involve smoothing techniques, is completely nonparametric, and can achieve Bayes optimality under standard distributional settings. It is robust, efficiently computable, and has been implemented in an R environment. Applicability of the new approach is demonstrated by simulations as well as a benchmark study.
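The two-step transformation and the final classification described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' R implementation. Several choices are assumptions made only for illustration: the location-slope map averages values and finite-difference slopes over equal sub-intervals, the Mahalanobis depth stands in for the depth function (the abstract does not name the three depths used), a plain $k$NN rule is applied on the $DD$-plot, and the data are simulated. All function names (`ls_transform`, `mahalanobis_depth`, `dd_plot`, `knn_predict`) are hypothetical.

```python
import numpy as np

def ls_transform(curves, t, L=1, S=1):
    """Map curves (n x m, sampled on grid t) to R^(L+S), L + S >= 1:
    L sub-interval averages of the values, S sub-interval averages of the slopes."""
    slopes = np.gradient(curves, t, axis=1)  # finite-difference slopes (assumption)
    loc = [c.mean(axis=1) for c in np.array_split(curves, L, axis=1)] if L > 0 else []
    slo = [s.mean(axis=1) for s in np.array_split(slopes, S, axis=1)] if S > 0 else []
    return np.column_stack(loc + slo)

def mahalanobis_depth(points, sample):
    """Mahalanobis depth 1 / (1 + squared Mahalanobis distance) w.r.t. a sample."""
    mu = sample.mean(axis=0)
    cov = np.atleast_2d(np.cov(sample, rowvar=False))
    diff = points - mu
    d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.pinv(cov), diff)
    return 1.0 / (1.0 + d2)

def dd_plot(points, class_samples):
    """Depth of each point w.r.t. each of the q classes: a point in [0, 1]^q."""
    return np.column_stack([mahalanobis_depth(points, s) for s in class_samples])

def knn_predict(train_dd, train_labels, test_dd, k=5):
    """Majority vote among the k nearest training points in the DD-plot."""
    dist = np.linalg.norm(test_dd[:, None, :] - train_dd[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    return np.array([np.bincount(v).argmax() for v in train_labels[nearest]])

# Simulated data (hypothetical): noisy lines with a random level and class-specific slope.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)

def simulate(n, slope):
    level = rng.normal(scale=0.3, size=(n, 1))
    noise = rng.normal(scale=0.1, size=(n, t.size))
    return slope * t + level + noise

train = [ls_transform(simulate(40, s), t) for s in (0.0, 3.0)]
train_labels = np.repeat([0, 1], 40)
test_x = np.vstack([ls_transform(simulate(20, s), t) for s in (0.0, 3.0)])
test_labels = np.repeat([0, 1], 20)

dd_train = dd_plot(np.vstack(train), train)   # step 2: depth transform
dd_test = dd_plot(test_x, train)
pred = knn_predict(dd_train, train_labels, dd_test, k=5)
accuracy = float((pred == test_labels).mean())
print(f"test accuracy: {accuracy:.2f}")
```

In this sketch the only tuning parameters are the numbers of location and slope intervals (`L`, `S`) and `k`, which loosely mirrors the small parameter range the abstract says has to be cross-validated.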

The DD$^G$-classifier in the functional setting

315 - Juan A. Cuesta-Albertos, Manuel Febrero-Bande, Manuel Oviedo de la Fuente 2015

The Maximum Depth classifier was the first attempt to use data depths instead of multivariate raw data to construct a classification rule. Recently, the DD-classifier has overcome several serious limitations of the Maximum Depth classifier, but some issues still remain. This paper is devoted to extending the DD-classifier in the following ways: first, to overcome the limitation of the DD-classifier when more than two groups are involved; second, to apply regular classification methods (such as $k$NN, linear or quadratic classifiers, recursive partitioning, ...) to DD-plots in order to obtain useful insights through the diagnostics of these methods; and third, to integrate different sources of information (data depths or multivariate functional data) in a unified way in the classification procedure. In addition, because the DD-classifier trick is especially useful in the functional framework, the paper also presents an enhanced revision of several functional data depths. A simulation study and applications to some classical real datasets are also provided, showing the power of the new proposal.

Methodology Applications

Classification of Functional Data with k-Nearest-Neighbor Ensembles by Fitting Constrained Multinomial Logit Models

121 - Karen Fuchs 2016

During the last decades, many methods for the analysis of functional data, including classification methods, have been developed. Nonetheless, there are issues that have not been addressed satisfactorily by currently available methods, for example feature selection combined with variable selection when using multiple functional covariates. In this paper, a functional ensemble is combined with a penalized and constrained multinomial logit model. It is shown that this synthesis yields a powerful classification tool for functional data (possibly mixed with non-functional predictors), which also provides automatic variable selection. The choice of an appropriate, sparsity-inducing penalty allows most model coefficients to be estimated as exactly zero and permits class-specific coefficients in multiclass problems, so that feature selection is obtained. An additional constraint within the multinomial logit model ensures that the model coefficients can be considered as weights. Thus, the estimation results become interpretable with respect to the discriminative importance of the selected features, which is rated by a feature importance measure. In two application examples, data from a cell chip used for water quality monitoring experiments and phoneme data used for speech recognition, the interpretability as well as the selection results are examined. The classification performance is compared to various other classification approaches that are in common use.

Methodology Applications

Online EM for Functional Data

390 - Florian Maire, Eric Moulines, Sidonie Lefebvre 2016

A novel approach to unsupervised sequential learning for functional data is proposed. Our goal is to extract reference shapes (referred to as templates) from noisy, deformed and censored realizations of curves and images. Our model generalizes the Bayesian dense deformable template model (Allassonnière et al., 2007), a hierarchical model in which the template is the function to be estimated and the deformation is a nuisance, assumed to be random with a known prior distribution. The templates are estimated using a Monte Carlo version of the online Expectation-Maximization algorithm, extending the work of Cappé and Moulines (2009). Our sequential inference framework is significantly more computationally efficient than equivalent batch learning algorithms, especially when the missing data is high-dimensional. Some numerical illustrations on a curve registration problem and on template extraction from images are provided to support our findings.

Methodology

Rank Dynamics for Functional Data

159 - Yaqing Chen, Matthew Dawson, Hans-Georg Müller 2018

Studying the dynamic behavior of cross-sectional ranks over time, that is, the ranks of the observed curves at each time point and their temporal evolution, can yield valuable insights into the time dynamics of functional data. This approach is of interest in various application areas. For the analysis of the dynamics of ranks, estimation of the cross-sectional ranks of functional data is a first step. Several statistics of interest for ranked functional data are proposed. To quantify the evolution of ranks over time, a model for rank derivatives is introduced, where rank dynamics are decomposed into two components. One component corresponds to population changes and the other to individual changes, both of which affect the rank trajectories of individuals. The joint asymptotic normality for suitable estimates of these two components is established. The proposed approaches are illustrated with simulations and three longitudinal data sets: growth curves obtained from the Zurich Longitudinal Growth Study, monthly house price data in the US from 1996 to 2015, and Major League Baseball offensive data for the 2017 season.
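The first step named in the abstract, estimating cross-sectional ranks, can be sketched as follows. This is a simple empirical version assumed here for illustration (the rank of each curve among all curves at each time point, divided by the number of curves), not necessarily the estimator used in the paper. The function name `cross_sectional_ranks` and the toy data are hypothetical.

```python
import numpy as np

def cross_sectional_ranks(curves):
    """curves: n x m array (n curves observed at m common time points).
    Returns an n x m array of normalized ranks in (0, 1]: at each time point,
    the rank of each curve among the n curves divided by n.
    Ties are broken by sort order (an assumption of this sketch)."""
    n = curves.shape[0]
    # Double argsort along the curve axis turns values into 0-based ranks.
    ranks = np.argsort(np.argsort(curves, axis=0), axis=0) + 1
    return ranks / n

# Toy data: three curves observed at three time points.
curves = np.array([[1.0, 5.0, 2.0],
                   [3.0, 4.0, 0.0],
                   [2.0, 6.0, 1.0]])
ranks = cross_sectional_ranks(curves)
print(ranks)
```

Each row of `ranks` is then a rank trajectory, the object whose evolution over time the abstract proposes to model.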

Methodology

General notions of depth for functional data

316 - Karl Mosler, Yulia Polyakova 2012

A data depth measures the centrality of a point with respect to an empirical distribution. Postulates are formulated that a depth for functional data should satisfy, and a general approach is proposed for constructing multivariate data depths in Banach spaces. The new approach, referred to as $\Phi$-depth, is based on depth infima over a proper set $\Phi$ of $\mathbb{R}^d$-valued linear functions. Several desirable properties are established for the $\Phi$-depth and a generalized version of it. The general notions include many new depths as special cases. In particular, a location-slope depth and a principal component depth are introduced.
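Read literally, the "depth infima" construction over the set Phi (written $\Phi$ here) might be expressed as below. The notation is assumed for illustration and is not taken from the abstract: $x$ is a function in the Banach space, $X = \{x^1, \dots, x^n\}$ is the sample of functions, and $D^d$ is some multivariate depth on $\mathbb{R}^d$.

$$
D(x \mid X) \;=\; \inf_{\varphi \in \Phi} \, D^d\bigl(\varphi(x) \,\big|\, \varphi(x^1), \dots, \varphi(x^n)\bigr)
$$

Under this reading, a function is deep only if every image $\varphi(x)$ is deep in the corresponding $d$-variate image of the sample, and different choices of $\Phi$ give different functional depths.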

Methodology