Studying the dynamic behavior of cross-sectional ranks for functional data, that is, the ranks of the observed curves at each time point and their evolution over time, can yield valuable insights into the time dynamics of functional data. This approach is of interest in various application areas. For the analysis of rank dynamics, estimation of the cross-sectional ranks of functional data is a first step, and several statistics of interest for ranked functional data are proposed. To quantify the evolution of ranks over time, a model for rank derivatives is introduced, in which rank dynamics are decomposed into two components: one corresponding to population changes and the other to individual changes, both of which affect the rank trajectories of individuals. Joint asymptotic normality of suitable estimates of these two components is established. The proposed approaches are illustrated with simulations and three longitudinal data sets: growth curves from the Zurich Longitudinal Growth Study, monthly house price data in the US from 1996 to 2015, and Major League Baseball offensive data for the 2017 season.
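As a rough illustration of the first step described above, the sketch below computes cross-sectional ranks on a common observation grid and a crude finite-difference proxy for rank derivatives. It is a minimal Python example assuming densely and regularly observed curves stored as an n-by-T array; all names and toy data are hypothetical and not part of the cited study.

```python
import numpy as np

def cross_sectional_ranks(X):
    """Rank each curve within the sample at every time point.

    X : (n, T) array of n curves observed on a common grid of T time points.
    Returns an (n, T) array of normalized ranks in (0, 1]; at each time point,
    a normalized rank of 1 corresponds to the curve with the largest value.
    Ties are broken arbitrarily by the sort order.
    """
    n = X.shape[0]
    # double argsort along the curve axis yields 0-based ranks per column
    ranks = X.argsort(axis=0).argsort(axis=0) + 1
    return ranks / n

# Hypothetical example: 5 noisy growth-like curves on 50 time points
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 50)
X = np.outer(rng.uniform(0.5, 1.5, 5), t) + 0.05 * rng.standard_normal((5, 50))
R = cross_sectional_ranks(X)      # rank trajectories of the 5 curves
dR = np.gradient(R, t, axis=1)    # crude finite-difference proxy for rank derivatives
```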
We propose a low-rank regression model with possibly high-dimensional multivariate functional responses and scalar covariates. By expanding the slope functions on a set of sieve basis functions, we collect the basis coefficients into a matrix. To estimate these coefficients, we propose an efficient procedure using nuclear norm regularization. We also derive error bounds for our estimates and evaluate our method using simulations. We further apply our method to Human Connectome Project neuroimaging data, predicting cortical-surface motor task-evoked functional magnetic resonance imaging signals from various clinical covariates, to illustrate the usefulness of our results.
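The following is a minimal, hedged sketch of one way such a nuclear-norm-regularized estimate could be computed once the functional responses have been expanded on a fixed basis, using proximal gradient descent with singular value thresholding; it is not the authors' implementation, and the function names, step size, and iteration count are assumptions.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def nuclear_norm_regression(X, Y, lam, n_iter=500):
    """Minimize 0.5*||Y - X B||_F^2 + lam*||B||_* by proximal gradient descent.

    X : (n, p) scalar covariates; Y : (n, K) basis coefficients of the
    functional responses (one row per subject). Returns B of shape (p, K).
    """
    p = X.shape[1]
    B = np.zeros((p, Y.shape[1]))
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1/L with L the largest eigenvalue of X'X
    for _ in range(n_iter):
        grad = X.T @ (X @ B - Y)             # gradient of the squared-error term
        B = svt(B - step * grad, step * lam) # proximal step shrinks singular values
    return B
```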
A novel approach to unsupervised sequential learning for functional data is proposed. Our goal is to extract reference shapes (referred to as templates) from noisy, deformed and censored realizations of curves and images. Our model generalizes the Bayesian dense deformable template model (Allassonnière et al., 2007), a hierarchical model in which the template is the function to be estimated and the deformation is a nuisance, assumed to be random with a known prior distribution. The templates are estimated using a Monte Carlo version of the online Expectation-Maximization algorithm, extending the work of Cappé and Moulines (2009). Our sequential inference framework is significantly more computationally efficient than equivalent batch learning algorithms, especially when the missing data are high-dimensional. Numerical illustrations of curve registration and template extraction from images are provided to support our findings.
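For intuition only, the sketch below runs an online EM recursion in the spirit of Cappé and Moulines (2009) on a toy circular-shift template model with Gaussian noise. The latent-shift posterior is computed exactly on a small grid, so no Monte Carlo approximation is needed in this simplified setting; the model, function names, and step-size schedule are assumptions and do not reproduce the paper's dense deformable template model.

```python
import numpy as np

def online_em_template(stream, grid_len, shifts, sigma, gamma0=1.0, kappa=0.6):
    """Online EM sketch for a toy shift-deformation template model.

    Each observation y is the template circularly shifted by a latent integer
    shift (uniform prior over `shifts`) plus Gaussian noise with std `sigma`.
    The running sufficient statistic is the expected shift-corrected curve,
    and the M-step simply sets the template equal to it.
    """
    s = np.zeros(grid_len)            # running averaged sufficient statistic
    template = np.zeros(grid_len)
    for t, y in enumerate(stream, start=1):
        # E-step: posterior over the latent shift, exact on this small grid
        loglik = np.array([-0.5 * np.sum((y - np.roll(template, d)) ** 2) / sigma ** 2
                           for d in shifts])
        w = np.exp(loglik - loglik.max())
        w /= w.sum()
        # expected shift-corrected observation under the posterior
        s_hat = sum(wd * np.roll(y, -d) for wd, d in zip(w, shifts))
        # stochastic approximation update of the sufficient statistic
        gamma = gamma0 / t ** kappa
        s = (1 - gamma) * s + gamma * s_hat
        # M-step: the template is the current averaged statistic
        template = s
    return template
```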
We propose a nonparametric method to explicitly model and represent the derivatives of smooth underlying trajectories for longitudinal data. This representation is based on a direct Karhunen–Loève expansion of the unobserved derivatives and leads to the notion of derivative principal component analysis, which complements functional principal component analysis, one of the most popular tools of functional data analysis. The proposed derivative principal component scores can be obtained for irregularly spaced and sparsely observed longitudinal data, as typically encountered in biomedical studies, as well as for functional data which are densely measured. Novel consistency results and asymptotic convergence rates for the proposed estimates of the derivative principal component scores and other components of the model are derived under a unified scheme for sparse or dense observations and mild conditions. We compare the proposed representations for derivatives with alternative approaches in simulation settings and also in a wallaby growth curve application. It emerges that representations using the proposed derivative principal component analysis recover the underlying derivatives more accurately than principal component analysis-based approaches, especially in settings where the functional data are represented with only a very small number of components or are densely sampled. In a second wheat spectra classification example, derivative principal component scores were found to be more predictive for the protein content of wheat than the conventional functional principal component scores.
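As a simplified illustration for the densely observed case only, the sketch below estimates derivatives by finite differences and approximates the Karhunen–Loève expansion of the derivative process by an eigendecomposition of their sample covariance. The actual method targets the unobserved derivatives directly and also covers sparse designs, so this code and its names should be read as assumptions, not the proposed estimator.

```python
import numpy as np

def derivative_pc_scores(X, t, n_comp=3):
    """Toy derivative PCA for densely observed curves on a uniform grid.

    X : (n, T) curves observed on a common grid t (assumed equally spaced).
    Returns (scores, eigenfunctions, mean_derivative), where scores has
    shape (n, n_comp) and eigenfunctions has shape (T, n_comp).
    """
    dX = np.gradient(X, t, axis=1)            # crude derivative estimates
    mu = dX.mean(axis=0)                      # mean derivative function
    C = np.cov(dX, rowvar=False)              # (T, T) sample covariance of derivatives
    evals, evecs = np.linalg.eigh(C)
    order = np.argsort(evals)[::-1][:n_comp]  # leading eigendirections
    dt = t[1] - t[0]                          # uniform spacing assumed
    phi = evecs[:, order] / np.sqrt(dt)       # eigenfunctions with unit L2 norm
    scores = (dX - mu) @ phi * dt             # Riemann-sum inner products
    return scores, phi, mu
```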
We propose an alternative to $k$-nearest neighbors for functional data whereby the approximating neighboring curves are piecewise functions built from a functional sample. Using a locally defined distance function that satisfies stabilization criteria, we establish pointwise and global approximation results in function spaces when the number of data curves is large enough. We exploit this feature to develop the asymptotic theory when a finite number of curves is observed at time points given by an i.i.d. sample whose cardinality increases to infinity. We use these results to investigate the problem of estimating unobserved segments of a partially observed functional data sample, as well as the problems of functional classification and outlier detection. For such problems, our methods are competitive with, and sometimes superior to, benchmark predictions in the field.
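As a loose, hedged interpretation of the idea of piecewise neighboring curves under a locally defined distance, the sketch below reconstructs a partially observed curve window by window from the sample curve closest to it on each window. The actual construction and stabilization criteria in the paper differ; all names here are hypothetical.

```python
import numpy as np

def piecewise_neighbor(target, sample, observed, n_windows=10):
    """Piecewise neighbor approximant of a partially observed curve.

    target   : (T,) partially observed curve (values may be arbitrary where unobserved).
    sample   : (n, T) fully observed curves on the same grid.
    observed : (T,) boolean mask marking the observed points of `target`.
    The grid is split into windows; on each window the sample curve closest to
    `target` over that window's observed points supplies the values, yielding a
    piecewise reconstruction of the whole curve, including unobserved segments.
    """
    T = target.shape[0]
    approx = np.empty(T)
    for w in np.array_split(np.arange(T), n_windows):
        obs = w[observed[w]]
        if obs.size:    # locally defined distance uses only observed points in the window
            d = np.sum((sample[:, obs] - target[obs]) ** 2, axis=1)
        else:           # fully unobserved window: fall back to a global distance
            d = np.sum((sample[:, observed] - target[observed]) ** 2, axis=1)
        approx[w] = sample[np.argmin(d), w]
    return approx
```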
We propose a nested reduced-rank regression (NRRR) approach for fitting regression models with multivariate functional responses and predictors, to achieve tailored dimension reduction and facilitate the interpretation and visualization of the resulting functional model. Our approach is based on a two-level low-rank structure imposed on the functional regression surfaces. A global low-rank structure identifies a small set of latent principal functional responses and predictors that drive the underlying regression association. A local low-rank structure then controls the complexity and smoothness of the association between the principal functional responses and predictors. Through a basis expansion approach, the functional problem boils down to an interesting integrated matrix approximation task, in which the blocks or submatrices of an integrated low-rank matrix share a common row space and/or column space. An iterative algorithm with a convergence guarantee is developed. We establish the consistency of NRRR and also show through non-asymptotic analysis that it can achieve at least a comparable error rate to that of reduced-rank regression. Simulation studies demonstrate the effectiveness of NRRR. We apply NRRR to an electricity demand problem, relating the trajectories of daily electricity consumption to those of daily temperatures.
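Since NRRR is benchmarked against reduced-rank regression, the sketch below implements the classical reduced-rank regression comparator on basis coefficients obtained after expanding responses and predictors on fixed bases. It does not implement the nested two-level structure of NRRR, and the function names are assumptions.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    """Classical reduced-rank regression (the comparator mentioned above).

    X : (n, p) predictor basis coefficients; Y : (n, q) response basis
    coefficients, obtained after expanding both sets of curves on fixed bases.
    Returns a (p, q) coefficient matrix of the given rank.
    """
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)   # full-rank least squares fit
    fitted = X @ B_ols
    # project the OLS fit onto the top right singular directions of the fitted values
    _, _, Vt = np.linalg.svd(fitted, full_matrices=False)
    P = Vt[:rank].T @ Vt[:rank]                     # (q, q) rank-constrained projector
    return B_ols @ P
```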