We consider averaging a number of candidate models to produce a prediction of lower risk in the context of partially linear functional additive models. These models incorporate the parametric effect of scalar variables and the additive effect of a functional variable to describe the relationship between a response variable and regressors. We develop a model averaging scheme that assigns the weights by minimizing a cross-validation criterion. Under a framework of model misspecification, the resulting estimator is proved to be asymptotically optimal in the sense of achieving the lowest possible squared error loss for prediction. Simulation studies and a real data analysis further demonstrate the good performance of the proposed method.
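As a point of reference (generic notation, not the paper's own), cross-validation weight choice in model averaging typically takes the following form: given $M$ candidate fits $\hat\mu^{(1)},\dots,\hat\mu^{(M)}$, the averaged predictor and the selected weight vector are
$$
\hat\mu(w)=\sum_{m=1}^{M} w_m\,\hat\mu^{(m)},\qquad
\hat w=\operatorname*{arg\,min}_{w\in\mathcal W}\;\frac{1}{n}\sum_{i=1}^{n}\Big(y_i-\sum_{m=1}^{M} w_m\,\tilde\mu^{(m)}_{-i}(x_i)\Big)^{2},
$$
where $\mathcal W=\{w\in[0,1]^{M}:\sum_{m=1}^{M}w_m=1\}$ and $\tilde\mu^{(m)}_{-i}$ denotes the $m$-th candidate model fitted with the $i$-th observation left out.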
This paper is concerned with model averaging estimation for partially linear functional score models. These models predict a scalar response using both the parametric effect of scalar predictors and the non-parametric effect of a functional predictor. Within this context, we develop a Mallows-type criterion for choosing weights. The resulting model averaging estimator is proved to be asymptotically optimal under certain regularity conditions in the sense of achieving the smallest possible squared error loss. Simulation studies demonstrate its superiority or comparability to model selection and model averaging estimators based on information criterion scores. The proposed procedure is also applied to two real data sets for illustration. The fact that the components of the nonparametric part are unobservable leads to a more complicated situation than in ordinary partially linear models (PLM) and requires a theoretical derivation different from that for PLM.
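For orientation, the classical Mallows criterion for weight choice, of which the criterion above is a functional-predictor variant, can be written (with candidate hat matrices $P_1,\dots,P_M$ and error variance $\sigma^2$; this notation is not taken from the paper) as
$$
C(w)=\big\|y-P(w)y\big\|^{2}+2\sigma^{2}\operatorname{tr}\!\big(P(w)\big),\qquad
P(w)=\sum_{m=1}^{M} w_m P_m,
$$
and the weights are obtained by minimizing $C(w)$ over the unit simplex $\mathcal W=\{w\in[0,1]^{M}:\sum_{m}w_m=1\}$.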
Partially linear additive models generalize linear models by assuming that some covariates have a linear relation with the response while each of the others enters through an unknown univariate smooth function. The harmful effect of outliers, either in the residuals or in the covariates involved in the linear component, has been described in the situation of partially linear models, that is, when only one nonparametric component is involved in the model. When dealing with additive components, the problem of providing reliable estimators in the presence of atypical data is of practical importance, motivating the need for robust procedures. Hence, we propose a family of robust estimators for partially linear additive models by combining $B$-splines with robust linear regression estimators. We obtain consistency results, rates of convergence and asymptotic normality for the linear components under mild assumptions. A Monte Carlo study is carried out to compare the performance of the robust proposal with its classical counterpart under different models and contamination schemes. The numerical experiments show the advantage of the proposed methodology for finite samples. We also illustrate the usefulness of the proposed approach on a real data set.
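The estimation recipe, a $B$-spline expansion of the additive components combined with a robust linear fit, can be sketched in Python as follows. This is an illustrative sketch, not the authors' implementation: the Huber loss, the patsy/statsmodels tooling and all tuning choices are our assumptions.

```python
# Sketch: robust fit of a partially linear additive model
#   y = x'beta + f1(t1) + f2(t2) + error
# via B-spline expansion of f1, f2 and an M-type (Huber) linear fit.
import numpy as np
import statsmodels.api as sm
from patsy import dmatrix

rng = np.random.default_rng(0)
n = 300
x = rng.normal(size=(n, 2))                       # covariates entering linearly
t1, t2 = rng.uniform(size=n), rng.uniform(size=n)  # covariates entering additively
y = x @ np.array([1.0, -0.5]) + np.sin(2 * np.pi * t1) + t2**2 \
    + rng.standard_t(3, n)                        # heavy-tailed errors

# B-spline bases (no intercept column) for the two additive components
B1 = np.asarray(dmatrix("bs(t, df=6, degree=3) - 1", {"t": t1}))
B2 = np.asarray(dmatrix("bs(t, df=6, degree=3) - 1", {"t": t2}))
X = np.column_stack([np.ones(n), x, B1, B2])

# Robust M-estimation with Huber's loss in place of least squares
fit = sm.RLM(y, X, M=sm.robust.norms.HuberT()).fit()
beta_hat = fit.params[1:3]                        # estimated linear coefficients
print(beta_hat)
```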
This paper considers the problem of variable selection in regression models in the case of functional variables that may be mixed with other types of variables (scalar, multivariate, directional, etc.). Our proposal begins with a simple null model and sequentially selects a new variable to be incorporated into the model based on the distance correlation proposed by \cite{Szekely2007}. For the sake of simplicity, this paper only uses additive models. However, the proposed algorithm may assess the type of contribution (linear, nonlinear, etc.) of each variable. The algorithm has shown quite promising results when applied to simulations and real data sets.
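A minimal sketch of such a distance-correlation-driven forward selection loop is given below; the stopping rule, the residual update and the use of the `dcor` package are illustrative assumptions rather than the paper's exact algorithm.

```python
# Sketch: forward variable selection driven by distance correlation.
import numpy as np
import dcor

def forward_dcor_selection(y, candidates, threshold=0.1):
    """candidates: dict name -> (n, p_j) array; scalar, multivariate or
    discretized functional covariates are handled alike, since distance
    correlation only needs pairwise Euclidean distances."""
    selected, residuals = [], y.astype(float).copy()
    remaining = dict(candidates)
    while remaining:
        # distance correlation of each unused covariate with current residuals
        scores = {name: dcor.distance_correlation(residuals, X)
                  for name, X in remaining.items()}
        best = max(scores, key=scores.get)
        if scores[best] < threshold:      # no sufficiently relevant variable left
            break
        selected.append(best)
        X = remaining.pop(best)
        # crude residual update via a linear fit; the paper would instead refit
        # an additive model with the chosen type of contribution
        Z = np.column_stack([np.ones(len(y)), X])
        coef, *_ = np.linalg.lstsq(Z, residuals, rcond=None)
        residuals = residuals - Z @ coef
    return selected
```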
Aggregation of large databases into a specific format is a frequently used process to make the data easily manageable. Interval-valued data is one of the data types generated by such an aggregation process. Using traditional methods to analyze interval-valued data results in loss of information, and thus several interval-valued data models have been proposed to gather reliable information from such data types. On the other hand, recent technological developments have led to high-dimensional and complex data in many application areas, which may not be analyzable by traditional techniques. Functional data analysis is one of the most commonly used techniques to analyze such complex datasets. While functional extensions of many traditional statistical techniques are available, the functional form of interval-valued data has not been well studied. This paper introduces the functional forms of some well-known regression models that accommodate interval-valued data. The proposed methods are based on the function-on-function regression model, where both the response and the predictor(s) are functional. Through several Monte Carlo simulations and an empirical data analysis, the finite sample performance of the proposed methods is evaluated and compared with the state-of-the-art.
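As a schematic illustration (our notation, not necessarily the paper's exact specification), a function-on-function regression with $K$ functional predictors takes the form
$$
Y_i(t)=\beta_0(t)+\sum_{k=1}^{K}\int_{\mathcal S_k} X_{ik}(s)\,\beta_k(s,t)\,\mathrm{d}s+\varepsilon_i(t),
$$
and an interval-valued curve $[a_i(t),b_i(t)]$ could, for instance, be encoded through its center $c_i(t)=\{a_i(t)+b_i(t)\}/2$ and half-range $r_i(t)=\{b_i(t)-a_i(t)\}/2$ before entering such a model as response or predictor.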
Historical Functional Linear Models (HFLM) quantify associations between a functional predictor and a functional outcome where the predictor is an exposure variable that occurs before, or at least concurrently with, the outcome. Current work on the HFLM is largely limited to frequentist estimation techniques that employ spline-based basis representations. In this work, we propose a novel use of the discrete wavelet-packet transformation, which has not previously been used in functional models, to estimate historical relationships in a fully Bayesian model. Since inference has not been an emphasis of the existing work on HFLMs, we also employ two established Bayesian inference procedures in this historical functional setting. We investigate the operating characteristics of our wavelet-packet HFLM, as well as the two inference procedures, in simulation and use the model to analyze data on the impact of lagged exposure to particulate matter finer than 2.5 $\mu$m on heart rate variability in a cohort of journeyman boilermakers over the course of a day's shift.
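For context, the historical functional linear model restricts the coefficient surface so that only past and concurrent values of the exposure affect the outcome; in generic notation,
$$
Y_i(t)=\alpha(t)+\int_{0}^{t} X_i(s)\,\beta(s,t)\,\mathrm{d}s+\varepsilon_i(t),\qquad 0\le s\le t\le T,
$$
with $\beta(s,t)$ supported on the triangle $\{(s,t):s\le t\}$; the contribution described above is to represent this surface in a discrete wavelet-packet basis within a fully Bayesian fit.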