We study a scalar-on-function historical linear regression model which assumes that the functional predictor does not influence the response when the time passes a certain cutoff point. We approach this problem from the perspective of locally sparse modeling, where a function is locally sparse if it is zero on a substantial portion of its defining domain. In the historical linear model, the slope function is exactly a locally sparse function that is zero beyond the cutoff time. A locally sparse estimate then gives rise to an estimate of the cutoff time. We propose a nested group bridge penalty that is able to specifically shrink the tail of a function. Combined with the B-spline basis expansion and penalized least squares, the nested group bridge approach can identify the cutoff time and produce a smooth estimate of the slope function simultaneously. The proposed locally sparse estimator is shown to be consistent, while its numerical performance is illustrated by simulation studies. The proposed method is demonstrated with an application of determining the effect of the past engine acceleration on the current particulate matter emission.
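The tail-shrinking idea can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: it assumes the penalty takes the common group bridge form, lam * sum_j c_j * (|b_j| + ... + |b_M|)^gamma with 0 < gamma < 1, applied to B-spline coefficients b_1, ..., b_M. The function names, the default gamma, and the unit weights c_j are hypothetical choices made for this example; the abstract does not state them.

```python
import numpy as np

def nested_group_bridge_penalty(b, lam, gamma=0.5, weights=None):
    """Penalty over nested tail groups A_j = {j, j+1, ..., M} (assumed form)."""
    b = np.asarray(b, dtype=float)
    # tail_l1[j] is the L1 norm of the coefficients from position j to the end.
    tail_l1 = np.cumsum(np.abs(b)[::-1])[::-1]
    if weights is None:
        weights = np.ones_like(b)  # assumption: unit group weights
    return lam * np.sum(weights * tail_l1 ** gamma)

def first_zero_tail_index(b, tol=0.0):
    """Index from which all coefficients are (numerically) zero."""
    b = np.asarray(b, dtype=float)
    nonzero = np.flatnonzero(np.abs(b) > tol)
    return 0 if nonzero.size == 0 else int(nonzero[-1]) + 1
```

Because the last coefficients belong to every group, they are penalized most heavily, which is what pushes the tail of the estimated slope function to exactly zero. The index returned by `first_zero_tail_index` maps to a knot location, which would serve as the cutoff-time estimate under these assumptions.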
The conventional historical functional linear model relates the current value of the functional response at time t to all past values of the functional covariate up to time t. Motivated by situations where it is more reasonable to assume that only recent, instead of all, past values of the functional covariate have an impact on the functional response, we investigate in this work the historical functional linear model with an unknown forward time lag into the history. Besides the common goal of estimating the bivariate regression coefficient function, we also aim to identify the historical time lag from the data, which is important in many applications. Tailored for this purpose, we propose an estimation procedure adopting the finite element method to conform naturally to the trapezoidal domain of the bivariate coefficient function. A nested group bridge penalty is developed to provide simultaneous estimation of the bivariate coefficient function and the historical lag. The method is demonstrated in a real data example investigating the effect of muscle activation recorded via the noninvasive electromyography (EMG) method on lip acceleration during speech production. The finite sample performance of our proposed method is examined via simulation studies in comparison with the conventional method.
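For orientation, a historical functional linear model with lag can be written as follows. The notation is an assumption introduced here and is not taken from the abstract: $y_i(t)$ is the response curve, $x_i(s)$ the covariate curve, $\alpha(t)$ an intercept function, $\beta(s,t)$ the bivariate coefficient function, $\delta$ the unknown lag, and $[0, T]$ the observation interval.

```latex
y_i(t) = \alpha(t) + \int_{\max(0,\, t-\delta)}^{t} x_i(s)\, \beta(s,t)\, ds + \varepsilon_i(t),
\qquad t \in [0, T].
% Support of \beta: \{(s,t) : \max(0, t-\delta) \le s \le t,\ 0 \le t \le T\},
% which is a trapezoid when 0 < \delta < T and the full triangle when \delta \ge T.
```

Setting $\delta = T$ recovers the conventional model in which all past values of the covariate enter the integral.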
Historical Functional Linear Models (HFLM) quantify associations between a functional predictor and functional outcome where the predictor is an exposure variable that occurs before, or at least concurrently with, the outcome. Current work on the HFLM is largely limited to frequentist estimation techniques that employ spline-based basis representations. In this work, we propose a novel use of the discrete wavelet-packet transformation, which has not previously been used in functional models, to estimate historical relationships in a fully Bayesian model. Since inference has not been an emphasis of the existing work on HFLMs, we also employ two established Bayesian inference procedures in this historical functional setting. We investigate the operating characteristics of our wavelet-packet HFLM, as well as the two inference procedures, in simulation and use the model to analyze data on the impact of lagged exposure to particulate matter finer than 2.5 $\mu$m on heart rate variability in a cohort of journeyman boilermakers over the course of a day's shift.
The functional linear model is a popular tool to investigate the relationship between a scalar/functional response variable and a scalar/functional covariate. We generalize this model to a functional linear mixed-effects model for settings where repeated measurements are available on multiple subjects. Each subject has an individual intercept and slope function, while all subjects share a common population intercept and slope function. This model is flexible in that it allows the slope random effects to change with time. We propose a penalized spline smoothing method to estimate the population and random slope functions. A REML-based EM algorithm is developed to estimate the variance parameters for the random effects and the data noise. Simulation studies show that our estimation method provides accurate estimates for the functional linear mixed-effects model in finite samples. The functional linear mixed-effects model is demonstrated by investigating the effect of 24-hour nitrogen dioxide on daily maximum ozone concentrations and by studying the effect of daily temperature on annual precipitation.
We develop a unified approach to hypothesis testing for various types of widely used functional linear models, such as scalar-on-function, function-on-function and function-on-scalar models. In addition, the proposed test applies to models of mixed types, such as models with both functional and scalar predictors. In contrast with most existing methods that rest on the large-sample distributions of test statistics, the proposed method leverages the technique of bootstrapping max statistics and exploits the variance decay property that is an inherent feature of functional data, to improve the empirical power of tests especially when the sample size is limited and the signal is relatively weak. Theoretical guarantees on the validity and consistency of the proposed test are provided uniformly for a class of test statistics.
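The general idea of bootstrapping a max statistic can be sketched as below. This is a generic Gaussian multiplier bootstrap for testing that every column mean of a score matrix is zero, offered only as an illustration under stated assumptions. It is not the authors' procedure, and it does not implement the variance-decay adjustment mentioned above. The function name, the score matrix `Z` (for example, products of response and predictor basis scores), and the defaults are hypothetical.

```python
import numpy as np

def max_stat_bootstrap_pvalue(Z, n_boot=999, seed=0):
    """P-value for H0: E[Z] = 0 based on T = sqrt(n) * max_j |mean_j(Z)|."""
    Z = np.asarray(Z, dtype=float)
    n = Z.shape[0]
    rng = np.random.default_rng(seed)
    # Observed max statistic over the p coordinates.
    t_obs = np.sqrt(n) * np.max(np.abs(Z.mean(axis=0)))
    # Center the scores so the bootstrap mimics the null distribution.
    Zc = Z - Z.mean(axis=0)
    # Each row of e holds one set of standard normal multipliers.
    e = rng.standard_normal((n_boot, n))
    t_boot = np.max(np.abs(e @ Zc), axis=1) / np.sqrt(n)
    # Add-one correction keeps the p-value strictly positive.
    return (1 + np.sum(t_boot >= t_obs)) / (n_boot + 1)
```

Calibrating against the bootstrap distribution of the maximum, rather than a large-sample limit, is the kind of design that can help when the sample size is small, which is the setting this abstract emphasizes.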
Linear Mixed Effects (LME) models have been widely applied in clustered data analysis in many areas including marketing research, clinical trials, and biomedical studies. Inference can be conducted using the maximum likelihood approach when the random effects are assumed to follow Normal distributions. However, in many applications in economics, business, and medicine, it is often essential to impose constraints on the regression parameters to respect their real-world interpretations. Therefore, in this paper we extend the classical (unconstrained) LME models to allow for sign constraints on their overall coefficients. We propose to assume a symmetric doubly truncated Normal (SDTN) distribution on the random effects instead of the unconstrained Normal distribution often found in the classical literature. This change makes estimation considerably more difficult, because the exact distribution of the dependent variable becomes analytically intractable. We then develop likelihood-based approaches to estimate the unknown model parameters using an approximation of this exact distribution. Simulation studies show that the proposed constrained model not only improves the real-world interpretability of results, but also achieves satisfactory model fit compared with the existing model.