The functional linear model is a popular tool for investigating the relationship between a scalar or functional response variable and a scalar or functional covariate. We generalize this model to a functional linear mixed-effects model for settings in which repeated measurements are available on multiple subjects. Each subject has an individual intercept and slope function, while all subjects share a common population intercept and slope function. This model is flexible in that it allows the random effects on the slope to vary over time. We propose a penalized spline smoothing method to estimate the population and random slope functions. A REML-based EM algorithm is developed to estimate the variance parameters of the random effects and the data noise. Simulation studies show that our estimation method provides accurate estimates for the functional linear mixed-effects model in finite samples. The functional linear mixed-effects model is demonstrated by investigating the effect of 24-hour nitrogen dioxide concentrations on daily maximum ozone concentrations and by studying the effect of daily temperature on annual precipitation.
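The penalized smoothing idea can be illustrated with a much simpler setting than the one in the abstract. The sketch below is an assumption-laden stand-in: it drops the random effects entirely, represents the slope function by its values on a grid instead of a spline basis, and uses a second-difference penalty as the roughness term. The simulated curves, the true slope `beta_true`, and the values of `lam` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 200, 50                      # hypothetical sample size and grid size
t = np.linspace(0.0, 1.0, T)
dt = t[1] - t[0]

# Simulated functional covariates: random-walk curves observed on the grid.
X = np.cumsum(rng.standard_normal((n, T)), axis=1) * np.sqrt(dt)
beta_true = np.sin(2.0 * np.pi * t)                 # hypothetical slope function
y = X @ beta_true * dt + rng.normal(0.0, 0.05, size=n)

# Second-difference matrix: (D @ beta)[j] = beta[j] - 2*beta[j+1] + beta[j+2].
D = np.diff(np.eye(T), n=2, axis=0)

def fit_slope(X, y, dt, lam):
    """Minimize ||y - (X*dt) @ beta||^2 + lam * ||D @ beta||^2 over beta."""
    Z = X * dt                                      # Riemann-sum approximation of the integral
    return np.linalg.solve(Z.T @ Z + lam * D.T @ D, Z.T @ y)

beta_rough = fit_slope(X, y, dt, lam=1e-4)          # light smoothing
beta_smooth = fit_slope(X, y, dt, lam=1.0)          # heavy smoothing

def roughness(beta):
    return float(np.sum((D @ beta) ** 2))
```

Increasing `lam` trades fidelity to the data for a smoother estimate; a mixed-effects version would add subject-specific slope functions with their own variance parameters.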
Linear Mixed Effects (LME) models have been widely applied to clustered data analysis in many areas, including marketing research, clinical trials, and biomedical studies. Inference can be conducted using a maximum likelihood approach if the random effects are assumed to follow Normal distributions. However, in many applications in economics, business, and medicine, it is often essential to impose constraints on the regression parameters to respect their real-world interpretations. Therefore, in this paper we extend the classical (unconstrained) LME models to allow for sign constraints on their overall coefficients. We propose to assume a symmetric doubly truncated Normal (SDTN) distribution on the random effects instead of the unconstrained Normal distribution that is common in the classical literature. With this change, estimation becomes considerably more difficult because the exact distribution of the dependent variable is analytically intractable. We then develop likelihood-based approaches to estimate the unknown model parameters using an approximation of this exact distribution. Simulation studies show that the proposed constrained model not only improves the real-world interpretability of the results, but also achieves satisfactory model fit compared to the existing model.
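To make the role of the truncation concrete, the following sketch draws random effects from a zero-mean Normal distribution truncated symmetrically to `[-bound, bound]` using simple rejection sampling. It assumes, for illustration only, that the truncation bound equals the magnitude of a positive fixed-effect slope, so that each subject's overall slope stays non-negative; the abstract does not state how the bound is chosen, and `sample_sdtn`, `beta`, and `sigma` are hypothetical names and values.

```python
import numpy as np

def sample_sdtn(sigma, bound, size, rng):
    """Draw `size` values from N(0, sigma^2) truncated to [-bound, bound] by rejection."""
    out = np.empty(size)
    filled = 0
    while filled < size:
        draws = rng.normal(0.0, sigma, size=size)
        keep = draws[np.abs(draws) <= bound]        # accept only draws inside the bounds
        take = min(keep.size, size - filled)
        out[filled:filled + take] = keep[:take]
        filled += take
    return out

rng = np.random.default_rng(0)
beta = 2.0                                          # hypothetical positive fixed-effect slope
b = sample_sdtn(sigma=1.5, bound=beta, size=1000, rng=rng)
subject_slopes = beta + b                           # overall (fixed + random) slopes
```

With an untruncated Normal random effect, some values of `beta + b` would be negative; the symmetric truncation rules this out while keeping the random effects centered at zero.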
We study a scalar-on-function historical linear regression model which assumes that the functional predictor does not influence the response beyond a certain cutoff time. We approach this problem from the perspective of locally sparse modeling, where a function is locally sparse if it is zero on a substantial portion of its domain. In the historical linear model, the slope function is exactly such a locally sparse function: it is zero beyond the cutoff time. A locally sparse estimate then gives rise to an estimate of the cutoff time. We propose a nested group bridge penalty that specifically shrinks the tail of a function. Combined with the B-spline basis expansion and penalized least squares, the nested group bridge approach can identify the cutoff time and produce a smooth estimate of the slope function simultaneously. The proposed locally sparse estimator is shown to be consistent, and its numerical performance is illustrated by simulation studies. The proposed method is demonstrated with an application to determining the effect of past engine acceleration on current particulate matter emission.
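The link between a zero tail and a cutoff estimate can be sketched as follows. This is an illustration under stated assumptions, not the paper's estimator: the slope function is represented by a piecewise-constant basis on equally spaced knots (instead of smooth B-splines), the nested groups are taken to be the tails `{j, j+1, ..., M}` of the coefficient vector, and the group weights default to one. The function names and the example coefficients are hypothetical.

```python
import numpy as np

def nested_group_bridge_penalty(b, gamma=0.5, weights=None):
    """Sum over j of weights[j] * (|b_j| + |b_{j+1}| + ... + |b_M|) ** gamma."""
    b = np.asarray(b, dtype=float)
    tail_l1 = np.cumsum(np.abs(b)[::-1])[::-1]      # L1 norm of each nested tail group
    if weights is None:
        weights = np.ones_like(b)
    return float(np.sum(weights * tail_l1 ** gamma))

def cutoff_from_coefficients(b, knots, tol=0.0):
    """Right end of the last interval whose coefficient is nonzero."""
    nonzero = np.flatnonzero(np.abs(np.asarray(b, dtype=float)) > tol)
    if nonzero.size == 0:
        return float(knots[0])                      # slope is zero everywhere
    return float(knots[nonzero[-1] + 1])

knots = np.linspace(0.0, 1.0, 6)                    # five intervals on [0, 1]
b = np.array([1.0, 0.5, 0.2, 0.0, 0.0])             # coefficients with a zero tail
cutoff = cutoff_from_coefficients(b, knots)
penalty = nested_group_bridge_penalty(b)
penalty_reversed = nested_group_bridge_penalty(b[::-1])
```

Because the last coefficients belong to every nested group, mass placed in the tail is penalized more heavily than the same mass placed near the start, which is why `penalty_reversed` exceeds `penalty` in this example.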
We study a functional linear regression model that deals with functional responses and allows for both functional covariates and high-dimensional vector covariates. The proposed model is flexible and nests several functional regression models in the literature as special cases. Based on the theory of reproducing kernel Hilbert spaces (RKHS), we propose a penalized least squares estimator that can accommodate functional variables observed on discrete sample points. Besides a conventional smoothness penalty, a group Lasso-type penalty is further imposed to induce sparsity in the high-dimensional vector predictors. We derive finite sample theoretical guarantees and show that the excess prediction risk of our estimator is minimax optimal. Furthermore, our analysis reveals an interesting phase transition phenomenon in which the optimal excess risk is determined jointly by the smoothness and the sparsity of the functional regression coefficients. A novel efficient optimization algorithm based on iterative coordinate descent is devised to handle the smoothness and group penalties simultaneously. Simulation studies and real data applications illustrate the promising performance of the proposed approach compared with state-of-the-art methods in the literature.
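A group Lasso-type penalty induces sparsity by setting whole groups of coefficients to zero at once. The standard building block behind this behavior is the group soft-thresholding operator, the proximal operator of `lam * ||v||_2`, sketched below. This is a generic textbook operator shown for illustration; it is not claimed to be the update used in the authors' coordinate descent algorithm, and `group_soft_threshold` is a hypothetical name.

```python
import numpy as np

def group_soft_threshold(v, lam):
    """Return argmin_u 0.5 * ||u - v||^2 + lam * ||u||_2 for one coefficient group."""
    v = np.asarray(v, dtype=float)
    norm = np.linalg.norm(v)
    if norm <= lam:
        return np.zeros_like(v)                     # the whole group is set to zero
    return (1.0 - lam / norm) * v                   # otherwise shrink toward zero

kept = group_soft_threshold([3.0, 4.0], lam=1.0)    # norm 5 > 1: shrunk by factor 0.8
dropped = group_soft_threshold([0.3, 0.4], lam=1.0) # norm 0.5 <= 1: zeroed out
```

Groups with small joint norm are removed entirely, while larger groups are kept and shrunk, which is the mechanism that selects a sparse subset of the vector predictors.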
The mixed linear regression (MLR) model is among the most prominent statistical tools for modeling non-linear distributions using a mixture of linear models. When the additive noise in the MLR model is Gaussian, the Expectation-Maximization (EM) algorithm is widely used for maximum likelihood estimation of the MLR parameters. However, when the noise is non-Gaussian, the steps of the EM algorithm may not have closed-form update rules, which makes the EM algorithm impractical. In this work, we study maximum likelihood estimation of the parameters of the MLR model when the additive noise has a non-Gaussian distribution. In particular, we consider the case in which the noise has a Laplacian distribution, and we first show that, unlike the Gaussian case, the resulting sub-problems of the EM algorithm do not have closed-form update rules, which prevents us from using EM directly. To overcome this issue, we propose a new algorithm that combines the alternating direction method of multipliers (ADMM) with the EM algorithm. Our numerical experiments show that our method outperforms the EM algorithm in statistical accuracy and computational time in the non-Gaussian noise case.
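For reference, the Gaussian case that admits closed-form updates can be sketched as below: the E-step computes responsibilities and the M-step is a weighted least squares solve per component. The sketch assumes a known, shared noise level `sigma` and simulated data; all names and values are hypothetical. With Laplacian noise, the M-step would instead be a weighted least absolute deviations problem, which is the step that lacks a closed form.

```python
import numpy as np

def log_likelihood(X, y, betas, weights, sigma):
    """Observed-data log-likelihood of a Gaussian mixture of linear regressions."""
    resid = y[:, None] - X @ betas.T                # (n, K) residuals per component
    log_comp = (np.log(weights)
                - 0.5 * np.log(2.0 * np.pi * sigma ** 2)
                - 0.5 * (resid / sigma) ** 2)
    m = log_comp.max(axis=1, keepdims=True)         # log-sum-exp for stability
    return float(np.sum(m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1))))

def em_step(X, y, betas, weights, sigma):
    """One EM iteration; sigma is treated as known and is not updated."""
    # E-step: posterior probability that each observation came from each component.
    resid = y[:, None] - X @ betas.T
    log_r = np.log(weights) - 0.5 * (resid / sigma) ** 2
    log_r -= log_r.max(axis=1, keepdims=True)
    r = np.exp(log_r)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: closed-form weighted least squares for each component.
    new_betas = np.empty_like(betas)
    for k in range(betas.shape[0]):
        Xw = X * r[:, [k]]
        new_betas[k] = np.linalg.solve(Xw.T @ X, Xw.T @ y)
    return new_betas, r.mean(axis=0)

rng = np.random.default_rng(0)
n, sigma = 400, 0.5
X = rng.standard_normal((n, 2))
true_betas = np.array([[2.0, -1.0], [-2.0, 1.0]])   # hypothetical component slopes
labels = rng.integers(0, 2, size=n)
y = np.sum(X * true_betas[labels], axis=1) + rng.normal(0.0, sigma, size=n)

betas = rng.standard_normal((2, 2))                 # random initialization
weights = np.array([0.5, 0.5])
log_liks = [log_likelihood(X, y, betas, weights, sigma)]
for _ in range(30):
    betas, weights = em_step(X, y, betas, weights, sigma)
    log_liks.append(log_likelihood(X, y, betas, weights, sigma))
```

The recorded log-likelihood values should never decrease from one iteration to the next, which is a convenient sanity check for any EM implementation.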
We introduce a new approach to a linear-circular regression problem that relates multiple linear predictors to a circular response. We follow the modeling approach of the wrapped normal distribution, which describes angular variables and angular distributions, and advance it for linear-circular regression analysis. Some previous works model a circular variable as the projection of a bivariate Gaussian random vector onto the unit circle, and statistical inference for the resulting model involves complicated sampling steps. The proposed model treats circular responses as the result of the modulo operation on unobserved linear responses. The resulting model is a mixture of multiple linear-linear regression models. We present two EM algorithms for maximum likelihood estimation of the mixture model, one for a parametric model and another for a non-parametric model. The estimation algorithms provide a favorable trade-off between computation and estimation accuracy, as shown in five numerical examples. The proposed approach is applied to the problem of estimating wind directions, which typically exhibit complex patterns with large variation and circularity.
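The modulo construction and the resulting mixture structure can be sketched as follows. A latent linear response is wrapped onto `[0, 2*pi)`, and the density of the wrapped value is a sum of Normal densities, one per possible number of wraps `k`; each `k` plays the role of one mixture component. The truncation of the sum at `n_wraps`, the regression coefficients, and the noise level are assumptions made for illustration.

```python
import numpy as np

def wrapped_normal_pdf(theta, mu, sigma, n_wraps=10):
    """Wrapped normal density on [0, 2*pi), with the infinite sum truncated to |k| <= n_wraps."""
    k = np.arange(-n_wraps, n_wraps + 1)
    z = (np.asarray(theta, dtype=float)[..., None] + 2.0 * np.pi * k - mu) / sigma
    return np.exp(-0.5 * z ** 2).sum(axis=-1) / (sigma * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=500)
latent = 1.0 + 4.0 * x + rng.normal(0.0, 0.3, size=500)   # unobserved linear response
theta = np.mod(latent, 2.0 * np.pi)                       # observed circular response

# Midpoint-rule check that the truncated density integrates to about one.
m = 2000
grid = (np.arange(m) + 0.5) * 2.0 * np.pi / m
integral = wrapped_normal_pdf(grid, mu=1.0, sigma=0.8).sum() * 2.0 * np.pi / m
```

In a regression setting, `mu` would be replaced by the linear predictor for each observation, and the unknown wrap count `k` would be the latent variable handled by the EM algorithm.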