We investigate R-optimal designs for multi-response regression models with multiple factors, where the random errors in these models are correlated. Several theoretical results are derived for R-optimal designs, including scale invariance, reflection symmetry, line and plane symmetry, and dependence on the covariance matrix of the errors. All the results can be applied to linear and nonlinear models. In addition, an efficient algorithm based on an interior point method is developed for finding R-optimal designs on discrete design spaces. The algorithm is very flexible and can be applied to any multi-response regression model.
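For orientation, the R-criterion is commonly defined as the product of the diagonal entries of the inverse information matrix, and an R-optimal design minimizes it. The sketch below evaluates that quantity for a design with finitely many support points. Everything specific in it is an assumption made for illustration and is not taken from the paper: the two-response straight-line model, the error covariance `sigma`, and the names `info_matrix`, `r_criterion`, and `two_line_model`. It only evaluates the criterion; it does not implement the interior point algorithm mentioned above.

```python
import numpy as np

def info_matrix(points, weights, regressors, sigma):
    """Information matrix sum_i w_i F(x_i) Sigma^{-1} F(x_i)^T.

    regressors(x) returns F(x), a (q x r) array: q parameters, r responses.
    sigma is the (r x r) covariance matrix of the correlated errors.
    """
    sigma_inv = np.linalg.inv(sigma)
    q = regressors(points[0]).shape[0]
    m = np.zeros((q, q))
    for x, w in zip(points, weights):
        f = regressors(x)
        m += w * (f @ sigma_inv @ f.T)
    return m

def r_criterion(m):
    """Product of the diagonal entries of the inverse information matrix."""
    return float(np.prod(np.diag(np.linalg.inv(m))))

def two_line_model(x):
    """Hypothetical model: response 1 = t0 + t1*x, response 2 = t2 + t3*x."""
    return np.array([[1.0, 0.0],
                     [x,   0.0],
                     [0.0, 1.0],
                     [0.0, x]])

# Assumed error covariance with correlation 0.5 between the two responses.
sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])

# Two candidate designs on the discrete design space {-1, 0, 1}.
endpoints = info_matrix([-1.0, 1.0], [0.5, 0.5], two_line_model, sigma)
uniform = info_matrix([-1.0, 0.0, 1.0], [1/3, 1/3, 1/3], two_line_model, sigma)

r_endpoints = r_criterion(endpoints)
r_uniform = r_criterion(uniform)
```

In this toy setting the design that puts equal weight on the two endpoints gives a smaller criterion value than the uniform three-point design, so it is the better of the two under the R-criterion.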
Optimal two-treatment, $p$-period crossover designs for binary responses are determined. The optimal designs are obtained by minimizing the variance of the treatment contrast estimator over all possible allocations of $n$ subjects to $2^p$ possible treatment sequences. An appropriate logistic regression model is postulated and the within-subject covariances are modeled through a working correlation matrix. The marginal means of the binary responses are fitted using generalized estimating equations. The efficiencies of some crossover designs for $p=2,3,4$ periods are calculated. The effect of a misspecified working correlation matrix on design efficiency is also studied.
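To make the size of the search space concrete, the sketch below enumerates the $2^p$ treatment sequences for two treatments and turns a subject allocation into design proportions. The treatment labels `A` and `B`, the function names, and the example allocation are illustrative assumptions. The sketch does not compute the variance of the treatment contrast estimator or any design efficiency.

```python
from itertools import product

def treatment_sequences(p, treatments=("A", "B")):
    """All treatment sequences of length p, e.g. 'AB' and 'BA' for p = 2."""
    return ["".join(seq) for seq in product(treatments, repeat=p)]

def allocation_proportions(counts):
    """Convert subject counts per sequence into proportions of n subjects."""
    n = sum(counts.values())
    return {seq: c / n for seq, c in counts.items()}

# The periods considered in the abstract: p = 2, 3, 4.
sizes = {p: len(treatment_sequences(p)) for p in (2, 3, 4)}

# Hypothetical allocation of n = 20 subjects to the classical AB/BA design.
ab_ba = allocation_proportions({"AB": 10, "BA": 10})
```

A design in this sense is just a vector of proportions over the enumerated sequences, most of which may be zero.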
The issue of determining not only an adequate dose but also a dosing frequency of a drug arises frequently in Phase II clinical trials. This results in the comparison of models which have some parameters in common. Planning such studies based on Bayesian optimal designs offers robustness to our conclusions since these designs, unlike locally optimal designs, are efficient even if the parameters are misspecified. In this paper we develop approximate design theory for Bayesian $D$-optimality for nonlinear regression models with common parameters and investigate the cases of common location or common location and scale parameters separately. Analytical characterisations of saturated Bayesian $D$-optimal designs are derived for frequently used dose-response models and the advantages of our results are illustrated via a numerical investigation.
Multi-task learning is increasingly used to investigate the association structure between multiple responses and a single set of predictor variables in many applications. In the era of big data, the coexistence of incomplete outcomes, a large number of responses, and high dimensionality in predictors poses unprecedented challenges in estimation, prediction, and computation. In this paper, we propose a scalable and computationally efficient procedure, called PEER, for large-scale multi-response regression with incomplete outcomes, where both the numbers of responses and predictors can be high-dimensional. Motivated by sparse factor regression, we convert the multi-response regression into a set of univariate-response regressions, which can be efficiently implemented in parallel. Under some mild regularity conditions, we show that PEER enjoys nice sampling properties including consistency in estimation, prediction, and variable selection. Extensive simulation studies show that our proposal compares favorably with several existing methods in estimation accuracy, variable selection, and computational efficiency.
Corrupted data sets containing noisy or missing observations are prevalent in various contemporary applications such as economics, finance and bioinformatics. Despite the recent methodological and algorithmic advances in high-dimensional multi-response regression, how to achieve scalable and interpretable estimation under contaminated covariates is unclear. In this paper, we develop a new methodology called convex conditioned sequential sparse learning (COSS) for error-in-variables multi-response regression under both additive measurement errors and random missing data. It combines the strengths of the recently developed sequential sparse factor regression and the nearest positive semi-definite matrix projection, thus enjoying stepwise convexity and scalability in large-scale association analyses. Comprehensive theoretical guarantees are provided and we demonstrate the effectiveness of the proposed methodology through numerical studies.
The aim of this paper is to present a mixture composite regression model for claim severity modelling. Claim severity modelling poses several challenges such as multimodality, heavy-tailedness and systematic effects in data. We tackle this modelling problem by studying a mixture composite regression model for simultaneous modelling of attritional and large claims, and for considering systematic effects in both the mixture components and the mixing probabilities. For model fitting, we present a group-fused regularization approach that allows us to select the explanatory variables which significantly impact the mixing probabilities and the different mixture components, respectively. We develop an asymptotic theory for this regularized estimation approach, and fitting is performed using a novel Generalized Expectation-Maximization algorithm. We exemplify our approach on a real motor insurance data set.