Sensitivity indices when the inputs of a model are not independent are estimated by local polynomial techniques. Two original estimators based on local polynomial smoothers are proposed. Both have good theoretical properties which are exhibited and also illustrated through analytical examples. They are used to carry out a sensitivity analysis on a real case of a kinetic model with correlated parameters.
Uncertainties exist in both physics-based and data-driven models. Variance-based sensitivity analysis characterizes how the variance of a model output is propagated from the model inputs. The Sobol index is one of the most widely used sensitivity indices for models with independent inputs. For models with dependent inputs, different approaches have been explored to obtain sensitivity indices in the literature. Typical approaches are based on procedures of transforming the dependent inputs into independent inputs. However, such transformation requires additional information about the inputs, such as the dependency structure or the conditional probability density functions. In this paper, data-driven sensitivity indices are proposed for models with dependent inputs. We first construct ordered partitions of linearly independent polynomials of the inputs. The modified Gram-Schmidt algorithm is then applied to the ordered partitions to generate orthogonal polynomials with respect to the empirical measure based on observed data of model inputs and outputs. Using the polynomial chaos expansion with the orthogonal polynomials, we obtain the proposed data-driven sensitivity indices. The sensitivity indices provide intuitive interpretations of how the dependent inputs affect the variance of the output without a priori knowledge on the dependence structure of the inputs. Three numerical examples are used to validate the proposed approach.
Statistical models with latent structure have a history going back to the 1950s and have seen widespread use in the social sciences and, more recently, in computational biology and in machine learning. Here we study the basic latent class model proposed originally by the sociologist Paul F. Lazarfeld for categorical variables, and we explain its geometric structure. We draw parallels between the statistical and geometric properties of latent class models and we illustrate geometrically the causes of many problems associated with maximum likelihood estimation and related statistical inference. In particular, we focus on issues of non-identifiability and determination of the model dimension, of maximization of the likelihood function and on the effect of symmetric data. We illustrate these phenomena with a variety of synthetic and real-life tables, of different dimension and complexity. Much of the motivation for this work stems from the 100 Swiss Francs problem, which we introduce and describe in detail.
We derive Laplace-approximated maximum likelihood estimators (GLAMLEs) of parameters in our Graph Generalized Linear Latent Variable Models. Then, we study the statistical properties of GLAMLEs when the number of nodes $n_V$ and the observed times of a graph denoted by $K$ diverge to infinity. Finally, we display the estimation results in a Monte Carlo simulation considering different numbers of latent variables. Besides, we make a comparison between Laplace and variational approximations for inference of our model.
In this paper we derive locally D-optimal designs for discrete choice experiments based on multinomial probit models. These models include several discrete explanatory variables as well as a quantitative one. The commonly used multinomial logit model assumes independent utilities for different choice options. Thus, D-optimal optimal designs for such multinomial logit models may comprise choice sets, e.g., consisting of alternatives which are identical in all discrete attributes but different in the quantitative variable. Obviously such designs are not appropriate for many empirical choice experiments. It will be shown that locally D-optimal designs for multinomial probit models supposing independent utilities consist of counterintuitive choice sets as well. However, locally D-optimal designs for multinomial probit models allowing for dependent utilities turn out to be reasonable for analyzing decisions using discrete choice studies.
We propose a new method for changepoint estimation in partially-observed, high-dimensional time series that undergo a simultaneous change in mean in a sparse subset of coordinates. Our first methodological contribution is to introduce a MissCUSUM transformation (a generalisation of the popular Cumulative Sum statistics), that captures the interaction between the signal strength and the level of missingness in each coordinate. In order to borrow strength across the coordinates, we propose to project these MissCUSUM statistics along a direction found as the solution to a penalised optimisation problem tailored to the specific sparsity structure. The changepoint can then be estimated as the location of the peak of the absolute value of the projected univariate series. In a model that allows different missingness probabilities in different component series, we identify that the key interaction between the missingness and the signal is a weighted sum of squares of the signal change in each coordinate, with weights given by the observation probabilities. More specifically, we prove that the angle between the estimated and oracle projection directions, as well as the changepoint location error, are controlled with high probability by the sum of two terms, both involving this weighted sum of squares, and representing the error incurred due to noise and the error due to missingness respectively. A lower bound confirms that our changepoint estimator, which we call MissInspect, is optimal up to a logarithmic factor. The striking effectiveness of the MissInspect methodology is further demonstrated both on simulated data, and on an oceanographic data set covering the Neogene period.