Cook's distance [Technometrics 19 (1977) 15-18] is one of the most important diagnostic tools for detecting influential individual observations or subsets of observations in linear regression for cross-sectional data. However, for many complex data structures (e.g., longitudinal data), no rigorous approach has been developed to address a fundamental issue: deleting subsets with different numbers of observations introduces different degrees of perturbation to the current model fitted to the data, and the magnitude of Cook's distance is associated with the degree of the perturbation. The aim of this paper is to address this issue in general parametric models with complex data structures. We propose a new quantity for measuring the degree of the perturbation introduced by deleting a subset. We use stochastic ordering to quantify the stochastic relationship between the degree of the perturbation and the magnitude of Cook's distance. We develop several scaled Cook's distances to resolve the comparison of Cook's distance across different subset deletions. Theoretical and numerical examples are examined to highlight the broad spectrum of applications of these scaled Cook's distances in a formal influence analysis.
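The classical single-observation version that this abstract builds on can be sketched numerically. The following is a minimal illustration of ordinary Cook's distance under one-at-a-time deletion (not the paper's scaled variants), computed from leverages and residuals of an OLS fit on simulated data; all variable names are ours.

```python
import numpy as np

# Simulate a small cross-sectional regression y = X beta + e.
rng = np.random.default_rng(0)
n, p = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

# Hat matrix, leverages, residuals, and residual variance estimate.
H = X @ np.linalg.solve(X.T @ X, X.T)
h = np.diag(H)
resid = y - H @ y
s2 = resid @ resid / (n - p)

# Cook's distance via the closed form D_i = e_i^2 h_i / (p s^2 (1 - h_i)^2),
# which equals the change in fitted coefficients when observation i is deleted.
cooks_d = resid**2 * h / (p * s2 * (1 - h) ** 2)

print(cooks_d.argmax())  # index of the most influential single observation
```

The closed form is algebraically identical to refitting with each observation removed, which is why the leverage-based shortcut is the standard implementation.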
The Least Absolute Shrinkage and Selection Operator, or Lasso, introduced by Tibshirani (1996), is a popular estimation procedure in multiple linear regression when the underlying design has a sparse structure, because of its property that it sets some regression coefficients exactly equal to 0. In this article, we develop a perturbation bootstrap method and establish its validity in approximating the distribution of the Lasso in heteroscedastic linear regression. We allow the underlying covariates to be either random or non-random. We show that the proposed bootstrap method works irrespective of the nature of the covariates, unlike the resampling-based bootstrap of Freedman (1981), which must be tailored to the nature (random vs. non-random) of the covariates. A simulation study also justifies our method in finite samples.
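The exact-zero property the abstract mentions is easy to see in code. Below is a minimal coordinate-descent Lasso sketch (standard textbook algorithm, not the paper's perturbation bootstrap); on an orthogonal toy design the solution reduces to soft-thresholding the least-squares fit, so small coefficients are set exactly to zero.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator S(z, t) = sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]   # partial residual excluding coord j
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
    return b

# Orthogonal toy design: the Lasso soft-thresholds each OLS coefficient,
# so the two coefficients smaller than lam in magnitude become exactly 0.
X = 2.0 * np.eye(4)
beta = np.array([1.0, -0.2, 0.0, 0.6])
b = lasso_cd(X, X @ beta, lam=0.3)   # b = [0.7, 0.0, 0.0, 0.3]
```

With correlated designs the solution no longer has this closed form, but the coordinate updates above are still the standard fitting strategy.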
For probability measures on a complete separable metric space, we present sufficient conditions for the existence of a solution to the Kantorovich transportation problem. We also obtain sufficient conditions (which are sometimes also necessary) for the convergence, in the transportation distance, of probability measures when the cost function is continuous, non-decreasing and depends on the distance. As an application, the CLT in the transportation distance is proved for independent sequences and for some dependent stationary sequences.
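In the simplest setting the transportation distance is explicit: on the real line with cost equal to the distance, the Kantorovich (Wasserstein-1) distance between two empirical measures of equal size is attained by matching sorted samples. A small sketch of this special case (ours, not from the paper):

```python
import numpy as np

def wasserstein1_1d(x, y):
    """W1 distance between two equal-size empirical measures on the line.

    The optimal coupling matches order statistics, so the distance is the
    average absolute difference of the sorted samples.
    """
    return np.mean(np.abs(np.sort(x) - np.sort(y)))

# Moving unit masses at {0, 1} onto {1, 2} costs distance 1 per point.
d = wasserstein1_1d(np.array([0.0, 1.0]), np.array([1.0, 2.0]))
print(d)  # 1.0
```

On general metric spaces no such closed form exists, which is why existence and convergence results like those in the abstract require separate conditions.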
Instrumental variable methods are among the most commonly used causal inference approaches for accounting for unmeasured confounders in observational studies. The presence of invalid instruments is a major concern in practical applications, and inference for the causal effect with possibly invalid instruments is a fast-growing area of research. Existing inference methods rely on correctly separating valid and invalid instruments in a data-dependent way. In this paper, we illustrate the post-selection problems of these existing methods. We construct uniformly valid confidence intervals for the causal effect that are robust to mistakes in separating valid and invalid instruments. Our proposal is to search for values of the causal effect under which a sufficient number of candidate instruments can be taken as valid. We further devise a novel sampling method which, together with searching, leads to a more precise confidence interval. Our proposed searching and sampling confidence intervals are shown to be uniformly valid under the finite-sample majority and plurality rules. We compare our proposed methods with existing inference methods in a large set of simulation studies and apply them to study the effect of the triglyceride level on the glucose level in a mouse data set.
The analysis of causal effects when the outcome of interest is possibly truncated by death has a long history in statistics and causal inference. The survivor average causal effect is commonly identified either with assumptions stronger than those guaranteed by the design of a randomized clinical trial or through sensitivity analysis. This paper demonstrates that individual-level causal effects in the 'always survivor' principal stratum can be identified with no identification assumptions stronger than randomization. We illustrate the practical utility of our methods using data from a clinical trial on patients with prostate cancer. Our methodology is the first and, as yet, only proposed procedure that enables detecting causal effects in the presence of truncation by death using only the assumptions guaranteed by the design of the clinical trial. The methodology is applicable to all types of outcomes.
Multidimensional unfolding methods are widely used for visualizing item response data. Such methods project respondents and items simultaneously onto a low-dimensional Euclidean space, in which respondents and items are represented by ideal points, with person-person, item-item, and person-item similarities being captured by the Euclidean distances between the points. In this paper, we study the visualization of multidimensional unfolding from a statistical perspective. We cast multidimensional unfolding as an estimation problem, in which the respondent and item ideal points are treated as parameters to be estimated. An estimator is then proposed for the simultaneous estimation of these parameters. Asymptotic theory is provided for the recovery of the ideal points, shedding light on the validity of model-based visualization. An alternating projected gradient descent algorithm is proposed for the parameter estimation. We provide two illustrative examples, one on users' movie ratings and the other on Senate roll-call voting.
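The ideal-point estimation problem can be sketched in its simplest metric form. The toy code below fits respondent points A and item points B to observed dissimilarities by plain gradient descent on a least-squares (stress) loss; this is an illustration of the general idea only, not the paper's alternating projected gradient descent algorithm or its model, and all names are ours.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, k = 20, 8, 2          # respondents, items, embedding dimension

# Ground-truth ideal points generate noiseless dissimilarities delta_ij.
A_true = rng.normal(size=(n, k))
B_true = rng.normal(size=(m, k))
delta = np.linalg.norm(A_true[:, None, :] - B_true[None, :, :], axis=2)

def stress(A, B):
    """Least-squares loss between fitted and observed dissimilarities."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    return ((d - delta) ** 2).sum()

# Random initialization, then gradient descent on 0.5 * stress.
A = rng.normal(size=(n, k))
B = rng.normal(size=(m, k))
loss_before = stress(A, B)
lr = 0.005
for _ in range(2000):
    diff = A[:, None, :] - B[None, :, :]           # n x m x k
    d = np.linalg.norm(diff, axis=2) + 1e-9        # guard against d = 0
    g = ((d - delta) / d)[:, :, None] * diff       # per-pair gradient terms
    A -= lr * g.sum(axis=1)                        # gradient w.r.t. A_i
    B += lr * g.sum(axis=0)                        # gradient w.r.t. B_j (sign flips)
loss_after = stress(A, B)
```

The fitted configuration is only identified up to rotation and translation, which is one reason formal recovery theory of the kind developed in the paper is needed to justify reading the resulting pictures.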