ترغب بنشر مسار تعليمي؟ اضغط هنا

Discussion of Calibrated Bayes, for Statistics in General, and Missing Data in Particular by R. J. A. Little

56   0   0.0 ( 0 )
 نشر من قبل Michael D. Larsen
 تاريخ النشر 2011
  مجال البحث الاحصاء الرياضي
والبحث باللغة English
 تأليف Michael D. Larsen




اسأل ChatGPT حول البحث

Discussion of Calibrated Bayes, for Statistics in General, and Missing Data in Particular by R. Little [arXiv:1108.1917]


قيم البحث

اقرأ أيضاً

Discussion of Estimating Random Effects via Adjustment for Density Maximization by C. Morris and R. Tang [arXiv:1108.3234]
Missing values are unavoidable when working with data. Their occurrence is exacerbated as more data from different sources become available. However, most statistical models and visualization methods require complete data, and improper handling of mi ssing data results in information loss or biased analyses. Since the seminal work of Rubin (1976), a burgeoning literature on missing values has arisen, with heterogeneous aims and motivations. This led to the development of various methods, formalizations, and tools. For practitioners, it remains nevertheless challenging to decide which method is most suited for their problem, partially due to a lack of systematic covering of this topic in statistics or data science curricula. To help address this challenge, we have launched the R-miss-tastic platform, which aims to provide an overview of standard missing values problems, methods, and relevant implementations of methodologies. Beyond gathering and organizing a large majority of the material on missing data (bibliography, courses, tutorials, implementations), R-miss-tastic covers the development of standardized analysis workflows. Indeed, we have developed several pipelines in R and Python to allow for hands-on illustration of and recommendations on missing values handling in various statistical tasks such as matrix completion, estimation and prediction, while ensuring reproducibility of the analyses. Finally, the platform is dedicated to users who analyze incomplete data, researchers who want to compare their methods and search for an up-to-date bibliography, and also teachers who are looking for didactic materials (notebooks, video, slides).
Missing data are a common problem for both the construction and implementation of a prediction algorithm. Pattern mixture kernel submodels (PMKS) - a series of submodels for every missing data pattern that are fit using only data from that pattern - are a computationally efficient remedy for both stages. Here we show that PMKS yield the most predictive algorithm among all standard missing data strategies. Specifically, we show that the expected loss of a forecasting algorithm is minimized when each pattern-specific loss is minimized. Simulations and a re-analysis of the SUPPORT study confirms that PMKS generally outperforms zero-imputation, mean-imputation, complete-case analysis, complete-case submodels, and even multiple imputation (MI). The degree of improvement is highly dependent on the missingness mechanism and the effect size of missing predictors. When the data are Missing at Random (MAR) MI can yield comparable forecasting performance but generally requires a larger computational cost. We see that predictions from the PMKS are equivalent to the limiting predictions for a MI procedure that uses a mean model dependent on missingness indicators (the MIMI model). Consequently, the MIMI model can be used to assess the MAR assumption in practice. The focus of this paper is on out-of-sample prediction behavior, implications for model inference are only briefly explored.
Rank data arises frequently in marketing, finance, organizational behavior, and psychology. Most analysis of rank data reported in the literature assumes the presence of one or more variables (sometimes latent) based on whose values the items are ran ked. In this paper we analyze rank data using a purely probabilistic model where the observed ranks are assumed to be perturbe
The ICH E9 addendum introduces the term intercurrent event to refer to events that happen after randomisation and that can either preclude observation of the outcome of interest or affect its interpretation. It proposes five strategies for handling i ntercurrent events to form an estimand but does not suggest statistical methods for estimation. In this paper we focus on the hypothetical strategy, where the treatment effect is defined under the hypothetical scenario in which the intercurrent event is prevented. For its estimation, we consider causal inference and missing data methods. We establish that certain causal inference estimators are identical to certain missing data estimators. These links may help those familiar with one set of methods but not the other. Moreover, using potential outcome notation allows us to state more clearly the assumptions on which missing data methods rely to estimate hypothetical estimands. This helps to indicate whether estimating a hypothetical estimand is reasonable, and what data should be used in the analysis. We show that hypothetical estimands can be estimated by exploiting data after intercurrent event occurrence, which is typically not used. We also present Monte Carlo simulations that illustrate the implementation and performance of the methods in different settings.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا