Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Combining Survival Trials Using Aggregate Data Based on Misspecified Models

501 0 0.0 ( 0 )

Download Cite

Added by Tinghui Yu

Publication date 2015

fields Mathematical Statistics

and research's language is English

Authors Tinghui Yu

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

The treatment effects of the same therapy observed from multiple clinical trials can often be very different. Yet the patient characteristics accounting for these differences may not be identifiable in real world practice. There needs to be an unbiased way to combine the results from multiple trials and report the overall treatment effect for the general population during the development and validation of a new therapy. The non-linear structure of the maximum partial likelihood estimates for the (log) hazard ratio defined with a Cox proportional hazard model leads to challenges in the statistical analyses for combining such clinical trials. In this paper, we formulated the expected overall treatment effects using various modeling assumptions. Thus we are proposing efficient estimates and a version of Wald test for the combined hazard ratio using only aggregate data. Interpretation of the methods are provided in the framework of robust data analyses involving misspecified models.

rate research

Gaussian process regression for survival data with competing risks

513 - James E. Barrett , Anthony C. C. Coolen 2013

We apply Gaussian process (GP) regression, which provides a powerful non-parametric probabilistic method of relating inputs to outputs, to survival data consisting of time-to-event and covariate measurements. In this context, the covariates are regarded as the `inputs and the event times are the `outputs. This allows for highly flexible inference of non-linear relationships between covariates and event times. Many existing methods, such as the ubiquitous Cox proportional hazards model, focus primarily on the hazard rate which is typically assumed to take some parametric or semi-parametric form. Our proposed model belongs to the class of accelerated failure time models where we focus on directly characterising the relationship between covariates and event times without any explicit assumptions on what form the hazard rates take. It is straightforward to include various types and combinations of censored and truncated observations. We apply our approach to both simulated and experimental data. We then apply multiple output GP regression, which can handle multiple potentially correlated outputs for each input, to competing risks survival data where multiple event types can occur. By tuning one of the model parameters we can control the extent to which the multiple outputs (the time-to-event for each risk) are dependent thus allowing the specification of correlated risks. Simulation studies suggest that in some cases assuming dependence can lead to more accurate predictions.

Statistics Theory Methodology Statistics Theory

Dynamics of Bayesian Updating with Dependent Data and Misspecified Models

792 - Cosma Rohilla Shalizi 2009

Much is now known about the consistency of Bayesian updating on infinite-dimensional parameter spaces with independent or Markovian data. Necessary conditions for consistency include the prior putting enough weight on the correct neighborhoods of the data-generating distribution; various sufficient conditions further restrict the prior in ways analogous to capacity control in frequentist nonparametrics. The asymptotics of Bayesian updating with mis-specified models or priors, or non-Markovian data, are far less well explored. Here I establish sufficient conditions for posterior convergence when all hypotheses are wrong, and the data have complex dependencies. The main dynamical assumption is the asymptotic equipartition (Shannon-McMillan-Breiman) property of information theory. This, along with Egorovs Theorem on uniform convergence, lets me build a sieve-like structure for the prior. The main statistical assumption, also a form of capacity control, concerns the compatibility of the prior and the data-generating process, controlling the fluctuations in the log-likelihood when averaged over the sieve-like sets. In addition to posterior convergence, I derive a kind of large deviations principle for the posterior measure, extending in some cases to rates of convergence, and discuss the advantages of predicting using a combination of models known to be wrong. An appendix sketches connections between these results and the replicator dynamics of evolutionary theory.

Statistics Theory Populations and Evolution Statistics Theory

Covariate dimension reduction for survival data via the Gaussian process latent variable model

493 - James E. Barrett , Anthony C. C. Coolen 2014

The analysis of high dimensional survival data is challenging, primarily due to the problem of overfitting which occurs when spurious relationships are inferred from data that subsequently fail to exist in test data. Here we propose a novel method of extracting a low dimensional representation of covariates in survival data by combining the popular Gaussian Process Latent Variable Model (GPLVM) with a Weibull Proportional Hazards Model (WPHM). The combined model offers a flexible non-linear probabilistic method of detecting and extracting any intrinsic low dimensional structure from high dimensional data. By reducing the covariate dimension we aim to diminish the risk of overfitting and increase the robustness and accuracy with which we infer relationships between covariates and survival outcomes. In addition, we can simultaneously combine information from multiple data sources by expressing multiple datasets in terms of the same low dimensional space. We present results from several simulation studies that illustrate a reduction in overfitting and an increase in predictive performance, as well as successful detection of intrinsic dimensionality. We provide evidence that it is advantageous to combine dimensionality reduction with survival outcomes rather than performing unsupervised dimensionality reduction on its own. Finally, we use our model to analyse experimental gene expression data and detect and extract a low dimensional representation that allows us to distinguish high and low risk groups with superior accuracy compared to doing regression on the original high dimensional data.

Statistics Theory Methodology Statistics Theory

Adaptive Robust Large Volatility Matrix Estimation Based on High-Frequency Financial Data

161 - Minseok Shin , Donggyu Kim , Jianqing Fan 2021

Several novel statistical methods have been developed to estimate large integrated volatility matrices based on high-frequency financial data. To investigate their asymptotic behaviors, they require a sub-Gaussian or finite high-order moment assumption for observed log-returns, which cannot account for the heavy tail phenomenon of stock returns. Recently, a robust estimator was developed to handle heavy-tailed distributions with some bounded fourth-moment assumption. However, we often observe that log-returns have heavier tail distribution than the finite fourth-moment and that the degrees of heaviness of tails are heterogeneous over the asset and time period. In this paper, to deal with the heterogeneous heavy-tailed distributions, we develop an adaptive robust integrated volatility estimator that employs pre-averaging and truncation schemes based on jump-diffusion processes. We call this an adaptive robust pre-averaging realized volatility (ARP) estimator. We show that the ARP estimator has a sub-Weibull tail concentration with only finite 2$alpha$-th moments for any $alpha>1$. In addition, we establish matching upper and lower bounds to show that the ARP estimation procedure is optimal. To estimate large integrated volatility matrices using the approximate factor model, the ARP estimator is further regularized using the principal orthogonal complement thresholding (POET) method. The numerical study is conducted to check the finite sample performance of the ARP estimator.

Statistics Theory Methodology Statistics Theory

Information-adaptive clinical trials with selective recruitment and binary outcomes

164 - James E. Barrett 2015

Selective recruitment designs preferentially recruit individuals that are estimated to be statistically informative onto a clinical trial. Individuals that are expected to contribute less information have a lower probability of recruitment. Furthermore, in an information-adaptive design recruits are allocated to treatment arms in a manner that maximises information gain. The informativeness of an individual depends on their covariate (or biomarker) values and how information is defined is a critical element of information-adaptive designs. In this paper we define and evaluate four different methods for quantifying statistical information. Using both experimental data and numerical simulations we show that selective recruitment designs can offer a substantial increase in statistical power compared to randomised designs. In trials without selective recruitment we find that allocating individuals to treatment arms according to information-adaptive protocols also leads to an increase in statistical power. Consequently, selective recruitment designs can potentially achieve successful trials using fewer recruits thereby offering economic and ethical advantages.

Statistics Theory Methodology Statistics Theory

comments

Fetching comments

International University for Science and Technology

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Combining Survival Trials Using Aggregate Data Based on Misspecified Models

Ask ChatGPT about the research

No Arabic abstract

Read More