
A multi-functional analyzer uses parameter constraints to improve the efficiency of model-based gene-set analysis

Published by: Zhishi Wang
Publication date: 2013
Research field: Mathematical Statistics
Research language: English





We develop a model-based methodology for integrating gene-set information with an experimentally-derived gene list. The methodology uses a previously reported sampling model, but takes advantage of natural constraints in the high-dimensional discrete parameter space in order to work from a more structured prior distribution than is currently available. We show how the natural constraints are expressed in terms of linear inequality constraints within a set of binary latent variables. Further, the currently available prior gives low probability to these constraints in complex systems, such as Gene Ontology (GO), thus reducing the efficiency of statistical inference. We develop two computational advances to enable posterior inference within the constrained parameter space: one using integer linear programming for optimization and one using a penalized Markov chain sampler. Numerical experiments demonstrate the utility of the new methodology for a multivariate integration of genomic data with GO or related information systems. Compared to available methods, the proposed multi-functional analyzer covers more reported genes without mis-covering nonreported genes, as demonstrated on genome-wide data from association studies of type 2 diabetes and from RNA interference studies of influenza.
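To make the idea of linear inequality constraints on binary latent variables concrete, here is a minimal sketch. The abstract does not state the constraints themselves, so the form used here is an assumption for illustration: a set is taken to be active exactly when all of its genes are active. The gene-set names, memberships, and function names are hypothetical.

```python
from itertools import product

# Hypothetical toy system: two overlapping gene sets (illustrative only).
GENE_SETS = {"w1": {"g1", "g2"}, "w2": {"g2", "g3"}}
GENES = sorted(set().union(*GENE_SETS.values()))


def satisfies_constraints(z, a):
    """Check the ASSUMED linear inequalities linking set states z and gene states a.

    z maps set name -> 0/1, a maps gene name -> 0/1.
    """
    for w, members in GENE_SETS.items():
        # z[w] <= a[g] for every g in w: an active set forces all its genes active.
        if any(z[w] > a[g] for g in members):
            return False
        # sum_g (1 - a[g]) >= 1 - z[w]: an inactive set has at least one inactive gene.
        if sum(1 - a[g] for g in members) < 1 - z[w]:
            return False
    return True


def feasible_states():
    """Enumerate every binary configuration that satisfies the inequalities."""
    set_names = sorted(GENE_SETS)
    states = []
    for zbits in product((0, 1), repeat=len(set_names)):
        for abits in product((0, 1), repeat=len(GENES)):
            z = dict(zip(set_names, zbits))
            a = dict(zip(GENES, abits))
            if satisfies_constraints(z, a):
                states.append((z, a))
    return states


if __name__ == "__main__":
    total = 2 ** (len(GENE_SETS) + len(GENES))
    print(len(feasible_states()), "of", total, "configurations are feasible")
```

In this toy system only 8 of the 32 binary configurations satisfy the inequalities, which hints at why a prior that ignores the constraints can place little mass on the constrained space as the system grows. Brute-force enumeration is used only because the example is tiny; the abstract describes integer linear programming and a penalized Markov chain sampler for realistic systems.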


Read also

Shulei Wang, Ming Yuan (2016)
Motivated by gene set enrichment analysis, we investigate the problem of combined hypothesis testing on a graph. We introduce a general framework to effectively use the structural information of the underlying graph when testing multivariate means. A new testing procedure is proposed within this framework. We show that the test is optimal in that it can consistently detect departure from the collective null at a rate that no other test could improve, for almost all graphs. We also provide general performance bounds for the proposed test under any specific graph, and illustrate their utility through several common types of graphs. Numerical experiments are presented to further demonstrate the merits of our approach.
When drawing causal inference from observational data, there is always concern about unmeasured confounding. One way to tackle this is to conduct a sensitivity analysis. One widely-used sensitivity analysis framework hypothesizes the existence of a scalar unmeasured confounder U and asks how the causal conclusion would change were U measured and included in the primary analysis. Works along this line often make various parametric assumptions on U, for the sake of mathematical and computational simplicity. In this article, we substantively further this line of research by developing a valid sensitivity analysis that leaves the distribution of U unrestricted. Our semiparametric estimator has three desirable features compared to many existing methods in the literature. First, our method allows for a larger and more flexible family of models, and mitigates observable implications (Franks et al., 2019). Second, our methods work seamlessly with any primary analysis that models the outcome regression parametrically. Third, our method is easy to use and interpret. We construct both pointwise confidence intervals and confidence bands that are uniformly valid over a given sensitivity parameter space, thus formally accounting for unknown sensitivity parameters. We apply our proposed method on an influential yet controversial study of the causal relationship between war experiences and political activeness using observational data from Uganda.
Quantitative MR imaging is increasingly favoured for its richer information content and standardised measures. However, computing quantitative parameter maps, such as those encoding longitudinal relaxation rate (R1), apparent transverse relaxation rate (R2*) or magnetisation-transfer saturation (MTsat), involves inverting a highly non-linear function. Many methods for deriving parameter maps assume perfect measurements and do not consider how noise is propagated through the estimation procedure, resulting in needlessly noisy maps. Instead, we propose a probabilistic generative (forward) model of the entire dataset, which is formulated and inverted to jointly recover (log) parameter maps with a well-defined probabilistic interpretation (e.g., maximum likelihood or maximum a posteriori). The second order optimisation we propose for model fitting achieves rapid and stable convergence thanks to a novel approximate Hessian. We demonstrate the utility of our flexible framework in the context of recovering more accurate maps from data acquired using the popular multi-parameter mapping protocol. We also show how to incorporate a joint total variation prior to further decrease the noise in the maps, noting that the probabilistic formulation allows the uncertainty on the recovered parameter maps to be estimated. Our implementation uses a PyTorch backend and benefits from GPU acceleration. It is available at https://github.com/balbasty/nitorch.
Risk evaluation to identify individuals who are at greater risk of cancer as a result of heritable pathogenic variants is a valuable component of individualized clinical management. Using principles of Mendelian genetics, Bayesian probability theory, and variant-specific knowledge, Mendelian models derive the probability of carrying a pathogenic variant and developing cancer in the future, based on family history. Existing Mendelian models are widely employed, but are generally limited to specific genes and syndromes. However, the upsurge of multi-gene panel germline testing has spurred the discovery of many new gene-cancer associations that are not presently accounted for in these models. We have developed PanelPRO, a flexible, efficient Mendelian risk prediction framework that can incorporate an arbitrary number of genes and cancers, overcoming the computational challenges that arise because of the increased model complexity. We implement an eleven-gene, eleven-cancer model, the largest Mendelian model created thus far, based on this framework. Using simulations and a clinical cohort with germline panel testing data, we evaluate model performance, validate the reverse-compatibility of our approach with existing Mendelian models, and illustrate its usage. Our implementation is freely available for research use in the PanelPRO R package.
Smart metering infrastructures collect data almost continuously in the form of fine-grained long time series. These massive time series often have common daily patterns that are repeated between similar days or seasons and shared between grouped meters. Within this context, we propose a method to highlight individuals with abnormal daily dependency patterns, which we term evolution outliers. To this end, we approach the problem from the standpoint of Functional Data Analysis (FDA), by treating each daily record as a function or curve. We then focus on the morphological aspects of the observed curves, such as daily magnitude, daily shape, derivatives, and inter-day evolution. The proposed method for evolution outliers relies on the concept of functional depth, which has been a cornerstone in the literature of FDA to build shape and magnitude outlier detection methods. In conjunction with our evolution outlier proposal, these methods provide an outlier detection toolbox for smart meter data that covers a wide palette of functional outliers classes. We illustrate the outlier identification ability of this toolbox using actual smart metering data corresponding to photovoltaic energy generation and circuit voltage records.