ﻻ يوجد ملخص باللغة العربية
Association testing aims to discover the underlying relationship between genotypes (usually Single Nucleotide Polymorphisms, or SNPs) and phenotypes (attributes, or traits). The typically large data sets used in association testing often contain missing values. Standard statistical methods either impute the missing values using relatively simple assumptions, or delete them, or both, which can generate biased results. Here we describe the Bayesian hierarchical model BAMD (Bayesian Association with Missing Data). BAMD is a Gibbs sampler, in which missing values are multiply imputed based upon all of the available information in the data set. We estimate the parameters and prove that updating one SNP at each iteration preserves the ergodic property of the Markov chain, and at the same time improves computational speed. We also implement a model selection option in BAMD, which enables potential detection of SNP interactions. Simulations show that unbiased estimates of SNP effects are recovered with missing genotype data. Also, we validate associations between SNPs and a carbon isotope discrimination phenotype that were previously reported using a family based method, and discover an additional SNP associated with the trait. BAMD is available as an R-package from http://cran.r-project.org/package=BAMD
We provide a view on high-dimensional statistical inference for genome-wide association studies (GWAS). It is in part a review but covers also new developments for meta analysis with multiple studies and novel software in terms of an R-package hierin
We develop a distribution-free, unsupervised anomaly detection method called ECAD, which wraps around any regression algorithm and sequentially detects anomalies. Rooted in conformal prediction, ECAD does not require data exchangeability but approxim
Background: All-in-one station-based health monitoring devices are implemented in elder homes in Hong Kong to support the monitoring of vital signs of the elderly. During a pilot study, it was discovered that the systolic blood pressure was incorrect
The autoregressive (AR) model is a widely used model to understand time series data. Traditionally, the innovation noise of the AR is modeled as Gaussian. However, many time series applications, for example, financial time series data, are non-Gaussi
During the semiconductor manufacturing process, predicting the yield of the semiconductor is an important problem. Early detection of defective product production in the manufacturing process can save huge production cost. The data generated from the