Chromosome conformation capture experiments such as Hi-C are used to map the three-dimensional spatial organization of genomes. One specific feature of this 3D organization is topologically associating domains (TADs): densely interacting, contiguous chromatin regions that play important roles in regulating gene expression. A few algorithms have been proposed to detect TADs. In particular, the structure of Hi-C data naturally inspires the application of community detection methods. However, a drawback of community detection is that most methods take exchangeability of the nodes in the network for granted, whereas the nodes in this case, i.e., the positions on the chromosomes, are not exchangeable. We propose a network model for detecting TADs from Hi-C data that takes this non-exchangeability into account. In addition, our model explicitly uses cell-type-specific CTCF binding sites as biological covariates and can be used to identify TADs conserved across multiple cell types. The model leads to a likelihood objective that can be efficiently optimized via relaxation. We also prove that, when suitably initialized, this model finds the underlying TAD structure with high probability. Using simulated data, we show the advantages of our method and the caveats of popular community detection methods, such as spectral clustering, in this application. Applying our method to real Hi-C data, we demonstrate that the identified domains have desirable epigenetic features, and we compare them across different cell types.
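To make the non-exchangeability point concrete, the following is a minimal sketch of what a contiguity constraint looks like in practice. It is not the model proposed in the abstract (which uses a likelihood objective with CTCF covariates, optimized via relaxation); it swaps in a simple dynamic program over a hypothetical per-pair score, `A[a, b] - gamma`, and only allows domains that are contiguous runs of bins. The function name `segment_contiguous` and the threshold `gamma` are assumptions introduced for illustration.

```python
import numpy as np

def segment_contiguous(A, gamma):
    """Partition bins 0..n-1 into contiguous domains by dynamic programming.

    Illustrative objective (an assumption, not the paper's likelihood):
    a domain [i, j] scores the sum of (A[a, b] - gamma) over pairs a < b in it.
    """
    n = A.shape[0]
    # Strict upper triangle of the shifted matrix, so each pair is counted once.
    U = np.triu(A - gamma, k=1)
    # 2D prefix sums give the score of any block in O(1).
    P = np.zeros((n + 1, n + 1))
    P[1:, 1:] = U.cumsum(axis=0).cumsum(axis=1)

    def score(i, j):
        # Sum of U over rows i..j and columns i..j (inclusive).
        return P[j + 1, j + 1] - P[i, j + 1] - P[j + 1, i] + P[i, i]

    best = np.full(n + 1, -np.inf)  # best[j]: optimal score for bins 0..j-1
    best[0] = 0.0
    start = np.zeros(n + 1, dtype=int)
    for j in range(1, n + 1):
        for i in range(j):  # last domain covers bins i..j-1
            s = best[i] + score(i, j - 1)
            if s > best[j]:
                best[j], start[j] = s, i

    # Backtrack from the last bin to recover the domain boundaries.
    domains = []
    j = n
    while j > 0:
        i = int(start[j])
        domains.append((i, j - 1))
        j = i
    return domains[::-1]

# Toy contact matrix with two contiguous domains: bins 0..3 and bins 4..9.
A = np.zeros((10, 10))
A[:4, :4] = 1.0
A[4:, 4:] = 1.0
print(segment_contiguous(A, gamma=0.5))  # [(0, 3), (4, 9)]
```

Because the search space contains only contiguous partitions, relabeling or permuting the bins changes the answer, which is exactly the property that exchangeable community detection methods such as spectral clustering do not encode.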
Recent chromosome conformation capture experiments have led to the discovery of dense, contiguous, megabase-sized topological domains that are similar across cell types and conserved across species. These domains are strongly correlated with a number
In this paper we propose network methodology to infer prognostic cancer biomarkers based on the epigenetic pattern DNA methylation. Epigenetic processes such as DNA methylation reflect environmental risk factors, and are increasingly recognised for t
Alzheimer's disease (AD) is the most common cause of dementia and the fifth-leading cause of death among elderly people. Given its high genetic heritability (79%), finding disease-causal genes is a crucial step toward finding treatments for AD. Following the Interna
Recent technological breakthroughs in spatial molecular profiling have enabled the comprehensive molecular characterization of single cells while preserving spatial information. This provides new opportunities to delineate how cells from different origins
When dealing with large scale gene expression studies, observations are commonly contaminated by unwanted variation factors such as platforms or batches. Not taking this unwanted variation into account when analyzing the data can lead to spurious ass