ﻻ يوجد ملخص باللغة العربية
Predicting DNA-protein binding is an important and classic problem in bioinformatics. Convolutional neural networks have outperformed conventional methods in modeling the sequence specificity of DNA-protein binding. However, none of the studies has utilized graph convolutional networks for motif inference. In this work, we propose to use graph convolutional networks for motif inference. We build a sequence k-mer graph for the whole dataset based on k-mer co-occurrence and k-mer sequence relationship and then learn DNA Graph Convolutional Network (DNA-GCN) for the whole dataset. Our DNA-GCN is initialized with a one-hot representation for all nodes, and it then jointly learns the embeddings for both k-mers and sequences, as supervised by the known labels of sequences. We evaluate our model on 50 datasets from ENCODE. DNA-GCN shows its competitive performance compared with the baseline model. Besides, we analyze our model and design several different architectures to help fit different datasets.
Background: Recent assays for individual-specific genome-wide DNA methylation profiles have enabled epigenome-wide association studies to identify specific CpG sites associated with a phenotype. Computational prediction of CpG site-specific methylati
Empirical scoring functions based on either molecular force fields or cheminformatics descriptors are widely used, in conjunction with molecular docking, during the early stages of drug discovery to predict potency and binding affinity of a drug-like
Drug discovery often relies on the successful prediction of protein-ligand binding affinity. Recent advances have shown great promise in applying graph neural networks (GNNs) for better affinity prediction by learning the representations of protein-l
We analyze the statistical properties of Poincare recurrences of Homo sapiens, mammalian and other DNA sequences taken from Ensembl Genome data base with up to fifteen billions base pairs. We show that the probability of Poincare recurrences decays i
Much of the complexity observed in gene regulation originates from cooperative protein-DNA binding. While studies of the target search of proteins for their specific binding sites on the DNA have revealed design principles for the quantitative charac