
Protein Structure Parameterization via Mobius Distributions on the Torus

Published by Mohammad Arashi
Publication date: 2020
Language: English





Proteins constitute a large group of macromolecules with a multitude of functions for all living organisms. Proteins achieve this by adopting distinct three-dimensional structures encoded by the sequence of their constituent amino acids in one or more polypeptides. In this paper, the statistical modelling of the protein backbone torsion angles is considered. Two new distributions are proposed for toroidal data by applying the Mobius transformation to the bivariate von Mises distribution. Marginal and conditional distributions in addition to sine-skew
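As a rough illustration of the kind of map involved, the sketch below applies a Möbius transformation of the unit circle to each of two backbone torsion angles. The abstract does not give the authors' parameterization, so the form used here, z -> (z + a) / (1 + conj(a) z) with |a| < 1, and the choice to transform each angle separately are assumptions; the function names `mobius_circle` and `mobius_torus` are hypothetical.

```python
import cmath


def mobius_circle(theta, a):
    """Map an angle through z -> (z + a) / (1 + conj(a) * z), assuming |a| < 1.

    This map sends the unit circle to itself, so the result is again an angle.
    """
    if abs(a) >= 1:
        raise ValueError("a must lie strictly inside the unit disc")
    z = cmath.exp(1j * theta)              # point on the unit circle
    w = (z + a) / (1 + a.conjugate() * z)  # transformed point, still |w| = 1
    return cmath.phase(w)                  # angle in [-pi, pi]


def mobius_torus(phi, psi, a_phi, a_psi):
    """Apply a circle Mobius map to each torsion angle separately (an assumption)."""
    return mobius_circle(phi, a_phi), mobius_circle(psi, a_psi)
```

With `a = 0` the map is the identity, and applying the map with `-a` undoes the map with `a`, which gives a quick sanity check on any chosen parameters.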




Read also

Metabolic heterogeneity is widely recognised as the next challenge in our understanding of non-genetic variation. A growing body of evidence suggests that metabolic heterogeneity may result from the inherent stochasticity of intracellular events. However, metabolism has been traditionally viewed as a purely deterministic process, on the basis that highly abundant metabolites tend to filter out stochastic phenomena. Here we bridge this gap with a general method for prediction of metabolite distributions across single cells. By exploiting the separation of time scales between enzyme expression and enzyme kinetics, our method produces estimates for metabolite distributions without the lengthy stochastic simulations that would typically be required for large metabolic models. The metabolite distributions take the form of Gaussian mixture models that are directly computable from single-cell expression data and standard deterministic models for metabolic pathways. The proposed mixture models provide a systematic method to predict the impact of biochemical parameters on metabolite distributions. Our method lays the groundwork for identifying the molecular processes that shape metabolic heterogeneity and its functional implications in disease.
Inferring the structural properties of a protein from its amino acid sequence is a challenging yet important problem in biology. Structures are not known for the vast majority of protein sequences, but structure is critical for understanding function. Existing approaches for detecting structural similarity between proteins from sequence are unable to recognize and exploit structural patterns when sequences have diverged too far, limiting our ability to transfer knowledge between structurally related proteins. We take a new approach to this problem through the lens of representation learning. We introduce a framework that maps any protein sequence to a sequence of vector embeddings (one per amino acid position) that encode structural information. We train bidirectional long short-term memory (LSTM) models on protein sequences with a two-part feedback mechanism that incorporates information from (i) global structural similarity between proteins and (ii) pairwise residue contact maps for individual proteins. To enable learning from structural similarity information, we define a novel similarity measure between arbitrary-length sequences of vector embeddings based on a soft symmetric alignment (SSA) between them. Our method is able to learn useful position-specific embeddings despite lacking direct observations of position-level correspondence between sequences. We show empirically that our multi-task framework outperforms other sequence-based methods and even a top-performing structure-based alignment method when predicting structural similarity, our goal. Finally, we demonstrate that our learned embeddings can be transferred to other protein sequence problems, improving the state-of-the-art in transmembrane domain prediction.
As the infection of 2019-nCoV coronavirus is quickly developing into a global pneumonia epidemic, careful analysis of its transmission and cellular mechanisms is sorely needed. In this report, we re-analyzed the computational approaches and findings presented in two recent manuscripts by Ji et al. (https://doi.org/10.1002/jmv.25682) and by Pradhan et al. (https://doi.org/10.1101/2020.01.30.927871), which concluded that snakes are the intermediate hosts of 2019-nCoV and that the 2019-nCoV spike protein insertions shared a unique similarity to HIV-1. Results from our re-implementation of the analyses, built on larger-scale datasets using state-of-the-art bioinformatics methods and databases, do not support the conclusions proposed by these manuscripts. Based on our analyses and existing data of coronaviruses, we concluded that the intermediate hosts of 2019-nCoV are more likely to be mammals and birds than snakes, and that the novel insertions observed in the spike protein are naturally evolved from bat coronaviruses.
Proteins are an important class of biomolecules that serve as essential building blocks of cells. Their three-dimensional structures are responsible for their functions. In this thesis we have investigated protein structures using a network-theoretical approach. In doing so we used a coarse-grained method, viz., complex network analysis. We model protein structures at two length scales, as Protein Contact Networks (PCNs) and as Long-range Interaction Networks (LINs). We found that proteins, by virtue of being characterised by a high degree of clustering, are small-world networks. Apart from the small-world nature, we found that proteins have another general property, viz., assortativity. This is an interesting and exceptional finding, as all other complex networks (except for social networks) are known to be disassortative. Importantly, we could identify one of the major topological determinants of assortativity by building appropriate controls.
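The two network properties named in the last abstract, clustering and assortativity, can be computed on a contact network in a few lines. The sketch below is only an illustration under stated assumptions: the abstract does not say how contacts are defined, so the distance `cutoff`, the use of one 3D coordinate per residue, and all function names are hypothetical. Clustering is the mean local clustering coefficient, and assortativity is the Pearson correlation of the degrees at the two ends of each edge.

```python
import math
from itertools import combinations


def contact_network(coords, cutoff):
    """Adjacency sets: residues i and j are linked when their distance <= cutoff.

    Both the cutoff and the one-point-per-residue input are assumptions.
    """
    adj = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if math.dist(coords[i], coords[j]) <= cutoff:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 contribute 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges among the neighbours of this node.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj) if adj else float("nan")


def degree_assortativity(adj):
    """Pearson correlation of degrees across edge ends; NaN if undefined."""
    xs, ys = [], []
    for u, nbrs in adj.items():
        for v in nbrs:  # each undirected edge is visited in both directions
            xs.append(len(adj[u]))
            ys.append(len(adj[v]))
    if not xs:
        return float("nan")
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    if var == 0:
        return float("nan")  # all nodes have the same degree
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / len(xs)
    return cov / var
```

A positive value from `degree_assortativity` means high-degree residues tend to contact other high-degree residues; a negative value indicates a disassortative network.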