ترغب بنشر مسار تعليمي؟ اضغط هنا

Prediction of genomic properties and classification of life by protein length distributions

214   0   0.0 ( 0 )
 نشر من قبل Dirson Jian Li
 تاريخ النشر 2008
  مجال البحث علم الأحياء
والبحث باللغة English




اسأل ChatGPT حول البحث

Much evolutionary information is stored in the fluctuations of protein length distributions. The genome size and non-coding DNA content can be calculated based only on the protein length distributions. So there is intrinsic relationship between the coding DNA size and non-coding DNA size. According to the correlations and quasi-periodicity of protein length distributions, we can classify life into three domains. Strong evidences are found to support the order in the structures of protein length distributions.



قيم البحث

اقرأ أيضاً

The classification of life should be based upon the fundamental mechanism in the evolution of life. We found that the global relationships among species should be circular phylogeny, which is quite different from the common sense based upon phylogene tic trees. The genealogical circles can be observed clearly according to the analysis of protein length distributions of contemporary species. Thus, we suggest that domains can be defined by distinguished phylogenetic circles, which are global and stable characteristics of living systems. The mechanism in genome size evolution has been clarified; hence main component questions on C-value enigma can be explained. According to the correlations and quasi-periodicity of protein length distributions, we can also classify life into three domains.
An important task of human genetics studies is to accurately predict disease risks in individuals based on genetic markers, which allows for identifying individuals at high disease risks, and facilitating their disease treatment and prevention. Altho ugh hundreds of genome-wide association studies (GWAS) have been conducted on many complex human traits in recent years, there has been only limited success in translating these GWAS data into clinically useful risk prediction models. The predictive capability of GWAS data is largely bottlenecked by the available training sample size due to the presence of numerous variants carrying only small to modest effects. Recent studies have shown that different human traits may share common genetic bases. Therefore, an attractive strategy to increase the training sample size and hence improve the prediction accuracy is to integrate data of genetically correlated phenotypes. Yet the utility of genetic correlation in risk prediction has not been explored in the literature. In this paper, we analyzed GWAS data for bipolar and related disorders (BARD) and schizophrenia (SZ) with a bivariate ridge regression method, and found that jointly predicting the two phenotypes could substantially increase prediction accuracy as measured by the AUC (area under the receiver operating characteristic curve). We also found similar prediction accuracy improvements when we jointly analyzed GWAS data for Crohns disease (CD) and ulcerative colitis (UC). The empirical observations were substantiated through our comprehensive simulation studies, suggesting that a gain in prediction accuracy can be obtained by combining phenotypes with relatively high genetic correlations. Through both real data and simulation studies, we demonstrated pleiotropy as a valuable asset that opens up a new opportunity to improve genetic risk prediction in the future.
The phenotypic consequences of individual mutations are modulated by the wild type genetic background in which they occur.Although such background dependence is widely observed, we do not know whether general patterns across species and traits exist, nor about the mechanisms underlying it. We also lack knowledge on how mutations interact with genetic background to influence gene expression, and how this in turn mediates mutant phenotypes. Furthermore, how genetic background influences patterns of epistasis remains unclear. To investigate the genetic basis and genomic consequences of genetic background dependence of the scallopedE3 allele on the Drosophila melanogaster wing, we generated multiple novel genome level datasets from a mapping by introgression experiment and a tagged RNA gene expression dataset. In addition we used whole genome re-sequencing of the parental lines two commonly used laboratory strains to predict polymorphic transcription factor binding sites for SD. We integrated these data with previously published genomic datasets from expression microarrays and a modifier mutation screen. By searching for genes showing a congruent signal across multiple datasets, we were able to identify a robust set of candidate loci contributing to the background dependent effects of mutations in sd. We also show that the majority of background-dependent modifiers previously reported are caused by higher-order epistasis, not quantitative non-complementation. These findings provide a useful foundation for more detailed investigations of genetic background dependence in this system, and this approach is likely to prove useful in exploring the genetic basis of other traits as well.
We present a combined mean-field and simulation approach to different models describing the dynamics of classes formed by elements that can appear, disappear or copy themselves. These models, related to a paradigm duplication-innovation model known a s Chinese Restaurant Process, are devised to reproduce the scaling behavior observed in the genome-wide repertoire of protein domains of all known species. In view of these data, we discuss the qualitative and quantitative differences of the alternative model formulations, focusing in particular on the roles of element loss and of the specificity of empirical domain classes.
In protein-protein interaction networks certain topological properties appear to be recurrent: networks maps are considered scale-free. It is possible that this topology is reflected in the protein structure. In this paper we investigate the role of protein disorder in the network topology. We find that the disorder of a protein (or of its neighbors) is independent of its number of protein-protein interactions. This result suggests that protein disorder does not play a role in the scale-free architecture of protein networks.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا