The amino acid sequences of proteins provide rich information for inferring distant phylogenetic relationships and for predicting protein functions. Estimating the rate matrix of residue substitutions from amino acid sequences is also important because the rate matrix can be used to develop scoring matrices for sequence alignment. Here we use a continuous-time Markov process to model the substitution rates of residues and develop a Bayesian Markov chain Monte Carlo method for rate estimation. We validate our method using simulated artificial protein sequences. Because different local regions, such as binding surfaces and the protein interior core, experience different selection pressures due to functional or stability constraints, we use our method to estimate the substitution rates of local regions. Our results show that the substitution rates are very different for residues in the buried core and residues on the solvent-exposed surfaces. In addition, residues on the binding surfaces also have very different substitution rates from residues in the rest of the protein. Based on these findings, we further develop a method for protein function prediction by surface matching using scoring matrices derived from estimated substitution rates for residues located on the binding surfaces. We show with examples that our method is effective in identifying functionally related proteins that have overall low sequence identity, a task known to be very challenging.
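To illustrate how a substitution rate matrix can lead to an alignment scoring matrix, the following is a minimal sketch under stated assumptions, not the estimation procedure described above. It assumes a time-reversible continuous-time Markov model on a toy three-letter alphabet, with made-up exchangeabilities `EXCH` and stationary frequencies `FREQS` (both hypothetical, not estimated from any data). The transition probabilities are taken as P(t) = exp(Qt), and the scores as the common log-odds form log2(P_ij(t) / pi_j). The Bayesian Markov chain Monte Carlo estimation of the rates themselves is not shown.

```python
import math

def build_rate_matrix(exch, freqs):
    """Reversible rate matrix: q_ij = s_ij * pi_j for i != j; each row sums to zero."""
    n = len(freqs)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                q[i][j] = exch[i][j] * freqs[j]
        q[i][i] = -sum(q[i])  # diagonal is still 0.0 here, so this sums the off-diagonals
    return q

def mat_mul(a, b):
    """Plain dense matrix product for small square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transition_matrix(q, t, squarings=10, terms=12):
    """Approximate P(t) = exp(Q t) by a Taylor series with scaling and squaring."""
    n = len(q)
    scale = t / (2 ** squarings)
    a = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, a)]  # a^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)  # undo the scaling: exp(A)^2 = exp(2A)
    return result

def log_odds_scores(p, freqs):
    """Log-odds score s_ij = log2(P_ij(t) / pi_j), in bits."""
    n = len(freqs)
    return [[math.log2(p[i][j] / freqs[j]) for j in range(n)] for i in range(n)]

# Hypothetical toy inputs (assumptions for illustration only).
FREQS = [0.5, 0.3, 0.2]            # stationary frequencies, sum to 1
EXCH = [[0.0, 1.0, 0.4],           # symmetric exchangeabilities
        [1.0, 0.0, 2.0],
        [0.4, 2.0, 0.0]]

Q = build_rate_matrix(EXCH, FREQS)
P = transition_matrix(Q, t=0.5)
SCORES = log_odds_scores(P, FREQS)

for row in SCORES:
    print(" ".join(f"{s:7.3f}" for s in row))
```

Because the toy model is reversible, the resulting score matrix is symmetric, which is what an alignment scoring matrix requires. Estimating separate rate matrices for different local regions (for example buried core versus binding surface) would, under this sketch, simply yield a different `Q` and hence a different score matrix per region.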
In this work, we developed an efficient approach to compute ensemble averages in systems with pairwise-additive energetic interactions between the entities. Methods involving full enumeration of the configuration space result in exponential complexit
The twenty protein coding amino acids are found in proteomes with different relative abundances. The most abundant amino acid, leucine, is nearly an order of magnitude more prevalent than the least abundant amino acid, cysteine. Amino acid metabolic
The correlations of primary and secondary structures were analyzed using proteins with known structure from Protein Data Bank. The correlation values of amino acid type and the eight secondary structure types at distant position were calculated for d
Surface-enhanced Raman spectroscopy (SERS) is a sensitive label-free optical method that can provide fingerprint Raman spectra of biomolecules such as DNA, amino acids and proteins. While SERS of single DNA molecule has been recently demonstrated, Ra
A deep neural network based architecture was constructed to predict amino acid side chain conformation with unprecedented accuracy. Amino acid side chain conformation prediction is essential for protein homology modeling and protein design. Current w