Structure Space of Model Proteins: A Principal Component Analysis


Abstract in English

We study the space of all compact structures on a two-dimensional square lattice of size $N=6\times6$. Each structure is mapped onto a vector in $N$ dimensions according to a hydrophobic model. Previous work has shown that the designabilities of structures are closely related to the distribution of the structure vectors in the $N$-dimensional space, with highly designable structures predominantly found in low-density regions. We use principal component analysis to probe and characterize the distribution of structure vectors, and find a non-uniform density with a single peak. Interestingly, the principal axes of this peak are almost aligned with Fourier eigenvectors, and the corresponding Fourier eigenvalues go to zero continuously at the wave number for alternating patterns ($q=\pi$). These observations provide a stepping stone for an analytic description of the distribution of structural points, and open the possibility of estimating designabilities of realistic structures by simply Fourier transforming the hydrophobicities of the corresponding sequences.
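The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' code and rests on several assumptions: a $4\times4$ lattice is used instead of $6\times6$ so that exhaustive enumeration stays fast; the structure vector is assumed to be binary, with 1 for a core (interior) site and 0 for a surface site, read along the chain; all directed paths are kept, including symmetry-related and reversed ones; and the function names (`hamiltonian_paths`, `structure_vector`, `pca`, `dominant_wavenumber`) are hypothetical. The sketch enumerates compact structures, diagonalizes the covariance matrix of the structure vectors, and reports the dominant Fourier wave number of each leading principal axis.

```python
import numpy as np

def hamiltonian_paths(L):
    """Yield every directed self-avoiding walk that visits all L*L lattice sites."""
    def extend(path, visited):
        if len(path) == L * L:
            yield list(path)
            return
        x, y = path[-1]
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            if 0 <= nxt[0] < L and 0 <= nxt[1] < L and nxt not in visited:
                path.append(nxt)
                visited.add(nxt)
                yield from extend(path, visited)
                path.pop()
                visited.remove(nxt)
    for x in range(L):
        for y in range(L):
            yield from extend([(x, y)], {(x, y)})

def structure_vector(path, L):
    """Assumed mapping: 1 for a core (interior) site, 0 for a surface site."""
    return [1 if 0 < x < L - 1 and 0 < y < L - 1 else 0 for x, y in path]

def pca(vectors):
    """Return eigenvalues (descending) and matching eigenvectors of the covariance."""
    cov = np.cov(vectors, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: covariance is symmetric
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

def dominant_wavenumber(axis_vector):
    """Wave number q in [0, pi] carrying the most Fourier power in a vector."""
    power = np.abs(np.fft.rfft(axis_vector)) ** 2
    k = int(np.argmax(power))
    return 2 * np.pi * k / len(axis_vector)

L = 4  # illustrative size only; the abstract studies 6 x 6
paths = list(hamiltonian_paths(L))
vectors = np.array([structure_vector(p, L) for p in paths], dtype=float)
eigvals, eigvecs = pca(vectors)

for lam, axis in zip(eigvals[:5], eigvecs.T[:5]):
    print(f"variance {lam:.4f}   dominant q = {dominant_wavenumber(axis):.3f}")
```

Because every compact structure has the same number of core sites, the uniform direction carries no variance and shows up as a zero eigenvalue. The chain is open rather than periodic, so the principal axes can only be approximately aligned with Fourier modes, which is consistent with the "almost aligned" wording above.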
