CoHSI II; The average length of proteins, evolutionary pressure and eukaryotic fine structure


Abstract in English

The CoHSI (Conservation of Hartley-Shannon Information) distribution is at the heart of a wide class of discrete systems, defining (amongst other properties) the length distribution of their components. Discrete systems such as the known proteome, computer software and texts are all known to fit this distribution accurately. In a previous paper, we explored the properties of this distribution in detail. Here we use these properties to show why the average length of components in general, and of proteins in particular, is highly conserved, howsoever measured, demonstrating this on various aggregations of proteins taken from the UniProt database. We then go on to define departures from this equilibrium state, identifying fine structure in the average length of eukaryotic proteins that results from evolutionary processes.
