Human Y-chromosome gene classification using Fractal Dimension & Shannon Entropy


Abstract in English

All genes on the human Y-chromosome were studied using fractal dimension (FD) and Shannon entropy (SE). Clear outlier clusters were identified. Among these were 6 sequences that have since been withdrawn as CDSs and 1 additional sequence that is not in the current assembly. A methodology was developed for ranking the sequences by their deviation from the average values of FD and SE. Sequences scoring among the 10% largest deviations were unusually likely to come from centromeric or pseudoautosomal regions and unusually unlikely to come from X-chromosome transposed regions. lncRNA sequences were also enriched among the outliers. In addition, expressed genes previously identified for evolutionary study tended not to deviate strongly from the average.

Keywords: Y-chromosome; Shannon di-nucleotide entropy; fractal dimension; centromeric genes; gene degradation; lncRNA
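The ranking idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact entropy estimator, the fractal dimension algorithm, or the scoring formula, so everything below is an assumption made for illustration: entropy is computed over overlapping di-nucleotides, FD values are treated as precomputed inputs, and the deviation score is the Euclidean distance of the (FD, SE) z-scores from the mean. The function names `dinucleotide_entropy`, `deviation_scores`, and `top_fraction` are hypothetical.

```python
import math
from collections import Counter


def dinucleotide_entropy(seq: str) -> float:
    """Shannon entropy (in bits) of overlapping di-nucleotide frequencies.

    Assumption: overlapping pairs, with pairs containing non-ACGT symbols skipped.
    """
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    pairs = [p for p in pairs if set(p) <= set("ACGT")]
    if not pairs:
        return 0.0
    total = len(pairs)
    counts = Counter(pairs)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def deviation_scores(fd_values, se_values):
    """Distance of each (FD, SE) pair from the mean, in z-score units.

    Assumption: this combined score is illustrative, not the paper's formula.
    """
    def z_scores(values):
        n = len(values)
        mean = sum(values) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
        # A constant column carries no deviation information.
        return [(v - mean) / sd if sd > 0 else 0.0 for v in values]

    return [math.hypot(a, b)
            for a, b in zip(z_scores(fd_values), z_scores(se_values))]


def top_fraction(names, scores, fraction=0.10):
    """Return the names whose scores fall in the largest `fraction` of deviations."""
    k = max(1, math.ceil(len(names) * fraction))
    ranked = sorted(zip(names, scores), key=lambda t: t[1], reverse=True)
    return [name for name, _ in ranked[:k]]
```

Given per-sequence FD values from whatever estimator is chosen and SE values from `dinucleotide_entropy`, calling `top_fraction(names, deviation_scores(fd, se))` would list the candidate outliers under these assumptions.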
