ﻻ يوجد ملخص باللغة العربية
In this paper, we aim to give a tutorial for undergraduate students studying statistical methods and/or bioinformatics. The students will learn how data visualization can help in genomic sequence analysis. Students start with a fragment of genetic text of a bacterial genome and analyze its structure. By means of principal component analysis they ``discover that the information in the genome is encoded by non-overlapping triplets. Next, they learn how to find gene positions. This exercise on PCA and K-Means clustering enables active study of the basic bioinformatics notions. Appendix 1 contains program listings that go along with this exercise. Appendix 2 includes 2D PCA plots of triplet usage in moving frame for a series of bacterial genomes from GC-poor to GC-rich ones. Animated 3D PCA plots are attached as separate gif files. Topology (cluster structure) and geometry (mutual positions of clusters) of these plots depends clearly on GC-content.
The Roma people, living throughout Europe, are a diverse population linked by the Romani language and culture. Previous linguistic and genetic studies have suggested that the Roma migrated into Europe from South Asia about 1000-1500 years ago. Geneti
The role of positive selection in human evolution remains controversial. On the one hand, scans for positive selection have identified hundreds of candidate loci and the genome-wide patterns of polymorphism show signatures consistent with frequent po
Because biological processes can make different loci have different evolutionary histories, species tree estimation requires multiple loci from across the genome. While many processes can result in discord between gene trees and species trees, incomp
The human-associated microbiome is closely tied to human health and is of substantial clinical interest. Metagenomics-based tools are emerging for clinical diagnostics, tracking the spread of diseases, and surveillance of potential pathogens. In some
Aggregating transcriptomics data across hospitals can increase sensitivity and robustness of differential expression analyses, yielding deeper clinical insights. As data exchange is often restricted by privacy legislation, meta-analyses are frequentl