The genomic ssRNA of coronaviruses is packaged within a helical nucleocapsid. Owing to the transitional symmetry of a helix, weakly specific cooperative interactions between the ssRNA and the nucleocapsid proteins lead to the natural selection of specific quasi-periodic assembly/packaging signals in the genomic sequence. Such signals, coordinated with the helical structure of the nucleocapsid, were detected and reconstructed in the genomes of the coronaviruses SARS-CoV and SARS-CoV-2. The main period of the signals for both viruses was about 54 nt, which implies 6.75 nt per N protein. Complete coverage of an ssRNA genome of about 30,000 nt by the nucleocapsid would require about 4,400 N proteins, which makes them the most abundant of the structural proteins. The motif repertoires of SARS-CoV and SARS-CoV-2 were divergent but nearly coincided among different isolates of SARS-CoV-2. We obtained the distributions of assembly/packaging signals over the genomes using non-overlapping windows of width 432 nt. Finally, using the spectral entropy, we compared the load accumulated from point mutations and indels over the lifetime of each virus for SARS-CoV and SARS-CoV-2. We found a higher mutational load in SARS-CoV. In this sense, SARS-CoV-2 can be treated as a newborn virus. These observations may be helpful in practical medical applications and are of basic interest.
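The figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated above (54 nt period, 6.75 nt per N protein, a genome of about 30,000 nt, 432 nt windows). The function names are illustrative, and the value of 8 N proteins per period is inferred from 54 / 6.75 rather than stated in the abstract.

```python
# Values quoted in the abstract.
PERIOD_NT = 54            # main period of the packaging signals
NT_PER_N_PROTEIN = 6.75   # nucleotides covered by one N protein
GENOME_LENGTH_NT = 30_000 # approximate genome length
WINDOW_NT = 432           # width of the non-overlapping windows


def proteins_per_period(period_nt: float, nt_per_protein: float) -> float:
    """Number of N proteins spanning one signal period (inferred, not quoted)."""
    return period_nt / nt_per_protein


def proteins_for_full_coverage(genome_nt: float, nt_per_protein: float) -> float:
    """Number of N proteins needed to cover the whole genome."""
    return genome_nt / nt_per_protein


def full_windows(genome_nt: int, window_nt: int) -> int:
    """Number of complete non-overlapping windows that fit in the genome."""
    return genome_nt // window_nt


if __name__ == "__main__":
    print(proteins_per_period(PERIOD_NT, NT_PER_N_PROTEIN))
    print(round(proteins_for_full_coverage(GENOME_LENGTH_NT, NT_PER_N_PROTEIN)))
    print(WINDOW_NT // PERIOD_NT, full_windows(GENOME_LENGTH_NT, WINDOW_NT))
```

Under these numbers, a 432 nt window holds exactly eight 54 nt periods, and the coverage estimate of roughly 4,444 proteins is consistent with the rounded figure of about 4,400 given in the abstract.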
The worldwide COVID-19 pandemic has strongly intensified studies of the molecular mechanisms related to coronaviruses. The origin of coronaviruses and the risks of human-to-human, animal-to-human, and human-to-animal transmission of coronaviral infections can be understood only on a broader evolutionary level through detailed comparative studies. In this paper, we studied ribonucleocapsid assembly-packaging signals (RNAPS) in the genomes of all seven known pathogenic human coronaviruses (SARS-CoV, SARS-CoV-2, MERS-CoV, HCoV-OC43, HCoV-HKU1, HCoV-229E, and HCoV-NL63) and compared them with RNAPS in the genomes of related animal coronaviruses, including SARS-Bat-CoV, MERS-Camel-CoV, MHV, Bat-CoV MOP1, TGEV, and one of the camel alphacoronaviruses. RNAPS in the genomes of coronaviruses evolved through weakly specific interactions between genomic RNA and N proteins in helical nucleocapsids. Combining transitional genome mapping with Jaccard correlation coefficients allows us to perform the analysis directly in terms of the underlying motifs distributed over the genome. In all coronaviruses, RNAPS were distributed quasi-periodically over the genome with a period of about 54 nt, shifted toward 57 nt for genomes longer than that of SARS-CoV and toward 51 nt for shorter genomes. Comparison with the experimentally verified packaging signals for MERS-CoV, MHV, and TGEV showed that the distribution of particular motifs is strongly correlated with the packaging signals. We also found that many motifs were highly conserved in both character and genomic position throughout the lineages, which makes them promising therapeutic targets. The mechanisms of encapsidation can also affect recombination and co-infection.
The coronavirus disease (COVID-19) pandemic, caused by the coronavirus SARS-CoV-2, has led to 60 million infections and 1.38 million fatalities. Genomic analysis of SARS-CoV-2 can provide insights into drug design and vaccine development for controlling the pandemic. Inverted repeats in a genome greatly affect the stability of the genome structure and regulate gene expression. Inverted repeats are involved in cellular evolution and genetic diversity, genome arrangements, and diseases. Here, we investigate the inverted repeats in the SARS-CoV-2 genome. We found that the SARS-CoV-2 genome has an abundance of inverted repeats, located mainly in the gene of the Spike protein. This result suggests that the Spike protein gene undergoes recombination events and is therefore essential for fast evolution. Comparison of the inverted repeat signatures in human and bat coronaviruses suggests that SARS-CoV-2 is most closely related to the SARS-related coronavirus SARSr-CoV/RaTG13. The study also reveals that the recently identified SARS-related coronavirus SARSr-CoV/RmYN02 has a large number of inverted repeats in the Spike protein gene. In addition, this study demonstrates that the distribution of inverted repeats in a genome can be considered a genomic signature. This study highlights the significance of inverted repeats in the evolution of SARS-CoV-2 and presents inverted repeats as a genomic signature for genome analysis.
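For readers unfamiliar with the term, an inverted repeat is a short sequence followed downstream by its reverse complement, possibly after a gap. The abstract does not say how the repeats were detected, so the following is only a minimal brute-force sketch, not the authors' method. The function names, the `arm_len` and `max_gap` parameters, and the use of the DNA alphabet (A, C, G, T) are assumptions made for illustration.

```python
# Complement table for the DNA alphabet (assumption: the genome is given as cDNA).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq: str, arm_len: int = 4, max_gap: int = 10):
    """Brute-force search for inverted repeats.

    Returns a list of (left_start, right_start, left_arm) tuples, 0-based,
    where the right arm is the reverse complement of the left arm and starts
    at most `max_gap` nucleotides after the left arm ends.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - arm_len + 1):
        left = seq[i:i + arm_len]
        target = reverse_complement(left)
        first = i + arm_len                                  # gap of zero
        last = min(first + max_gap, len(seq) - arm_len)      # largest allowed start
        for j in range(first, last + 1):
            if seq[j:j + arm_len] == target:
                hits.append((i, j, left))
    return hits


if __name__ == "__main__":
    # "ACGG" at position 0 pairs with its reverse complement "CCGT" at position 7.
    print(find_inverted_repeats("ACGGTTTCCGT", arm_len=4, max_gap=10))
```

Counting such hits per gene, or per fixed-width window, would give a crude distribution of inverted repeats along a genome. A real analysis would need longer arms, tolerance for mismatches, and a faster algorithm than this quadratic scan.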
We examine a pair of graph generative models for the therapeutic design of novel drug candidates targeting SARS-CoV-2 viral proteins. Due to a sense of urgency, we chose well-validated models with unique strengths: an autoencoder that generates molecules structurally similar to a dataset of drugs with anti-SARS activity, and a reinforcement learning algorithm that generates highly novel molecules. During generation, we explore optimization toward several design targets to balance druglikeness, synthetic accessibility, and anti-SARS activity based on IC50. This generative framework (https://github.com/exalearn/covid-drug-design) will accelerate drug discovery in future pandemics through the high-throughput generation of targeted therapeutic candidates.
The emerging global infectious coronavirus disease COVID-19, caused by the novel Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), has posed critical threats to global public health and the economy since it was identified in late December 2019 in China. The virus has gone through various pathways of evolution. Genotyping of virus isolates is of great importance for understanding the evolution and transmission of SARS-CoV-2. We present an accurate method for genotyping SARS-CoV-2 viruses using complete genomes. The method employs multiple sequence alignments of the genome isolates with the SARS-CoV-2 reference genome. The SNP genotypes are then compared using Jaccard distances to track the relationships among virus isolates. The genotyping analysis of SARS-CoV-2 isolates from around the globe reveals that specific multiple mutations are the predominant mutation type during the current epidemic. Our method serves as a promising tool for monitoring and tracking epidemics of pathogenic viruses through their gradual and local genetic variations. The genotyping analysis shows that the genes encoding the S protein, RNA polymerase, RNA primase, and nucleoprotein undergo frequent mutations. These mutations are critical for vaccine development in disease control.
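The comparison step described here can be illustrated with a short sketch. It assumes, for illustration only, that each isolate has already been aligned to the reference so that both strings have the same length, and that a genotype is represented as the set of (position, variant base) pairs at which the isolate differs from the reference. The function names and this representation are hypothetical and are not taken from the paper.

```python
def snp_genotype(reference: str, isolate: str) -> set:
    """Set of (1-based position, variant base) pairs where the aligned isolate
    differs from the reference. Gaps and ambiguous bases are ignored."""
    if len(reference) != len(isolate):
        raise ValueError("sequences must be aligned to the same length")
    return {
        (pos, base)
        for pos, (ref, base) in enumerate(zip(reference.upper(), isolate.upper()), start=1)
        if base != ref and base in "ACGT" and ref in "ACGT"
    }


def jaccard_distance(a: set, b: set) -> float:
    """Jaccard distance 1 - |A & B| / |A | B|; defined as 0.0 for two empty sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


if __name__ == "__main__":
    reference = "ACGTACGT"
    isolate_1 = "ACGTTCGT"   # one SNP at position 5
    isolate_2 = "ACCTTCGT"   # SNPs at positions 3 and 5
    g1 = snp_genotype(reference, isolate_1)
    g2 = snp_genotype(reference, isolate_2)
    print(sorted(g1), sorted(g2), jaccard_distance(g1, g2))
```

In this toy example the two isolates share one of two distinct SNPs, so their Jaccard distance is 0.5. Isolates with identical SNP sets have distance 0, and isolates with no SNPs in common have distance 1.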
CoV2019 has evolved to be much more dangerous than CoV2003. Experiments suggest that structural rearrangements dramatically enhance CoV2019 activity. Using biomolecular evolutionary theory to identify sequence differences that enhance viral attachment rates, we identify a new first stage of infection that precedes these structural rearrangements. We find a small cluster of mutations showing that CoV2019 has a new feature that promotes much stronger viral attachment and enhances contagiousness. The extremely dangerous dynamics of human coronavirus infection are a dramatic example of the evolutionary approach of self-organized networks to criticality. This may favor a very successful vaccine. The identified mutations can be used to test the present theory experimentally.