Design of new drug compounds with target properties is a key area of research in generative modeling. We present a small drug molecule design pipeline based on graph-generative models and a comparison study of two state-of-the-art graph generative models for designing COVID-19 targeted drug candidates: 1) a variational autoencoder-based approach (VAE) that uses prior knowledge of molecules that have been shown to be effective for earlier coronavirus treatments and 2) a deep Q-learning method (DQN) that generates optimized molecules without any proximity constraints. We evaluate the novelty of the automated molecule generation approaches by validating the candidate molecules with drug-protein binding affinity models. The VAE method produced two novel molecules with similar structures to the antiretroviral protease inhibitor Indinavir that show potential binding affinity for the SARS-CoV-2 protein target 3-chymotrypsin-like protease (3CL-protease).
Over the past several months, COVID-19 has spread across the globe and caused severe harm to people and society. In this situation, an effective method for generating potential drugs is highly valuable. In this paper, we present a methodology for discovering potential drugs for the treatment of Severe Acute Respiratory Syndrome Coronavirus 2 (commonly known as SARS-CoV-2). We propose a new model, the Genetic Constrained Graph Variational Autoencoder (GCGVAE), to address this problem. We trained the model on protein-structure data for various viruses, including SARS, HIV, Hep3, and MERS, and used it to generate candidate drugs for SARS-CoV-2. Several optimization techniques, including valency masking and a genetic algorithm, are used to fine-tune the model. According to our simulations, the generated molecules are highly effective at inhibiting SARS-CoV-2. We computed scores for the generated molecules and compared them with the scores of existing drugs; the generated molecules score much better than the existing drugs. Moreover, given a virus's protein structure, our model can also be applied to generate candidate drugs for other viruses, including viruses that emerge in the future.
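As a rough illustration of valency masking, the sketch below marks which atoms a focus atom may still bond to without exceeding a maximum valence. It is a minimal sketch under stated assumptions, not the GCGVAE implementation: the function name `valency_mask`, the `MAX_VALENCE` table, and the toy atom list are all hypothetical, and the abstract does not describe how the model applies its mask.

```python
from typing import List

# Typical maximum valences for a few elements (a simplification;
# charged atoms and hypervalent cases are ignored in this sketch).
MAX_VALENCE = {"H": 1, "O": 2, "N": 3, "C": 4}


def valency_mask(atoms: List[str], used: List[int],
                 focus: int, bond_order: int) -> List[bool]:
    """Return True for each atom that `focus` may bond to with `bond_order`.

    atoms      -- element symbol of every atom in the partial molecule
    used       -- valence already consumed by existing bonds, per atom
    focus      -- index of the atom currently being extended
    bond_order -- 1 for a single bond, 2 for a double bond, and so on
    """
    def room(i: int) -> int:
        # Remaining valence of atom i.
        return MAX_VALENCE[atoms[i]] - used[i]

    return [
        i != focus and room(focus) >= bond_order and room(i) >= bond_order
        for i in range(len(atoms))
    ]


# Hypothetical partial molecule: the carbon has one free valence left,
# the oxygen is saturated, the nitrogen and hydrogen still have room.
atoms = ["C", "O", "N", "H"]
used = [3, 2, 1, 0]

single_bond_mask = valency_mask(atoms, used, focus=0, bond_order=1)
double_bond_mask = valency_mask(atoms, used, focus=0, bond_order=2)
```

In a generative model, a mask of this kind would typically be applied to the edge-prediction scores so that chemically invalid bonds are never sampled.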
We have entered an era of a pandemic that has shaken the world, with major impacts on medical systems, economies, and agriculture. Prominent computational and mathematical models have been unreliable due to the complexity of the spread of infections. Moreover, gaps in data collection and reporting make such modelling attempts unreliable. Hence, we need to revisit the situation with the latest data sources and the most comprehensive forecasting models. Deep learning models such as recurrent neural networks are well suited to modelling temporal sequences. In this paper, we use prominent recurrent neural networks, in particular long short-term memory (LSTM) networks, bidirectional LSTM, and encoder-decoder LSTM models, for multi-step (short-term) forecasting of the spread of COVID-19 infections among selected states in India. We select states with COVID-19 hotspots in terms of the rate of infections, compare them with states where infections have been contained or have reached their peak, and provide a two-month-ahead forecast which shows that cases will slowly decline. Our results show that long-term forecasts are promising, which motivates the application of the method in other countries or areas. We note that although we made some progress in forecasting, challenges in modelling remain due to limitations in the data and the difficulty of capturing factors such as population density, travel logistics, and social aspects such as culture and lifestyle.
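Multi-step forecasting with a recurrent model usually starts by cutting the time series into pairs of input windows and multi-step targets. The following is a minimal sketch of that preparation step, assuming a univariate series of daily case counts; the function name `make_windows`, the window sizes, and the numbers are made up for illustration and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def make_windows(series: Sequence[float], n_in: int,
                 n_out: int) -> Tuple[List[List[float]], List[List[float]]]:
    """Split a series into (input window, multi-step target) pairs.

    Each input holds n_in consecutive observations and each target holds
    the n_out observations that immediately follow it.
    """
    inputs: List[List[float]] = []
    targets: List[List[float]] = []
    for start in range(len(series) - n_in - n_out + 1):
        split = start + n_in
        inputs.append(list(series[start:split]))
        targets.append(list(series[split:split + n_out]))
    return inputs, targets


# Made-up daily case counts, used only to show the shape of the output.
daily_cases = [10, 12, 15, 19, 24, 30, 37, 45]
X, y = make_windows(daily_cases, n_in=4, n_out=2)
```

Each row of `X` would then be fed to the LSTM as one input sequence, with the matching row of `y` as the multi-step target.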
Objective: To discover candidate drugs to repurpose for COVID-19 using literature-derived knowledge and knowledge graph completion methods. Methods: We propose a novel, integrative, neural network-based literature-based discovery (LBD) approach to identify drug candidates from both PubMed and COVID-19-focused research literature. Our approach relies on semantic triples extracted using SemRep (via SemMedDB). We identified an informative subset of semantic triples using filtering rules and an accuracy classifier developed on a BERT variant, and used this subset to construct a knowledge graph. Five state-of-the-art neural knowledge graph completion algorithms were used to predict drug repurposing candidates. The models were trained and assessed using a time-slicing approach, and the predicted drugs were compared with a list of drugs reported in the literature and evaluated in clinical trials. These models were complemented by a discovery pattern-based approach. Results: The accuracy classifier based on PubMedBERT achieved the best performance (F1 = 0.854) in classifying semantic predications. Among the five knowledge graph completion models, TransE outperformed the others (MR = 0.923, Hits@1 = 0.417). Some known drugs linked to COVID-19 in the literature were identified, as well as some candidate drugs that have not yet been studied. Discovery patterns enabled the generation of plausible hypotheses regarding the relationships between the candidate drugs and COVID-19. Among them, five highly ranked and novel drugs (paclitaxel, SB 203580, alpha 2-antiplasmin, pyrrolidine dithiocarbamate, and butylated hydroxytoluene) are discussed further, along with their mechanistic explanations. Conclusion: We show that an LBD approach can be feasible for discovering drug candidates for COVID-19 and for generating mechanistic explanations. Our approach can be generalized to other diseases as well as to other clinical questions.
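To make the knowledge graph completion step more concrete, the sketch below shows the standard TransE idea (a triple is plausible when head + relation lies close to tail in embedding space) together with rank-based evaluation such as Hits@k. It is a minimal sketch under stated assumptions: the two-dimensional embeddings, the entity names, and the helper functions are hypothetical, and mean reciprocal rank is included only as a common companion metric, not as a claim about how the reported MR value was computed.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def transe_distance(head: Vector, relation: Vector, tail: Vector) -> float:
    """Euclidean distance ||head + relation - tail||; smaller is more plausible."""
    return math.sqrt(sum((h + r - t) ** 2
                         for h, r, t in zip(head, relation, tail)))


def rank_of_true_tail(head: Vector, relation: Vector,
                      candidates: Dict[str, Vector], true_tail: str) -> int:
    """1-based rank of the true tail among all candidate tails."""
    ordered = sorted(
        candidates,
        key=lambda name: transe_distance(head, relation, candidates[name]),
    )
    return ordered.index(true_tail) + 1


def hits_at_k(ranks: List[int], k: int) -> float:
    """Fraction of test triples whose true entity is ranked in the top k."""
    return sum(rank <= k for rank in ranks) / len(ranks)


def mean_reciprocal_rank(ranks: List[int]) -> float:
    """Average of 1/rank over all test triples."""
    return sum(1.0 / rank for rank in ranks) / len(ranks)


# Hypothetical toy embeddings, chosen by hand for illustration only.
head = [1.0, 0.0]
relation = [0.0, 1.0]
candidates = {
    "entity_a": [1.0, 1.0],   # exactly head + relation
    "entity_b": [0.0, 0.0],
    "entity_c": [1.0, 0.5],
}

rank_a = rank_of_true_tail(head, relation, candidates, "entity_a")
rank_c = rank_of_true_tail(head, relation, candidates, "entity_c")
example_ranks = [1, 2, 4]
```

In a time-slicing evaluation, the ranks would be computed for triples that appear only after the training cut-off date, so that the model is scored on links it could not have seen.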
Coronavirus disease 2019 (COVID-19), which is caused by a human coronavirus (HCoV), has spread rapidly around the world. Compared with new drug development, drug repurposing may be the best shortcut to treating COVID-19. Therefore, we constructed a comprehensive heterogeneous network based on HCoV-related target proteins and used the previously proposed deepDTnet to discover potential drug candidates for COVID-19. We obtain high performance in predicting drugs that may be effective against COVID-19-related proteins. In summary, this work uses a heterogeneous network-based deep learning method that may help to quickly identify candidate repurposable drugs for future clinical trials for COVID-19. The code and data are available at https://github.com/stjin-XMU/HnDR-COVID.
To combat COVID-19, both clinicians and scientists need to digest vast amounts of relevant biomedical knowledge in the scientific literature to understand the disease mechanism and related biological functions. We have developed a novel and comprehensive knowledge discovery framework, COVID-KG, to extract fine-grained multimedia knowledge elements (entities and their visual chemical structures, relations, and events) from scientific literature. We then exploit the constructed multimedia knowledge graphs (KGs) for question answering and report generation, using drug repurposing as a case study. Our framework also provides detailed contextual sentences, subfigures, and knowledge subgraphs as evidence.