Retro Drug Design: From Target Properties to Molecular Structures

Added by Yuhong Wang

Publication date 2021

fields Biology

Language English

Authors Yuhong Wang - Sam Michael - Ruili Huang

Biomolecules

Generating drug molecules with desired properties by computational methods is the holy grail of pharmaceutical research. Here we describe an AI strategy, retro drug design (RDD), that generates novel small-molecule drugs from scratch to meet predefined requirements, including but not limited to biological activity against a drug target and optimal ranges of physicochemical and ADMET properties. Traditional predictive models were first trained on experimental data for the target properties, using an atom-typing-based molecular descriptor system, ATP. A Monte Carlo sampling algorithm was then used to find solutions in the ATP space defined by the target properties, and a Seq2Seq deep learning model was employed to decode molecular structures from those solutions. To test the feasibility of the algorithm, we challenged RDD to generate novel drugs that can activate the μ opioid receptor (MOR) and penetrate the blood-brain barrier (BBB). Starting from vectors of random numbers, RDD generated 180,000 chemical structures, of which 78% were chemically valid. About 42,000 (31%) of the valid structures fell into the property space defined by MOR activity and BBB permeability. Of these 42,000 structures, only 267 were commercially available, indicating a high degree of novelty among the AI-generated compounds. We purchased and assayed 96 compounds, of which 25 were found to be MOR agonists. These compounds also have excellent BBB scores. The results presented in this paper illustrate that RDD has the potential to revolutionize the current drug discovery process and create novel structures with multiple desired properties, including biological functions and ADMET properties. The availability of an AI-enabled fast track in drug discovery is essential for coping with emergent public health threats such as the COVID-19 pandemic.
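
The sampling step described in the abstract can be pictured with a minimal Metropolis-style sketch in Python. Everything here is an assumption made for illustration: the two toy scoring functions stand in for the trained predictive models, the 8-dimensional vector stands in for an ATP descriptor, and the target ranges, step size, and temperature are arbitrary. The abstract does not specify RDD's actual sampler, and the Seq2Seq decoding step is not shown.

```python
import math
import random

def toy_activity(x):
    # Hypothetical stand-in for a trained activity model: mean of the vector.
    return sum(x) / len(x)

def toy_bbb(x):
    # Hypothetical stand-in for a BBB model: first entry minus last entry.
    return x[0] - x[-1]

# (model, lower bound, upper bound); the ranges are illustrative only.
TARGETS = [(toy_activity, 0.5, 1.0), (toy_bbb, 0.2, 0.8)]

def penalty(x):
    """Summed distance of each predicted property from its target range.

    Returns 0.0 when every prediction lies inside its range.
    """
    total = 0.0
    for model, lo, hi in TARGETS:
        y = model(x)
        total += max(lo - y, 0.0, y - hi)
    return total

def metropolis_search(dim=8, steps=5000, step_size=0.1, temperature=0.05, seed=0):
    """Random-walk Metropolis search for a vector with low penalty."""
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]  # random starting vector
    cost = penalty(x)
    best_x, best_cost = list(x), cost
    for _ in range(steps):
        if best_cost == 0.0:  # already inside the target property space
            break
        cand = list(x)
        cand[rng.randrange(dim)] += rng.gauss(0.0, step_size)  # perturb one entry
        cand_cost = penalty(cand)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if cand_cost <= cost or rng.random() < math.exp((cost - cand_cost) / temperature):
            x, cost = cand, cand_cost
            if cost < best_cost:
                best_x, best_cost = list(x), cost
    return best_x, best_cost

if __name__ == "__main__":
    vector, cost = metropolis_search()
    print("final penalty:", cost)
```

In a pipeline of the kind the abstract outlines, vectors that reach zero penalty would be the candidates passed on to a decoder; here the search simply returns the best vector it found.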

Toward Drug-Target Interaction Prediction via Ensemble Modeling and Transfer Learning

125 - Po-Yu Kao , Shu-Min Kao , Nan-Lan Huang 2021

Drug-target interaction (DTI) prediction plays a crucial role in drug discovery, and deep learning approaches have achieved state-of-the-art performance in this field. We introduce an ensemble of deep learning models (EnsembleDLM) for DTI prediction. EnsembleDLM uses only the sequence information of chemical compounds and proteins, and it aggregates the predictions from multiple deep neural networks. This approach not only achieves state-of-the-art performance on the Davis and KIBA datasets but also reaches cutting-edge performance in cross-domain applications across different bio-activity types and different protein classes. We also demonstrate that EnsembleDLM achieves good performance (Pearson correlation coefficient and concordance index > 0.8) in the new domain with approximately 50% transfer learning data, i.e., the training set has twice as much data as the test set.
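
As a rough illustration of the two ideas named in this abstract, aggregating predictions from several models and scoring them with the Pearson correlation coefficient and the concordance index, here is a small Python sketch. The simple averaging rule, the function names, and the numbers are assumptions for illustration; the abstract does not say how EnsembleDLM combines its networks.

```python
from statistics import mean

def ensemble_predict(model_outputs):
    """Average per-sample predictions across models (one row per model)."""
    return [mean(column) for column in zip(*model_outputs)]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Pairs with equal true values are skipped; tied predictions count as 0.5.
    """
    concordant, comparable = 0.0, 0
    for i in range(len(y_true)):
        for j in range(i + 1, len(y_true)):
            if y_true[i] == y_true[j]:
                continue
            comparable += 1
            agreement = (y_true[i] - y_true[j]) * (y_pred[i] - y_pred[j])
            if agreement > 0:
                concordant += 1.0
            elif agreement == 0:
                concordant += 0.5
    return concordant / comparable

if __name__ == "__main__":
    # Made-up affinities and predictions from two hypothetical models.
    y_true = [5.0, 6.0, 7.0, 8.0]
    outputs = [[5.2, 5.8, 7.4, 7.9], [4.8, 6.4, 6.8, 8.3]]
    y_ens = ensemble_predict(outputs)
    print(pearson(y_true, y_ens), concordance_index(y_true, y_ens))
```

The concordance index only looks at ranking, so it complements the Pearson coefficient, which measures linear agreement.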

Biomolecules Machine Learning Quantitative Methods

Simulated Epidemics in 3D Protein Structures to Detect Functional Properties

50 - Mattia Miotto , Lorenzo Di Rienzo , Pietro Corsi 2019

The outcome of an epidemic is closely related to the network of interactions between the individuals. Likewise, protein functions depend on the 3D arrangement of their residues and on the underlying energetic interaction network. Borrowing ideas from the theoretical framework that has been developed to address the spreading of real diseases, we study the diffusion of a fictitious epidemic inside the protein non-bonded interaction network. Our approach allowed us to probe the overall stability and the capability to propagate information in complex 3D structures, and it proved to be very efficient in addressing different problems, from the assessment of thermal stability to the identification of allosteric sites.
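
The analogy between epidemic spreading and residue networks can be made concrete with a toy susceptible-infected (SI) simulation in Python. This is only a sketch under stated assumptions: the contact graph is a made-up unweighted adjacency map, the transmission probability `beta` is a single constant, and the abstract does not say which epidemic model or which energy-based weights the authors actually use.

```python
import random

def simulate_si(contacts, seed_node, beta, max_steps=1000, rng=None):
    """Run a discrete-time SI process on an undirected contact graph.

    contacts maps each node to the list of its neighbours. At every step each
    infected node infects each susceptible neighbour with probability beta.
    Returns the number of infected nodes after each step, starting at step 0.
    """
    rng = rng or random.Random(0)
    infected = {seed_node}
    history = [len(infected)]
    for _ in range(max_steps):
        if len(infected) == len(contacts):  # every node has been reached
            break
        newly_infected = set()
        for node in infected:
            for neighbour in contacts[node]:
                if neighbour not in infected and rng.random() < beta:
                    newly_infected.add(neighbour)
        infected |= newly_infected
        history.append(len(infected))
    return history

if __name__ == "__main__":
    # Hypothetical four-residue chain: 0 - 1 - 2 - 3.
    chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    print(simulate_si(chain, seed_node=0, beta=0.5))
```

How quickly the infected count saturates from different seed nodes is one simple way to compare how well different parts of a network propagate a signal.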

Biomolecules

Design Of Drug-Like Protein-Protein Interaction Stabilizers Guided By Chelation-Controlled Bioactive Conformation Stabilization

89 - Francesco Bosica , João Filipe Neves 2020

The protein-protein interactions (PPIs) of 14-3-3 proteins are a model system for studying PPI stabilization. The complex natural product Fusicoccin A stabilizes many 14-3-3 PPIs but is not amenable for use in SAR studies, motivating the search for more drug-like chemical matter. However, drug-like 14-3-3 PPI stabilizers enabling such study have remained elusive. An X-ray crystal structure of a PPI in complex with an extremely low potency stabilizer uncovered an unexpected non-protein-interacting, ligand-chelated Mg2+, leading to the discovery of metal ion-dependent 14-3-3 PPI stabilization potency. This originates from a novel chelation-controlled bioactive conformation stabilization effect. Metal chelation has been associated with pan-assay interference compounds (PAINS) and frequent hitter behavior, but chelation can evidently also lead to true potency gains and find use as a medicinal chemistry strategy to guide compound optimization. To demonstrate this, we exploited the effect to design the first potent, selective and drug-like 14-3-3 PPI stabilizers.

Biomolecules

Unveiling the molecular mechanism of SARS-CoV-2 main protease inhibition from 92 crystal structures

84 - Duc D Nguyen , Kaifu Gao , Jiahui Chen 2020

Currently, there are no effective antiviral drugs or vaccines for coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Owing to its high conservation and low similarity to human genes, the SARS-CoV-2 main protease (M$^{\text{pro}}$) is one of the most favorable drug targets. However, the current understanding of the molecular mechanism of M$^{\text{pro}}$ inhibition is limited by the lack of reliable binding affinity ranking and prediction for existing structures of M$^{\text{pro}}$-inhibitor complexes. This work integrates mathematics and deep learning (MathDL) to provide a reliable ranking of the binding affinities of 92 SARS-CoV-2 M$^{\text{pro}}$ inhibitor structures. We reveal that the Gly143 residue in M$^{\text{pro}}$ is the most attractive site for forming hydrogen bonds, followed by Cys145, Glu166, and His163. We also identify 45 targeted covalent bonding inhibitors. Validation on the PDBbind v2016 core set benchmark shows that MathDL achieves top performance, with a Pearson correlation coefficient ($R_p$) of 0.858. Most importantly, MathDL is validated on a carefully curated SARS-CoV-2 inhibitor dataset with an average $R_p$ as high as 0.751, which supports the reliability of the present binding affinity predictions. The present binding affinity ranking, interaction analysis, and fragment decomposition offer a foundation for future drug discovery efforts.

Biomolecules Quantitative Methods

Generative network complex (GNC) for drug discovery

130 - Christopher Grow , Kaifu Gao , Duc Duy Nguyen 2019

It remains a challenging task to generate a vast variety of novel compounds with desirable pharmacological properties. In this work, a generative network complex (GNC) is proposed as a new platform for designing novel compounds, predicting their physical and chemical properties, and selecting potential drug candidates that fulfill various druggable criteria such as binding affinity, solubility, partition coefficient, etc. We combine a SMILES string generator, which consists of an encoder, a drug-property controlled or regulated latent space, and a decoder, with verification deep neural networks, a target-specific three-dimensional (3D) pose generator, and mathematical deep learning networks to generate new compounds, predict their drug properties, construct 3D poses associated with target proteins, and reevaluate druggability, respectively. New compounds were generated in the latent space by either randomized output, controlled output, or optimized output. In our demonstration, 2.08 million and 2.8 million novel compounds are generated respectively for Cathepsin S and BACE targets. These new compounds are very different from the seeds and cover a larger chemical space. For potentially active compounds, their 3D poses are generated using a state-of-the-art method. The resulting 3D complexes are further evaluated for druggability by a championing deep learning algorithm based on algebraic topology, differential geometry, and algebraic graph theories. Performed on supercomputers, the whole process took less than one week. Therefore, our GNC is an efficient new paradigm for discovering new drug candidates.

Biomolecules