No Arabic abstract
Recently, molecular fingerprints extracted from three-dimensional (3D) structures using advanced mathematics, such as algebraic topology, differential geometry, and graph theory have been paired with efficient machine learning, especially deep learning algorithms to outperform other methods in drug discovery applications and competitions. This raises the question of whether classical 2D fingerprints are still valuable in computer-aided drug discovery. This work considers 23 datasets associated with four typical problems, namely protein-ligand binding, toxicity, solubility and partition coefficient to assess the performance of eight 2D fingerprints. Advanced machine learning algorithms including random forest, gradient boosted decision tree, single-task deep neural network and multitask deep neural network are employed to construct efficient 2D-fingerprint based models. Additionally, appropriate consensus models are built to further enhance the performance of 2D-fingerprintbased methods. It is demonstrated that 2D-fingerprint-based models perform as well as the state-of-the-art 3D structure-based models for the predictions of toxicity, solubility, partition coefficient and protein-ligand binding affinity based on only ligand information. However, 3D structure-based models outperform 2D fingerprint-based methods in complex-based protein-ligand binding affinity predictions.
It remains a challenging task to generate a vast variety of novel compounds with desirable pharmacological properties. In this work, a generative network complex (GNC) is proposed as a new platform for designing novel compounds, predicting their physical and chemical properties, and selecting potential drug candidates that fulfill various druggable criteria such as binding affinity, solubility, partition coefficient, etc. We combine a SMILES string generator, which consists of an encoder, a drug-property controlled or regulated latent space, and a decoder, with verification deep neural networks, a target-specific three-dimensional (3D) pose generator, and mathematical deep learning networks to generate new compounds, predict their drug properties, construct 3D poses associated with target proteins, and reevaluate druggability, respectively. New compounds were generated in the latent space by either randomized output, controlled output, or optimized output. In our demonstration, 2.08 million and 2.8 million novel compounds are generated respectively for Cathepsin S and BACE targets. These new compounds are very different from the seeds and cover a larger chemical space. For potentially active compounds, their 3D poses are generated using a state-of-the-art method. The resulting 3D complexes are further evaluated for druggability by a championing deep learning algorithm based on algebraic topology, differential geometry, and algebraic graph theories. Performed on supercomputers, the whole process took less than one week. Therefore, our GNC is an efficient new paradigm for discovering new drug candidates.
Relativistic Invariance might be modified by Quantum Gravity effects. The interesting point which emerged in the last fifteen years is that remnants of possible Lorentz Invariance Violations could be present at energies much lower than their natural scale, and possibly affect Ultra High Energy Cosmic Rays phenomena. We discuss their status in the view of recent data from the Pierre Auger Observatory.
To generate drug molecules of desired properties with computational methods is the holy grail in pharmaceutical research. Here we describe an AI strategy, retro drug design, or RDD, to generate novel small molecule drugs from scratch to meet predefined requirements, including but not limited to biological activity against a drug target, and optimal range of physicochemical and ADMET properties. Traditional predictive models were first trained over experimental data for the target properties, using an atom typing based molecular descriptor system, ATP. Monte Carlo sampling algorithm was then utilized to find the solutions in the ATP space defined by the target properties, and the deep learning model of Seq2Seq was employed to decode molecular structures from the solutions. To test feasibility of the algorithm, we challenged RDD to generate novel drugs that can activate {mu} opioid receptor (MOR) and penetrate blood brain barrier (BBB). Starting from vectors of random numbers, RDD generated 180,000 chemical structures, of which 78% were chemically valid. About 42,000 (31%) of the valid structures fell into the property space defined by MOR activity and BBB permeability. Out of the 42,000 structures, only 267 chemicals were commercially available, indicating a high extent of novelty of the AI-generated compounds. We purchased and assayed 96 compounds, and 25 of which were found to be MOR agonists. These compounds also have excellent BBB scores. The results presented in this paper illustrate that RDD has potential to revolutionize the current drug discovery process and create novel structures with multiple desired properties, including biological functions and ADMET properties. Availability of an AI-enabled fast track in drug discovery is essential to cope with emergent public health threat, such as pandemic of COVID-19.
To shorten the time required to find effective new drugs, like antivirals, a key parameter to consider is membrane permeability, as a compound intended for an intracellular target with poor permeability will have low efficacy. Here, we present a computational model that considers both drug characteristics and membrane properties for the rapid assessment of drugs permeability through the coronavirus envelope and various cellular membranes. We analyze 79 drugs that are considered as potential candidates for the treatment of SARS-CoV-2 and determine their time of permeation in different organelle membranes grouped by viral baits and mammalian processes. The computational results are correlated with experimental data, present in the literature, on bioavailability of the drugs, showing a negative correlation between fast permeation and most promising drugs. This model represents an important tool capable of evaluating how permeability affects the ability of compounds to reach both intended and unintended intracellular targets in an accurate and rapid way. The method is general and flexible and can be employed for a variety of molecules, from small drugs to nanoparticles, as well to a variety of biological membranes.
Drug-target interaction (DTI) prediction plays a crucial role in drug discovery, and deep learning approaches have achieved state-of-the-art performance in this field. We introduce an ensemble of deep learning models (EnsembleDLM) for DTI prediction. EnsembleDLM only uses the sequence information of chemical compounds and proteins, and it aggregates the predictions from multiple deep neural networks. This approach not only achieves state-of-the-art performance in Davis and KIBA datasets but also reaches cutting-edge performance in the cross-domain applications across different bio-activity types and different protein classes. We also demonstrate that EnsembleDLM achieves a good performance (Pearson correlation coefficient and concordance index > 0.8) in the new domain with approximately 50% transfer learning data, i.e., the training set has twice as much data as the test set.