ﻻ يوجد ملخص باللغة العربية
Determining the aqueous solubility of molecules is a vital step in many pharmaceutical, environmental, and energy storage applications. Despite efforts made over decades, there are still challenges associated with developing a solubility prediction model with satisfactory accuracy for many of these applications. The goal of this study is to develop a general model capable of predicting the solubility of a broad range of organic molecules. Using the largest currently available solubility dataset, we implement deep learning-based models to predict solubility from molecular structure and explore several different molecular representations including molecular descriptors, simplified molecular-input line-entry system (SMILES) strings, molecular graphs, and three-dimensional (3D) atomic coordinates using four different neural network architectures - fully connected neural networks (FCNNs), recurrent neural networks (RNNs), graph neural networks (GNNs), and SchNet. We find that models using molecular descriptors achieve the best performance, with GNN models also achieving good performance. We perform extensive error analysis to understand the molecular properties that influence model performance, perform feature analysis to understand which information about molecular structure is most valuable for prediction, and perform a transfer learning and data size study to understand the impact of data availability on model performance.
We present a molecular dynamics simulation method for the computation of the solubility of organic crystals in solution. The solubility is calculated based on the equilibrium free energy difference between the solvated solute and its crystallized sta
We report a workflow and the output of a natural language processing (NLP)-based procedure to mine the extant metal-organic framework (MOF) literature describing structurally characterized MOFs and their solvent removal and thermal stabilities. We ob
Deep generative models have emerged as a powerful tool for learning informative molecular representations and designing novel molecules with desired properties, with applications in drug discovery and material design. Deep generative auto-encoders de
Understanding the star-formation properties of galaxies as a function of cosmic epoch is a critical exercise in studies of galaxy evolution. Traditionally, stellar population synthesis models have been used to obtain best fit parameters that characte
Although aqueous electrolytes are among the most important solutions, the molecular simulation of their intertwined properties of chemical potentials, solubility and activity coefficients has remained a challenging problem, and has attracted consider