Molecular graph generation is a fundamental but challenging task in applications such as drug discovery and material science, where the goal is to generate valid molecules with desired properties. Auto-regressive models, which usually construct graphs through sequential actions of adding nodes and edges at the atom level, have made rapid progress in recent years. However, these atom-level models ignore high-frequency subgraphs, which not only capture the regularities of atomic combination in molecules but are also often related to desired chemical properties. In this paper, we propose a method to automatically discover such common substructures, which we call {\em graph pieces}, from given molecular graphs. Based on graph pieces, we leverage a variational autoencoder to generate molecules in two phases: piece-level graph generation followed by bond completion. Experiments show that our graph piece variational autoencoder outperforms state-of-the-art baselines on property optimization and constrained property optimization tasks, with higher computational efficiency.
Leveraging domain knowledge, including fingerprints and functional groups, in molecular representation learning is crucial for chemical property prediction and drug discovery. Existing works model the relation between graph structure and molecular properties only implicitly, so they can hardly capture structural or property changes and complex structures, given a much smaller atom vocabulary dominated by highly frequent atoms. In this paper, we propose the Contrastive Knowledge-aware GNN (CKGNN) for self-supervised molecular representation learning to fuse domain knowledge into molecular graph representation. We explicitly encode domain knowledge via a knowledge-aware molecular encoder under the contrastive learning framework, ensuring that the generated molecular embeddings are equipped with chemical domain knowledge and can distinguish molecules with similar chemical formulas but dissimilar functions. Extensive experiments on 8 public datasets demonstrate the effectiveness of our model, with a 6% absolute improvement on average against strong competitors. An ablation study and further investigation also verify the benefit of incorporating chemical domain knowledge into self-supervised learning.
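The abstract above does not spell out the contrastive objective. As a minimal sketch, assuming an InfoNCE-style loss over cosine similarities (a common choice in contrastive learning, not confirmed here as the loss CKGNN uses), the snippet below shows how an anchor embedding is pulled toward a positive and pushed away from negatives. The function names `cosine` and `info_nce`, the temperature value, and the toy 2-D vectors are all hypothetical and stand in for real molecular embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss: -log softmax score of the positive among all candidates.

    Assumption: similarity is cosine similarity scaled by a temperature.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]


# Toy embeddings (hypothetical): the positive is close to the anchor,
# the negatives point in other directions.
anchor = [1.0, 0.0]
loss_aligned = info_nce(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
# Same candidates, but an unrelated vector is now treated as the positive.
loss_misaligned = info_nce(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
```

In this toy setting the loss is small when the positive lies near the anchor and large when it does not, which is the behavior a contrastive encoder is trained to exploit.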
Gaining more comprehensive knowledge about drug-drug interactions (DDIs) is one of the most important tasks in drug development and medical practice. Recently, graph neural networks have achieved great success in this task by modeling drugs as nodes and drug-drug interactions as links, casting DDI prediction as a link prediction problem. However, correlations between link labels (e.g., DDI types) were rarely considered in existing works. We propose the graph energy neural network (GENN) to explicitly model link type correlations. We formulate the DDI prediction task as a structure prediction problem and introduce a new energy-based model in which the energy function is defined by graph neural networks. Experiments on two real-world DDI datasets demonstrate that GENN outperforms many baselines that do not consider link type correlations, achieving $13.77\%$ and $5.01\%$ PR-AUC improvement on the two datasets, respectively. We also present a case study in which GENN captures meaningful DDI correlations better than baseline models.
De novo therapeutic design is challenged by a vast chemical repertoire and multiple constraints, e.g., high broad-spectrum potency and low toxicity. We propose CLaSS (Controlled Latent attribute Space Sampling), an efficient computational method for attribute-controlled generation of molecules, which leverages guidance from classifiers trained on an informative latent space of molecules modeled using a deep generative autoencoder. We screen the generated molecules for additional key attributes by using deep learning classifiers in conjunction with novel features derived from atomistic simulations. The proposed approach is demonstrated for designing non-toxic antimicrobial peptides (AMPs) with strong broad-spectrum potency, which are emerging drug candidates for tackling antibiotic resistance. Synthesis and testing of only twenty designed sequences identified two novel and minimalist AMPs with high potency against diverse Gram-positive and Gram-negative pathogens, including one multidrug-resistant and one antibiotic-resistant K. pneumoniae, via membrane pore formation. Both antimicrobials exhibit low in vitro and in vivo toxicity and mitigate the onset of drug resistance. The proposed approach thus presents a viable path for faster and more efficient discovery of potent and selective broad-spectrum antimicrobials.
Drug combination therapy has become an increasingly promising method in the treatment of cancer. However, the number of possible drug combinations is so large that it is hard to screen synergistic drug combinations through wet-lab experiments. Therefore, computational screening has become an important way to prioritize drug combinations. Graph neural networks have recently shown remarkable performance in the prediction of compound-protein interactions, but they have not been applied to the screening of drug combinations. In this paper, we propose DeepDDS, a deep learning model based on graph neural networks and an attention mechanism, to identify drug combinations that can effectively inhibit the viability of specific cancer cells. The feature embeddings of drug molecule structures and gene expression profiles are taken as input to a multi-layer feedforward neural network to identify synergistic drug combinations. We compare DeepDDS with classical machine learning methods and other deep learning-based methods on a benchmark data set, and the leave-one-out experimental results show that DeepDDS achieves better performance than competitive methods. On an independent test set released by the pharmaceutical company AstraZeneca, DeepDDS is also superior to competitive methods by more than 16% in predictive precision. Furthermore, we explore the interpretability of the graph attention network and find that the correlation matrix of atomic features reveals important chemical substructures of drugs. We believe that DeepDDS is an effective tool for prioritizing synergistic drug combinations for further wet-lab validation.
In this paper, we tackle the problem of measuring similarity among graphs that represent real objects with noisy data. To account for noise, we relax the definition of similarity using the maximum weighted co-$k$-plex relaxation method, which allows dissimilarities among graphs up to a predetermined level. We then formulate the problem as a novel quadratic unconstrained binary optimization problem that can be solved by a quantum annealer. The context of our study is molecular similarity where the presence of noise might be due to regular errors in measuring molecular features. We develop a similarity measure and use it to predict the mutagenicity of a molecule. Our results indicate that the relaxed similarity measure, designed to accommodate the regular errors, yields a higher prediction accuracy than the measure that ignores the noise.
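To make the quadratic unconstrained binary optimization (QUBO) formulation more concrete, the following is a minimal sketch under strong simplifying assumptions, not the formulation of the paper itself. It covers only the unrelaxed special case $k = 1$, where a co-$k$-plex reduces to an independent set, and it minimizes $-\sum_i w_i x_i + P \sum_{(i,j) \in E} x_i x_j$ with a penalty $P$ larger than every weight. The small conflict graph, its weights, the helper names, and the exhaustive search standing in for a quantum annealer are all hypothetical.

```python
from itertools import product


def build_qubo(weights, edges, penalty):
    """Build QUBO coefficients for maximum weighted independent set.

    Diagonal terms reward selecting a vertex; each edge adds a penalty
    that is paid only when both of its endpoints are selected.
    """
    Q = {(i, i): -w for i, w in enumerate(weights)}
    for i, j in edges:
        key = (min(i, j), max(i, j))
        Q[key] = Q.get(key, 0.0) + penalty
    return Q


def energy(Q, x):
    """QUBO objective value for a binary assignment x (x_i * x_i equals x_i)."""
    return sum(c * x[i] * x[j] for (i, j), c in Q.items())


def brute_force_minimum(Q, n):
    """Exhaustive search over all 2^n assignments; a stand-in for an annealer."""
    best = min(product((0, 1), repeat=n), key=lambda x: energy(Q, x))
    return best, energy(Q, best)


# Hypothetical conflict graph: vertices are candidate matches between two
# molecular graphs, and edges join matches that cannot be chosen together.
weights = [3.0, 2.0, 2.0, 2.0]
edges = [(0, 1), (0, 2), (2, 3)]
Q = build_qubo(weights, edges, penalty=4.0)  # penalty exceeds every weight
solution, value = brute_force_minimum(Q, len(weights))
print(solution, value)  # (1, 0, 0, 1) -5.0
```

Here the minimizer selects vertices 0 and 3, the heaviest set with no conflicting pair. Relaxing toward $k > 1$ would tolerate a bounded number of conflicts per selected vertex, which requires a different penalty structure than the one sketched here.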