An unsolved challenge in the development of antigen-specific immunotherapies is determining the optimal antigens to target. Understanding antigen-MHC binding is essential to achieving this goal. Here, we present CASTELO, a combined machine learning-molecular dynamics (ML-MD) approach to design novel antigens of increased MHC binding affinity for a Type 1 diabetes (T1D)-implicated system. We build upon a small molecule lead optimization algorithm by training a convolutional variational autoencoder (CVAE) on MD trajectories of 48 different systems across 4 antigens and 4 HLA serotypes. We develop several new machine learning metrics, including a structure-based anchor residue classification model and cluster comparison scores. ML-MD predictions agree well with experimental binding results and free energy perturbation-predicted binding affinities. Moreover, ML-MD metrics are independent of traditional MD stability metrics such as contact area and RMSF, which do not reflect binding affinity data. Our work supports the role of structure-based deep learning techniques in antigen-specific immunotherapy design.
One key task in virtual screening is to accurately predict the binding affinity ($\Delta G$) of protein-ligand complexes. Recently, deep learning (DL) has significantly increased the prediction accuracy of scoring functions, owing to its ability to extract useful features from raw data. Nevertheless, further work is still needed to increase prediction accuracy and decrease computational cost. In this study, we propose a simple scoring function, OnionNet-2, based on a convolutional neural network to predict $\Delta G$. The protein-ligand interactions are characterized by the number of contacts between protein residues and ligand atoms in multiple distance shells. Compared to published models, OnionNet-2 performs best on two widely used benchmarks, CASF-2016 and CASF-2013. The model was further validated on non-experimental decoy structures from a docking program and on the CSAR NRC-HiQ data set (a high-quality data set provided by CSAR), where it also performed well. Thus, our study provides a simple but efficient scoring function for predicting protein-ligand binding free energy.
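The shell-based contact featurization described above can be illustrated with a minimal sketch. It is not the OnionNet-2 implementation. The function name `shell_contact_counts`, the input layout, the shell boundaries, and the choice of the minimum atom-to-atom distance as the residue-to-ligand-atom distance are all assumptions made for illustration.

```python
import math
from collections import Counter


def shell_contact_counts(residues, ligand_atoms, shell_edges):
    """Count residue/ligand-atom contacts per distance shell.

    residues:     list of (residue_name, [(x, y, z), ...]) tuples
    ligand_atoms: list of (element, (x, y, z)) tuples
    shell_edges:  increasing distances; shell i covers
                  [shell_edges[i], shell_edges[i + 1])
    Returns a Counter keyed by (shell_index, residue_name, element).
    """
    counts = Counter()
    for res_name, res_coords in residues:
        for element, lig_xyz in ligand_atoms:
            # Assumption: the residue-to-atom distance is the minimum
            # distance from any atom of the residue to the ligand atom.
            d = min(math.dist(atom, lig_xyz) for atom in res_coords)
            for i in range(len(shell_edges) - 1):
                if shell_edges[i] <= d < shell_edges[i + 1]:
                    counts[(i, res_name, element)] += 1
                    break  # each pair falls in at most one shell
    return counts


# Toy input: two residues, two ligand atoms, three shells (hypothetical values).
residues = [("GLY", [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
            ("ALA", [(0.0, 5.0, 0.0)])]
ligand_atoms = [("C", (2.0, 0.0, 0.0)), ("O", (0.0, 2.0, 0.0))]
features = shell_contact_counts(residues, ligand_atoms, [0.0, 1.5, 3.0, 4.5])
```

Flattening the counter over a fixed ordering of shells, residue types, and element types would give a fixed-length vector that a convolutional network could take as input. Pairs farther apart than the last shell edge are simply not counted.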
Idiosyncratic adverse drug reactions are unpredictable, dose-independent and potentially life-threatening; this makes them a major factor contributing to the cost and uncertainty of drug development. Clinical data suggest that many such reactions involve immune mechanisms, and genetic association studies have identified strong linkage between drug hypersensitivity reactions to several drugs and specific HLA alleles. One of the strongest such genetic associations found has been for the antiviral drug abacavir, which causes severe adverse reactions exclusively in patients expressing the HLA molecular variant B*57:01. Abacavir adverse reactions were recently shown to be driven by drug-specific activation of cytokine-producing, cytotoxic CD8+ T cells that required HLA-B*57:01 molecules for their function. However, the mechanism by which abacavir induces this pathologic T cell response remains unclear. Here we show that abacavir can bind within the F-pocket of the peptide-binding groove of HLA-B*57:01, thereby altering its specificity. This supports a novel explanation for HLA-linked idiosyncratic adverse drug reactions, namely that drugs can alter the repertoire of self-peptides presented to T cells, thus causing the equivalent of an alloreactive T cell response. Indeed, we identified specific self-peptides that are presented only in the presence of abacavir and that were recognized by T cells of hypersensitive patients. The assays we have established can be applied to test additional compounds with suspected HLA-linked hypersensitivities in vitro. Where successful, these assays could speed up the discovery and mechanistic understanding of HLA-linked hypersensitivities as well as guide the development of safer drugs.
Modern technologies are enabling scientists to collect extraordinary amounts of complex and sophisticated data across a huge range of scales. With this onslaught of data, the focus can shift to the question of how to analyze and understand it. Unfortunately, the lack of standardized sharing mechanisms and practices often makes reproducing or extending scientific results very difficult. With the creation of data organization structures and tools that drastically improve code portability, we now have the opportunity to design a framework for communicating extensible scientific discoveries. Our proposed solution leverages these existing technologies and standards, and provides an accessible and extensible model for reproducible research, called science in the cloud (sic). Exploiting scientific containers, cloud computing and cloud data services, we show the capability to launch a computer in the cloud and run a web service that enables close interaction with the tools and data presented. We hope this model will inspire the community to produce reproducible and, importantly, extensible results that will enable us to collectively accelerate the rate at which scientific breakthroughs are discovered, replicated, and extended.
Summary: In anticipation of the individualized proteomics era and the need to integrate knowledge from disease studies, we have augmented our peptide identification software RAId DbS to take into account annotated single amino acid polymorphisms, post-translational modifications, and their documented disease associations while analyzing a tandem mass spectrum. To facilitate new discoveries, RAId DbS allows users to conduct searches permitting novel polymorphisms. Availability: The webserver link is http://www.ncbi.nlm.nih.gov/ /CBBResearch/qmbp/raid dbs/index.html. The relevant databases and binaries of RAId DbS for Linux, Windows, and Mac OS X are available from the same web page. Contact:
[email protected]
Accumulated clinical studies show that microbes living in humans interact closely with their hosts and are involved in modulating drug efficacy and drug toxicity. Microbes have become novel targets for the development of antibacterial agents. Screening of microbe-drug associations can therefore greatly benefit drug research and development. With the growth of microbial genomic and pharmacological datasets, there is strong motivation to develop an effective computational method to identify new microbe-drug associations. In this paper, we propose a novel method, Graph2MDA, to predict microbe-drug associations using a variational graph autoencoder (VGAE). We constructed multi-modal attributed graphs based on multiple features of microbes and drugs, such as molecular structures, microbe genetic sequences, and function annotations. Taking the multi-modal attributed graphs as input, the VGAE was trained to learn informative and interpretable latent representations of each node and of the whole graph, and a deep neural network classifier was then used to predict microbe-drug associations. Hyperparameter analysis and model ablation studies characterized the sensitivity and robustness of our model. We evaluated our method on three independent datasets, and the experimental results showed that it outperformed six existing state-of-the-art methods. We also explored the meaningfulness of the learned latent representations of drugs and found that the drugs show clear clustering patterns that are significantly consistent with the drug ATC classification. Moreover, we conducted case studies on two microbes and two drugs and found that 75%-95% of the predicted associations have been reported in the PubMed literature. Our extensive performance evaluations validated the effectiveness of our proposed method.
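For readers unfamiliar with variational graph autoencoders, the following minimal sketch shows the three generic ingredients of a VGAE: sampling a node's latent vector with the reparameterization trick, scoring a candidate edge with an inner-product decoder, and the KL regularizer toward a standard normal prior. It is not the Graph2MDA architecture. The graph-convolutional encoder and the downstream classifier are omitted, and all function names are hypothetical.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps elementwise, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def edge_probability(z_i, z_j):
    """Inner-product decoder: sigmoid of the dot product of two latents."""
    dot = sum(a * b for a, b in zip(z_i, z_j))
    return 1.0 / (1.0 + math.exp(-dot))


def kl_to_standard_normal(mu, log_var):
    """KL divergence from a diagonal Gaussian N(mu, sigma^2) to N(0, I)."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


# Toy usage with made-up encoder outputs for one microbe and one drug node.
rng = random.Random(0)
z_microbe = reparameterize([0.5, -0.2], [-1.0, -1.0], rng)
z_drug = reparameterize([0.4, 0.1], [-1.0, -1.0], rng)
p_edge = edge_probability(z_microbe, z_drug)
```

In a full model, the means and log-variances would come from a graph-convolutional encoder applied to the attributed graph, and training would combine an edge reconstruction loss with the KL term summed over nodes.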