Fingerprint-based models for protein-ligand binding have demonstrated outstanding success on benchmark datasets; however, these models may not learn the correct binding rules. To assess this concern, we use in silico datasets with known binding rules to develop a general framework for evaluating model attribution. This framework identifies fragments that a model considers necessary to achieve a particular score, sidestepping the need for a model to be differentiable. Our results confirm that high-performing models may not learn the correct binding rule, and suggest concrete steps that can remedy this situation. We show that adding fragment-matched inactive molecules (decoys) to the data reduces attribution false negatives, while attribution false positives largely arise from the background correlation structure of molecular data. Normalizing for these background correlations helps to reveal the true binding logic. Our work highlights the danger of trusting attributions from high-performing models and suggests that a closer examination of fingerprint correlation structure and better decoy selection may help reduce misattributions.
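The idea of finding fragments that a model treats as necessary for a score, without requiring gradients, can be illustrated with a small perturbation sketch. This is not the authors' implementation: the function `necessary_fragments`, the binding rule `TRUE_RULE`, the stand-in `toy_model`, and the threshold are all hypothetical, chosen only to show how attribution false positives and false negatives can be counted when the true binding rule is known.

```python
from typing import Callable, FrozenSet, Set

# A fingerprint is represented by the indices of its "on" bits (fragments present).
Fingerprint = FrozenSet[int]


def necessary_fragments(
    score: Callable[[Fingerprint], float],
    fp: Fingerprint,
    threshold: float,
) -> Set[int]:
    """Return the on-bits whose individual removal drops the score below threshold.

    The model is only called as a black box, so it need not be differentiable.
    """
    if score(fp) < threshold:
        return set()  # the molecule never reaches the target score
    return {bit for bit in fp if score(fp - {bit}) < threshold}


# Hypothetical ground truth: a ligand binds only if fragments 0 AND 3 are present.
TRUE_RULE = {0, 3}


def toy_model(fp: Fingerprint) -> float:
    # Stand-in for a trained model that relies on fragment 0 and on a
    # spurious, merely correlated fragment 7, while ignoring fragment 3.
    return 0.5 * (0 in fp) + 0.5 * (7 in fp)


mol = frozenset({0, 3, 7, 12})
attributed = necessary_fragments(toy_model, mol, threshold=0.9)

false_positives = attributed - TRUE_RULE  # attributed but not in the true rule
false_negatives = TRUE_RULE - attributed  # in the true rule but not attributed

print(sorted(attributed), sorted(false_positives), sorted(false_negatives))
```

In this toy case the model scores the molecule as active, yet its attribution contains a spurious fragment and misses a required one, which is the kind of mismatch the abstract warns about.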
The novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a major worldwide public health emergency that has infected over $1.5$ million people. The partially open state of S1 subunit in spike glycoprotein is considered vital for its
This work introduces a number of algebraic topology approaches, such as multicomponent persistent homology, multi-level persistent homology and electrostatic persistence for the representation, characterization, and description of small molecules and
Normal mode analysis offers an efficient way of modeling the conformational flexibility of protein structures. Simple models defined by contact topology, known as elastic network models, have been used to model a variety of systems, but the validatio
The tertiary structures of functional RNA molecules remain difficult to decipher. A new generation of automated RNA structure prediction methods may help address these challenges but have not yet been experimentally validated. Here we apply four pred
Chemical purity of RNA samples is critical for high-precision studies of RNA folding and catalytic behavior, but such purity may be compromised by photodamage accrued during ultraviolet (UV) visualization of gel-purified samples. Here, we quantitativ