ﻻ يوجد ملخص باللغة العربية
Drug discovery is a multi-stage process that comprises two costly major steps: pre-clinical research and clinical trials. Among its stages, lead optimization easily consumes more than half of the pre-clinical budget. We propose a combined machine learning and molecular modeling approach that automates lead optimization workflow textit{in silico}. The initial data collection is achieved with physics-based molecular dynamics (MD) simulation. Contact matrices are calculated as the preliminary features extracted from the simulations. To take advantage of the temporal information from the simulations, we enhanced contact matrices data with temporal dynamism representation, which are then modeled with unsupervised convolutional variational autoencoder (CVAE). Finally, conventional clustering method and CVAE-based clustering method are compared with metrics to rank the submolecular structures and propose potential candidates for lead optimization. With no need for extensive structure-activity relationship database, our method provides new hints for drug modification hotspots which can be used to improve drug efficacy. Our workflow can potentially reduce the lead optimization turnaround time from months/years to days compared with the conventional labor-intensive process and thus can potentially become a valuable tool for medical researchers.
Molecular machine learning has been maturing rapidly over the last few years. Improved methods and the presence of larger datasets have enabled machine learning algorithms to make increasingly accurate predictions about molecular properties. However,
The pipeline optimization problem in machine learning requires simultaneous optimization of pipeline structures and parameter adaptation of their elements. Having an elegant way to express these structures can help lessen the complexity in the manage
Chemistry research has both high material and computational costs to conduct experiments. Institutions thus consider chemical data to be valuable and there have been few efforts to construct large public datasets for machine learning. Another challen
In view of training increasingly complex learning architectures, we establish a nonsmooth implicit function theorem with an operational calculus. Our result applies to most practical problems (i.e., definable problems) provided that a nonsmooth form
Computer-aided design (CAD) programs are essential to engineering as they allow for better designs through low-cost iterations. While CAD programs are typically taught to undergraduate students as a job skill, such software can also help students lea