ivis Dimensionality Reduction Framework for Biomacromolecular Simulations


Abstract in English

Molecular dynamics (MD) simulations are widely used to study macromolecules, including proteins. However, the high dimensionality of the datasets these simulations produce makes thorough analysis difficult and hinders a deeper understanding of biomacromolecules. To gain more insight into protein structure-function relations, appropriate dimensionality reduction methods are needed to project simulations onto low-dimensional spaces. Linear dimensionality reduction methods, such as principal component analysis (PCA) and time-structure based independent component analysis (t-ICA), cannot preserve sufficient structural information. Nonlinear methods, such as t-distributed stochastic neighbor embedding (t-SNE), perform better than linear ones but are still limited in avoiding system noise and preserving inter-cluster relations. ivis is a novel deep learning-based dimensionality reduction method originally developed for single-cell datasets. Here we apply this framework to the light, oxygen and voltage (LOV) domain of aureochrome 1a from the diatom Phaeodactylum tricornutum (PtAu1a). Compared with the other methods, ivis is shown to be superior in constructing Markov state models (MSMs), preserving both local and global distance information, and maintaining similarity between the high-dimensional and low-dimensional spaces with the least information loss. Moreover, the ivis framework offers a new perspective for deciphering residue-level protein allostery through the feature weights in the neural network. Overall, ivis is a promising addition to the analysis toolbox for proteins.
