Troubleshooting Unstable Molecules in Chemical Space


Abstract in English

A key challenge in automated explorations of chemical compound space is ensuring the veracity of minimum energy geometries, that is, preserving the intended bonding connectivities. We discuss an iterative high-throughput workflow for connectivity-preserving geometry optimizations that exploits the nearness between quantum mechanical models. The methodology is benchmarked on the QM9 dataset, which comprises DFT-level properties of 133,885 small molecules, of which 3,054 have questionable geometric stability. We successfully troubleshoot 2,988 of these molecules and ensure a bijective mapping between the desired Lewis formulae and the final geometries. Our workflow, based on DFT and post-DFT methods, identifies the remaining 66 molecules as unstable; 52 contain $-{\rm NNO}-$, and the other 14 are strained due to pyramidal sp$^2$ C. In the curated dataset, we inspect molecules with long CC bonds and identify ultralong candidates ($r>1.70$~\AA{}) supported by topological analysis of the electron density. We hope the proposed strategy will play a role in big data quantum chemistry initiatives.
