On the Completeness of Atomic Structure Representations


Abstract in English

Many-body descriptors are widely used to represent atomic environments in the construction of machine-learned interatomic potentials and, more broadly, for fitting, classification and embedding tasks on atomic structures. It was generally believed that 3-body descriptors uniquely specify the environment of an atom, up to a rotation and permutation of like atoms. We produce several counterexamples to this belief, with the consequence that any classifier, regression or embedding model for atom-centred properties that uses 3-body (or 4-body) features will incorrectly give identical results for different configurations. Writing global properties (such as total energies) as a sum of many atom-centred contributions mitigates, but does not eliminate, the impact of this fundamental deficiency, which explains the success of current machine-learning force fields. We anticipate the issues that will arise as the desired accuracy increases, and suggest potential solutions.
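
The kind of degeneracy described above can be illustrated with a small planar toy example. The sketch below is not taken from the paper and is not claimed to be one of its counterexamples. It assumes an idealised 3-body descriptor, namely the sorted multiset of triangles (r_j, r_k, cos θ_jk) formed by the central atom and each pair of neighbours, with all neighbours of the same species. The two neighbour arrangements come from a standard homometric pair on an 8-point ring, {0, 1, 3, 4} and {0, 1, 2, 5}; the function names and the gap-based check are hypothetical helpers introduced only for illustration.

```python
import math
from itertools import combinations


def ring_environment(steps, n=8, radius=1.0):
    """Neighbour positions on a circle around a central atom at the origin."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in steps]


def three_body_descriptor(neighbours, digits=6):
    """Idealised 3-body descriptor (an assumption of this sketch): the sorted
    multiset of (r_j, r_k, cos(theta_jk)) over all pairs of neighbours."""
    triangles = []
    for (xj, yj), (xk, yk) in combinations(neighbours, 2):
        rj, rk = math.hypot(xj, yj), math.hypot(xk, yk)
        cos_jk = (xj * xk + yj * yk) / (rj * rk)
        r_small, r_large = sorted((rj, rk))  # make the pair order irrelevant
        triangles.append((round(r_small, digits),
                          round(r_large, digits),
                          round(cos_jk, digits)))
    return sorted(triangles)


def adjacent_gaps(neighbours, digits=6):
    """Sorted angular gaps between consecutive neighbours around the ring.
    Rotations and reflections about the centre leave this multiset unchanged,
    so different gap multisets mean the arrangements are not congruent."""
    angles = sorted(math.atan2(y, x) % (2 * math.pi) for x, y in neighbours)
    gaps = [b - a for a, b in zip(angles, angles[1:])]
    gaps.append(2 * math.pi - (angles[-1] - angles[0]))  # wrap-around gap
    return sorted(round(g, digits) for g in gaps)


def rotate(neighbours, angle):
    """Rotate all neighbours about the central atom by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in neighbours]


# Two four-neighbour environments built from a homometric pair on an 8-point ring.
env_a = ring_environment([0, 1, 3, 4])
env_b = ring_environment([0, 1, 2, 5])

# Sanity check: the descriptor ignores rotation and neighbour ordering.
rotated_a = list(reversed(rotate(env_a, 0.3)))
invariant = three_body_descriptor(rotated_a) == three_body_descriptor(env_a)

# The two environments share the descriptor yet have different geometry.
same_descriptor = three_body_descriptor(env_a) == three_body_descriptor(env_b)
same_geometry = adjacent_gaps(env_a) == adjacent_gaps(env_b)

print("invariant under rotation and permutation:", invariant)
print("identical 3-body descriptors:", same_descriptor)
print("congruent arrangements:", same_geometry)
```

In this toy case any model that sees only the pairwise triangle multiset must assign the same atom-centred prediction to both environments, even though no rotation or reflection maps one onto the other. The planar ring is chosen purely for simplicity; the three-dimensional counterexamples and the 4-body case discussed in the paper are not reproduced here.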
