Entropy-based Active Learning of Graph Neural Network Surrogate Models for Materials Properties


Abstract in English

Graph neural networks, trained on experimental or calculated data are becoming an increasingly important tool in computational materials science. Networks, once trained, are able to make highly accurate predictions at a fraction of the cost of experiments or first-principles calculations of comparable accuracy. However these networks typically rely on large databases of labelled experiments to train the model. In scenarios where data is scarce or expensive to obtain this can be prohibitive. By building a neural network that provides a confidence on the predicted properties, we are able to develop an active learning scheme that can reduce the amount of labelled data required, by identifying the areas of chemical space where the model is most uncertain. We present a scheme for coupling a graph neural network with a Gaussian process to featurise solid-state materials and predict properties textit{including} a measure of confidence in the prediction. We then demonstrate that this scheme can be used in an active learning context to speed up the training of the model, by selecting the optimal next experiment for obtaining a data label. Our active learning scheme can double the rate at which the performance of the model on a test data set improves with additional data compared to choosing the next sample at random. This type of uncertainty quantification and active learning has the potential to open up new areas of materials science, where data are scarce and expensive to obtain, to the transformative power of graph neural networks.

Download