Voting Data-Driven Regression Learning for Discovery of Functional Materials and Applications to Two-Dimensional Ferroelectric Materials


Abstract

Regression machine learning is widely used to predict the properties of materials. However, insufficient materials data usually leads to poor model performance. Here, we develop a new voting data-driven method that can generally improve the performance of regression learning models for accurately predicting materials properties. We apply it to a large family (2135) of two-dimensional hexagonal binary compounds, focusing on ferroelectric properties, and find that the performance of the model for electric polarization is indeed greatly improved. From this family, 38 stable ferroelectrics with out-of-plane polarization, including 31 metals and 7 semiconductors, are screened out. Using unsupervised learning, we extract actionable information such as how the number and orbital radius of valence electrons, the ionic polarizability, and the electronegativity of the constituent atoms affect polarization. Our voting data-driven method not only reduces the amount of materials data needed to construct a reliable learning model but also enables precise predictions for targeted functional materials.
