Modeling Atmospheric Data and Identifying Dynamics: Temporal Data-Driven Modeling of Air Pollutants


Abstract in English

Atmospheric modeling has recently experienced a surge with the advent of deep learning. Most of these models, however, predict concentrations of pollutants following a data-driven approach in which the physical laws that govern their behaviors and relationships remain hidden. With the aid of real-world air quality data collected hourly in different stations throughout Madrid, we present an empirical approach using data-driven techniques with the following goals: (1) Find parsimonious systems of ordinary differential equations via sparse identification of nonlinear dynamics (SINDy) that model the concentration of pollutants and their changes over time; (2) assess the performance and limitations of our models using stability analysis; (3) reconstruct the time series of chemical pollutants not measured in certain stations using delay coordinate embedding results. Our results show that Akaikes Information Criterion can work well in conjunction with best subset regression as to find an equilibrium between sparsity and goodness of fit. We also find that, due to the complexity of the chemical system under study, identifying the dynamics of this system over longer periods of time require higher levels of data filtering and smoothing. Stability analysis for the reconstructed ordinary differential equations (ODEs) reveals that more than half of the physically relevant critical points are saddle points, suggesting that the system is unstable even under the idealized assumption that all environmental conditions are constant over time.

Download