No Arabic abstract
Ensembles of geophysical models improve projection accuracy and express uncertainties. We develop a novel data-driven ensembling strategy for combining geophysical models using Bayesian Neural Networks, which infers spatiotemporally varying model weights and bias while accounting for heteroscedastic uncertainties in the observations. This produces more accurate and uncertainty-aware projections without sacrificing interpretability. Applied to the prediction of total column ozone from an ensemble of 15 chemistry-climate models, we find that the Bayesian neural network ensemble (BayNNE) outperforms existing ensembling methods, achieving a 49.4% reduction in RMSE for temporal extrapolation, and a 67.4% reduction in RMSE for polar data voids, compared to a weighted mean. Uncertainty is also well-characterized, with 90.6% of the data points in our extrapolation validation dataset lying within 2 standard deviations and 98.5% within 3 standard deviations.
We conduct a thorough analysis of the relationship between the out-of-sample performance and the Bayesian evidence (marginal likelihood) of Bayesian neural networks (BNNs), as well as looking at the performance of ensembles of BNNs, both using the Boston housing dataset. Using the state-of-the-art in nested sampling, we numerically sample the full (non-Gaussian and multimodal) network posterior and obtain numerical estimates of the Bayesian evidence, considering network models with up to 156 trainable parameters. The networks have between zero and four hidden layers, either $tanh$ or $ReLU$ activation functions, and with and without hierarchical priors. The ensembles of BNNs are obtained by determining the posterior distribution over networks, from the posterior samples of individual BNNs re-weighted by the associated Bayesian evidence values. There is good correlation between out-of-sample performance and evidence, as well as a remarkable symmetry between the evidence versus model size and out-of-sample performance versus model size planes. Networks with $ReLU$ activation functions have consistently higher evidences than those with $tanh$ functions, and this is reflected in their out-of-sample performance. Ensembling over architectures acts to further improve performance relative to the individual BNNs.
One of the most crucial tasks in seismic reflection imaging is to identify the salt bodies with high precision. Traditionally, this is accomplished by visually picking the salt/sediment boundaries, which requires a great amount of manual work and may introduce systematic bias. With recent progress of deep learning algorithm and growing computational power, a great deal of efforts have been made to replace human effort with machine power in salt body interpretation. Currently, the method of Convolutional neural networks (CNN) is revolutionizing the computer vision field and has been a hot topic in the image analysis. In this paper, the benefits of CNN-based classification are demonstrated by using a state-of-art network structure U-Net, along with the residual learning framework ResNet, to delineate salt body with high precision. Network adjustments, including the Exponential Linear Units (ELU) activation function, the Lov{a}sz-Softmax loss function, and stratified $K$-fold cross-validation, have been deployed to further improve the prediction accuracy. The preliminary result using SEG Advanced Modeling (SEAM) data shows good agreement between the predicted salt body and manually interpreted salt body, especially in areas with weak reflections. This indicates the great potential of applying CNN for salt-related interpretations.
Smart thermostats are one of the most prevalent home automation products. They learn occupant preferences and schedules, and utilize an accurate thermal model to reduce the energy use of heating and cooling equipment while maintaining the temperature for maximum comfort. Despite the importance of having an accurate thermal model for the operation of smart thermostats, fast and reliable identification of this model is still an open problem. In this paper, we explore various techniques for establishing a suitable thermal model using time series data generated by smart thermostats. We show that Bayesian neural networks can be used to estimate parameters of a grey-box thermal model if sufficient training data is available, and this model outperforms several black-box models in terms of the temperature prediction accuracy. Leveraging real data from 8,884 homes equipped with smart thermostats, we discuss how the prior knowledge about the model parameters can be utilized to quickly build an accurate thermal model for another home with similar floor area and age in the same climate zone. Moreover, we investigate how to adapt the model originally built for the same home in another season using a small amount of data collected in this season. Our results confirm that maintaining only a small number of pre-trained thermal models will suffice to quickly build accurate thermal models for many other homes, and that 1~day smart thermostat data could significantly improve the accuracy of transferred models in another season.
We perform scalable approximate inference in a continuous-depth Bayesian neural network family. In this model class, uncertainty about separate weights in each layer gives hidden units that follow a stochastic differential equation. We demonstrate gradient-based stochastic variational inference in this infinite-parameter setting, producing arbitrarily-flexible approximate posteriors. We also derive a novel gradient estimator that approaches zero variance as the approximate posterior over weights approaches the true posterior. This approach brings continuous-depth Bayesian neural nets to a competitive comparison against discrete-depth alternatives, while inheriting the memory-efficient training and tunable precision of Neural ODEs.
We develop variational Laplace for Bayesian neural networks (BNNs) which exploits a local approximation of the curvature of the likelihood to estimate the ELBO without the need for stochastic sampling of the neural-network weights. The Variational Laplace objective is simple to evaluate, as it is (in essence) the log-likelihood, plus weight-decay, plus a squared-gradient regularizer. Variational Laplace gave better test performance and expected calibration errors than maximum a-posteriori inference and standard sampling-based variational inference, despite using the same variational approximate posterior. Finally, we emphasise care needed in benchmarking standard VI as there is a risk of stopping before the variance parameters have converged. We show that early-stopping can be avoided by increasing the learning rate for the variance parameters.