We propose a Bayesian nonparametric approach to modelling and predicting a class of functional time series with application to energy markets, based on fully observed, noise-free functional data. Traders in such contexts can conceive profitable strategies if they can anticipate the impact of their bidding actions on the aggregate demand and supply curves, which in turn need to be predicted reliably. Here we propose a simple Bayesian nonparametric method for predicting such curves, which take the form of monotonic bounded step functions. We borrow ideas from population genetics by defining a class of interacting particle systems to model the functional trajectory, and develop an implementation strategy that combines Markov chain Monte Carlo and approximate Bayesian computation techniques to circumvent the intractability of the likelihood. Our approach adapts well to the degree of smoothness of the curves and to the volatility of the functional series, proves robust to increases in the forecast horizon, and yields uncertainty quantification for the functional forecasts. We illustrate the model and discuss its performance on simulated datasets and on real data from the Italian natural gas market.
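As a toy illustration of the likelihood-free idea in this abstract, the Python sketch below runs a minimal ABC-MCMC for a monotone, bounded step-curve simulator. The Poisson-jump simulator, the sup-norm distance, the Uniform(0, 50) prior and all numerical settings are placeholder assumptions of ours, not the paper's interacting particle system.

import numpy as np

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 200)

def simulate_curve(theta):
    """Toy simulator of a monotone, bounded step function on [0, 1]:
    jump locations follow a Poisson process with rate theta, and jump
    sizes are normalized so the curve rises from 0 to at most 1."""
    n_jumps = rng.poisson(theta)
    if n_jumps == 0:
        return np.zeros_like(grid)
    locs = np.sort(rng.uniform(0.0, 1.0, n_jumps))
    sizes = rng.dirichlet(np.ones(n_jumps))
    return np.array([sizes[locs <= x].sum() for x in grid])

observed = simulate_curve(10.0)   # pseudo-observed curve, for illustration only

def abc_mcmc(n_iter=2000, eps=0.15):
    """Likelihood-free MCMC: with a Uniform(0, 50) prior and a symmetric
    random-walk proposal, a move is accepted only if the curve simulated
    under the proposal lies within eps (sup-norm) of the observed curve."""
    theta, samples = 5.0, []
    for _ in range(n_iter):
        prop = theta + rng.normal(scale=1.0)
        if 0.0 < prop < 50.0:
            dist = np.max(np.abs(simulate_curve(prop) - observed))
            if dist < eps:
                theta = prop
        samples.append(theta)
    return np.array(samples)

draws = abc_mcmc()
print("approximate posterior mean of the jump rate:", draws.mean())

Because acceptance only requires the simulated curve to fall within the tolerance, no likelihood evaluation is ever needed; shrinking eps trades computing time for a closer approximation to the posterior.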
We derive generalization error bounds for traditional time-series forecasting models. Our results hold for many standard forecasting tools, including autoregressive models, moving average models, and, more generally, linear state-space models. These non-asymptotic bounds require only weak assumptions on the data-generating process, yet allow forecasters to select among competing models and to guarantee, with high probability, that their chosen model will perform well. We motivate our techniques with, and apply them to, standard economic and financial forecasting tools: a GARCH model for predicting equity volatility and a dynamic stochastic general equilibrium (DSGE) model, the standard tool in macroeconomic forecasting. In particular, we demonstrate how our techniques can aid forecasters and policy makers in choosing models which behave well under uncertainty and mis-specification.
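For readers unfamiliar with the GARCH example mentioned above, the sketch below filters the standard GARCH(1,1) variance recursion through a return series and produces a one-step-ahead volatility forecast. The parameter values and the simulated returns are illustrative assumptions of ours, and the snippet does not implement the paper's generalization bounds.

import numpy as np

def garch11_forecast(returns, omega, alpha, beta):
    """Filter the GARCH(1,1) conditional variance through the sample and
    return the one-step-ahead volatility forecast:
    sigma_t^2 = omega + alpha * r_{t-1}^2 + beta * sigma_{t-1}^2."""
    sigma2 = np.var(returns)          # initialize at the sample variance
    for r in returns:
        sigma2 = omega + alpha * r**2 + beta * sigma2
    return np.sqrt(sigma2)

rng = np.random.default_rng(1)
rets = rng.normal(scale=0.01, size=500)   # placeholder "equity returns"
print(garch11_forecast(rets, omega=1e-6, alpha=0.05, beta=0.90))

In practice the parameters (omega, alpha, beta) would be estimated, and the paper's bounds are what justify trusting the resulting out-of-sample forecasts with high probability.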
The conditional distribution of the next outcome given the infinite past of a stationary process can be inferred from finite but growing segments of the past. Several schemes are known for constructing pointwise consistent estimates, but they all demand prohibitive amounts of input data. In this paper we consider real-valued time series and construct conditional distribution estimates that make much more efficient use of the input data. The estimates are consistent in a weak sense, and the question of whether they are pointwise consistent remains open. For finite-alphabet processes one may rely on a universal data compression scheme such as the Lempel-Ziv algorithm to construct conditional probability mass function estimates that are consistent in expected information divergence. Consistency in this strong sense cannot be attained universally for all stationary processes with values in an infinite alphabet, but weak consistency can. Some applications of the estimates to on-line forecasting, regression and classification are discussed.
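A minimal finite-alphabet illustration of estimating the conditional distribution of the next symbol from the past is given below, assuming a fixed context length k; universal schemes such as those discussed in the abstract let the context grow with the sample size and handle real-valued series. The binary toy sequence is our own placeholder.

import numpy as np
from collections import Counter

def empirical_conditional_pmf(x, k):
    """Estimate P(next symbol | last k symbols) by counting earlier
    occurrences of the current length-k context and the symbols that
    followed them. Fixed-order illustration only."""
    context = tuple(x[-k:])
    counts = Counter()
    for t in range(k, len(x)):
        if tuple(x[t - k:t]) == context:
            counts[x[t]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}                     # context never seen before
    return {sym: c / total for sym, c in counts.items()}

rng = np.random.default_rng(2)
seq = list(rng.integers(0, 2, size=1000))   # toy binary process
print(empirical_conditional_pmf(seq, k=3))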
Toxicologists are often concerned with determining the dosage to which an individual can be exposed with an acceptable risk of adverse effect. Studies of this type have been conducted widely in the past, and many novel approaches have been developed. Parametric techniques utilizing ANOVA and nonlinear regression models are well represented in the literature. The biggest drawback of parametric approaches is the need to specify the correct model. Recently, there has been interest in nonparametric approaches to tolerable dosage estimation. In this work, we focus on the monotonically decreasing dose-response model where the response is a percent of control. This imposes two constraints on the nonparametric approach: the dose-response function must equal one at control (dose = 0), and the function must always be positive. Here we propose a Bayesian solution to this problem using a novel class of nonparametric models built on a basis function developed in this research, the Alamri Monotonic spline (AM-spline). Our approach is illustrated using both simulated data and an experimental dataset from pesticide-related research at the US Environmental Protection Agency.
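One simple way to encode the two constraints (value one at dose zero, strictly positive everywhere) is sketched below; the hinge-basis construction is an illustrative stand-in chosen by us, not the AM-spline developed in the paper.

import numpy as np

def monotone_decreasing_response(dose, weights, knots):
    """A constrained dose-response curve (not the paper's AM-spline):
    f(d) = exp(-sum_j w_j * max(d - knot_j, 0)) with w_j >= 0.
    Every basis term vanishes at d = 0 and is nondecreasing in d, so
    f(0) = 1, f is monotonically decreasing, and f stays positive."""
    dose = np.atleast_1d(dose)[:, None]
    basis = np.maximum(dose - np.asarray(knots)[None, :], 0.0)
    return np.exp(-basis @ np.abs(weights))

doses = np.array([0.0, 0.5, 1.0, 2.0, 5.0])
print(monotone_decreasing_response(doses, weights=[0.3, 0.2], knots=[0.0, 1.0]))

A Bayesian treatment would place priors on the nonnegative weights and knot locations; the construction above only shows how the shape constraints can be built into the basis.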
Infectious diseases on farms pose both public and animal health risks, so understanding how they spread between farms is crucial for developing disease control strategies to prevent future outbreaks. We develop novel Bayesian nonparametric methodology to fit spatial stochastic transmission models in which the infection rate between any two farms is a function that depends on the distance between them, but without assuming a specified parametric form. Making nonparametric inference in this context is challenging since the likelihood function of the observed data is intractable because the underlying transmission process is unobserved. We adopt a fully Bayesian approach by assigning a transformed Gaussian process prior distribution to the infection rate function, and then develop an efficient data augmentation Markov chain Monte Carlo algorithm to perform Bayesian inference. We use the posterior predictive distribution to simulate the effect of different disease control methods and their economic impact. We analyse a large outbreak of avian influenza in the Netherlands and infer the between-farm infection rate, as well as the unknown infection status of farms which were pre-emptively culled. We use our results to analyse ring-culling strategies, and conclude that although effective, ring-culling has limited impact in high density areas.
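The sketch below draws one prior realization of a positive infection rate function by exponentiating a Gaussian process over distance, mirroring the transformed-GP prior described above; the squared-exponential kernel, its hyperparameters and the distance grid are illustrative assumptions of ours, and the data augmentation MCMC itself is not shown.

import numpy as np

def sample_infection_rate(distances, lengthscale=5.0, variance=1.0, seed=0):
    """Draw one prior realization of a between-farm infection rate
    beta(d) = exp(f(d)), where f is a zero-mean Gaussian process with a
    squared-exponential covariance; exponentiation keeps the rate positive."""
    rng = np.random.default_rng(seed)
    d = np.asarray(distances)[:, None]
    cov = variance * np.exp(-0.5 * (d - d.T) ** 2 / lengthscale**2)
    jitter = 1e-8 * np.eye(len(distances))      # numerical stabilizer
    f = rng.multivariate_normal(np.zeros(len(distances)), cov + jitter)
    return np.exp(f)

grid = np.linspace(0.0, 20.0, 50)   # illustrative farm-to-farm distances (km)
print(sample_infection_rate(grid)[:5])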
In this paper, we introduce a method for segmenting time-series data using tools from Bayesian nonparametrics. We consider the task of temporally segmenting a set of time series into representative stationary segments. We use Gaussian process (GP) priors to encode our knowledge about the characteristics of the underlying stationary segments, and use a nonparametric distribution, formulated in terms of a prior on segment length, to partition the sequences into such segments. Given the segmentation, the model can be viewed as a variant of a Gaussian mixture model in which the mixture components are described by the covariance function of a GP. We demonstrate the effectiveness of our model on synthetic data as well as on real heartbeat recordings, where the task is to segment the indicative types of beats and to classify the recordings into classes corresponding to healthy and abnormal heart sounds.
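As a small illustration of the GP ingredient, the snippet below computes the log marginal likelihood of a candidate segment under a zero-mean GP with a squared-exponential covariance, the kind of quantity a segmentation sampler would compare across competing boundary placements; the kernel choice and hyperparameters are our own assumptions, not the paper's settings.

import numpy as np

def gp_segment_loglik(y, lengthscale=5.0, variance=1.0, noise=0.1):
    """Log marginal likelihood of a segment y under a zero-mean GP with a
    squared-exponential covariance over the time indices, plus noise."""
    t = np.arange(len(y), dtype=float)[:, None]
    K = variance * np.exp(-0.5 * (t - t.T) ** 2 / lengthscale**2)
    K += noise * np.eye(len(y))
    _, logdet = np.linalg.slogdet(K)
    alpha = np.linalg.solve(K, y)
    return -0.5 * (y @ alpha + logdet + len(y) * np.log(2 * np.pi))

rng = np.random.default_rng(3)
segment = np.sin(np.linspace(0, 3 * np.pi, 60)) + 0.1 * rng.normal(size=60)
print(gp_segment_loglik(segment))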