New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Bayesian Probabilistic Numerical Integration with Tree-Based Models

95 0 0.0 ( 0 )

Download Cite

Added by Harrison Zhu

Publication date 2020

fields Mathematical Statistics Informatics Engineering

and research's language is English

Authors Harrison Zhu - Xing Liu - Ruya Kang

Methodology Machine Learning Numerical Analysis

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Bayesian quadrature (BQ) is a method for solving numerical integration problems in a Bayesian manner, which allows users to quantify their uncertainty about the solution. The standard approach to BQ is based on a Gaussian process (GP) approximation of the integrand. As a result, BQ is inherently limited to cases where GP approximations can be done in an efficient manner, thus often prohibiting very high-dimensional or non-smooth target functions. This paper proposes to tackle this issue with a new Bayesian numerical integration algorithm based on Bayesian Additive Regression Trees (BART) priors, which we call BART-Int. BART priors are easy to tune and well-suited for discontinuous functions. We demonstrate that they also lend themselves naturally to a sequential design setting and that explicit convergence rates can be obtained in a variety of settings. The advantages and disadvantages of this new methodology are highlighted on a set of benchmark tests including the Genz functions, and on a Bayesian survey design problem.

rate research

Frank-Wolfe Bayesian Quadrature: Probabilistic Integration with Theoretical Guarantees

161 - Franc{c}ois-Xavier Briol , Chris J. Oates , Mark Girolami 2015

There is renewed interest in formulating integration as an inference problem, motivated by obtaining a full distribution over numerical error that can be propagated through subsequent computation. Current methods, such as Bayesian Quadrature, demonstrate impressive empirical performance but lack theoretical analysis. An important challenge is to reconcile these probabilistic integrators with rigorous convergence guarantees. In this paper, we present the first probabilistic integrator that admits such theoretical treatment, called Frank-Wolfe Bayesian Quadrature (FWBQ). Under FWBQ, convergence to the true value of the integral is shown to be exponential and posterior contraction rates are proven to be superexponential. In simulations, FWBQ is competitive with state-of-the-art methods and out-performs alternatives based on Frank-Wolfe optimisation. Our approach is applied to successfully quantify numerical error in the solution to a challenging model choice problem in cellular biology.

Machine Learning

Tree-based Node Aggregation in Sparse Graphical Models

109 - Ines Wilms , Jacob Bien 2021

High-dimensional graphical models are often estimated using regularization that is aimed at reducing the number of edges in a network. In this work, we show how even simpler networks can be produced by aggregating the nodes of the graphical model. We develop a new convex regularized method, called the tree-aggregated graphical lasso or tag-lasso, that estimates graphical models that are both edge-sparse and node-aggregated. The aggregation is performed in a data-driven fashion by leveraging side information in the form of a tree that encodes node similarity and facilitates the interpretation of the resulting aggregated nodes. We provide an efficient implementation of the tag-lasso by using the locally adaptive alternating direction method of multipliers and illustrate our proposals practical advantages in simulation and in applications in finance and biology.

Methodology Econometrics Machine Learning

Optimality Criteria for Probabilistic Numerical Methods

88 - Chris. J. Oates , Jon Cockayne , Dennis Prangle 2019

It is well understood that Bayesian decision theory and average case analysis are essentially identical. However, if one is interested in performing uncertainty quantification for a numerical task, it can be argued that standard approaches from the decision-theoretic framework are neither appropriate nor sufficient. Instead, we consider a particular optimality criterion from Bayesian experimental design and study its implied optimal information in the numerical context. This information is demonstrated to differ, in general, from the information that would be used in an average-case-optimal numerical method. The explicit connection to Bayesian experimental design suggests several distinct regimes in which optimal probabilistic numerical methods can be developed.

Methodology

Controlling the False Split Rate in Tree-Based Aggregation

118 - Simeng Shao , Jacob Bien , Adel Javanmard 2021

In many domains, data measurements can naturally be associated with the leaves of a tree, expressing the relationships among these measurements. For example, companies belong to industries, which in turn belong to ever coarser divisions such as sectors; microbes are commonly arranged in a taxonomic hierarchy from species to kingdoms; street blocks belong to neighborhoods, which in turn belong to larger-scale regions. The problem of tree-based aggregation that we consider in this paper asks which of these tree-defined subgroups of leaves should really be treated as a single entity and which of these entities should be distinguished from each other. We introduce the false split rate, an error measure that describes the degree to which subgroups have been split when they should not have been. We then propose a multiple hypothesis testing algorithm for tree-based aggregation, which we prove controls this error measure. We focus on two main examples of tree-based aggregation, one which involves aggregating means and the other which involves aggregating regression coefficients. We apply this methodology to aggregate stocks based on their volatility and to aggregate neighborhoods of New York City based on taxi fares.

Methodology Machine Learning Statistics Theory

Bayesian hierarchical stacking: Some models are (somewhere) useful

325 - Yuling Yao , Gregor Pirv{s} , Aki Vehtari 2021

Stacking is a widely used model averaging technique that asymptotically yields optimal predictions among linear averages. We show that stacking is most effective when model predictive performance is heterogeneous in inputs, and we can further improve the stacked mixture with a hierarchical model. We generalize stacking to Bayesian hierarchical stacking. The model weights are varying as a function of data, partially-pooled, and inferred using Bayesian inference. We further incorporate discrete and continuous inputs, other structured priors, and time series and longitudinal data. To verify the performance gain of the proposed method, we derive theory bounds, and demonstrate on several applied problems.

Methodology Machine Learning Machine Learning

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Bayesian Probabilistic Numerical Integration with Tree-Based Models

Ask ChatGPT about the research

No Arabic abstract

Read More

suggested questions