Technical Note -- Knowledge Gradient for Selection with Covariates: Consistency and Computation

Added by Xiaowei Zhang

Publication date 2019

Research language: English

Authors Liang Ding - L. Jeff Hong - Haihui Shen

Knowledge gradient is a design principle for developing Bayesian sequential sampling policies to solve optimization problems. In this paper we consider the ranking and selection problem in the presence of covariates, where the best alternative is not universal but depends on the covariates. In this context, we prove that, under minimal assumptions, the sampling policy based on knowledge gradient is consistent, in the sense that by following the policy, the best alternative as a function of the covariates will be identified almost surely as the number of samples grows. We also propose a stochastic gradient ascent algorithm for computing the sampling policy and demonstrate its performance via numerical experiments.
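
For readers unfamiliar with the knowledge gradient idea, the following is a minimal sketch of the classical knowledge gradient factor for ranking and selection without covariates, assuming independent normal beliefs and normally distributed sampling noise with known variance. It is not the covariate-dependent policy or the stochastic gradient ascent algorithm of the paper; the function name `knowledge_gradient` and its parameters are hypothetical and chosen only for illustration.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def knowledge_gradient(means, variances, noise_variances):
    """Classical KG factor per alternative (illustrative, no covariates).

    Assumes independent normal posteriors N(means[i], variances[i]) and
    normal sampling noise with known variance noise_variances[i].
    """
    if len(means) < 2:
        raise ValueError("need at least two alternatives")
    kg = []
    for i, (mu, var, noise) in enumerate(zip(means, variances, noise_variances)):
        # Standard deviation of the change in the posterior mean
        # caused by one more sample of alternative i.
        sigma_tilde = var / math.sqrt(var + noise) if var > 0 else 0.0
        if sigma_tilde == 0.0:
            kg.append(0.0)  # nothing left to learn about this alternative
            continue
        best_other = max(m for j, m in enumerate(means) if j != i)
        z = -abs(mu - best_other) / sigma_tilde
        kg.append(sigma_tilde * (z * normal_cdf(z) + normal_pdf(z)))
    return kg


if __name__ == "__main__":
    # Two alternatives with tied means; the first is more uncertain.
    print(knowledge_gradient([0.0, 0.0], [4.0, 1.0], [1.0, 1.0]))
```

Under these assumptions, a one-step knowledge gradient policy samples the alternative with the largest factor. With tied means, that is the alternative whose belief is most uncertain.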

Strong consistency and optimality for generalized estimating equations with stochastic covariates

Laura Dumitrescu, Ioana Schiopu-Kratina (2017)

In this article we study the existence and strong consistency of GEE estimators, when the generalized estimating functions are martingales with random coefficients. Furthermore, we characterize estimating functions which are asymptotically optimal.

Statistics Theory

Optimal Convergence for Stochastic Optimization with Multiple Expectation Constraints

Kinjal Basu, Preetam Nandy (2019)

In this paper, we focus on the problem of stochastic optimization where the objective function can be written as an expectation function over a closed convex set. We also consider multiple expectation constraints which restrict the domain of the problem. We extend the cooperative stochastic approximation algorithm from Lan and Zhou [2016] to solve the particular problem. We close the gaps in the previous analysis and provide a novel proof technique to show that our algorithm attains the optimal rate of convergence for both optimality gap and constraint violation when the functions are generally convex. We also compare our algorithm empirically to the state-of-the-art and show improved convergence in many situations.

Statistics Theory, Optimization and Control, Methodology

Consistency of Bayesian procedures for variable selection

George Casella, F. Javier Giron, M. Lina Martinez (2009)

It has long been known that for the comparison of pairwise nested models, a decision based on the Bayes factor produces a consistent model selector (in the frequentist sense). Here we go beyond the usual consistency for nested pairwise models, and show that for a wide class of prior distributions, including intrinsic priors, the corresponding Bayesian procedure for variable selection in normal regression is consistent in the entire class of normal linear models. We find that the asymptotics of the Bayes factors for intrinsic priors are equivalent to those of the Schwarz (BIC) criterion. Also, recall that the Jeffreys--Lindley paradox refers to the well-known fact that a point null hypothesis on the normal mean parameter is always accepted when the variance of the conjugate prior goes to infinity. This implies that some limiting forms of proper prior distributions are not necessarily suitable for testing problems. Intrinsic priors are limits of proper prior distributions, and for finite sample sizes they have been proved to behave extremely well for variable selection in regression; a consequence of our results is that for intrinsic priors Lindley's paradox does not arise.

Statistics Theory

Consistency of archetypal analysis

Braxton Osting, Dong Wang, Yiming Xu (2020)

Archetypal analysis is an unsupervised learning method that uses a convex polytope to summarize multivariate data. For fixed $k$, the method finds a convex polytope with $k$ vertices, called archetype points, such that the polytope is contained in the convex hull of the data and the mean squared distance between the data and the polytope is minimal. In this paper, we prove a consistency result that shows if the data is independently sampled from a probability measure with bounded support, then the archetype points converge to a solution of the continuum version of the problem, of which we identify and establish several properties. We also obtain the convergence rate of the optimal objective values under appropriate assumptions on the distribution. If the data is independently sampled from a distribution with unbounded support, we also prove a consistency result for a modified method that penalizes the dispersion of the archetype points. Our analysis is supported by detailed computational experiments of the archetype points for data sampled from the uniform distribution in a disk, the normal distribution, an annular distribution, and a Gaussian mixture model.

Statistics Theory, Optimization and Control, Probability

Law of the Iterated Logarithm and Model Selection Consistency for GLMs with Independent and Dependent Responses

Xiaowei Yang, Shuang Song, Huiming Zhang (2019)

We study the law of the iterated logarithm (LIL) for the maximum likelihood estimation of the parameters (as a convex optimization problem) in generalized linear models with independent or weakly dependent ($\rho$-mixing, $m$-dependent) responses under mild conditions. The LIL is useful for deriving asymptotic bounds on the discrepancy between the empirical process of the log-likelihood function and the true log-likelihood. As an application of the LIL, the strong consistency of some penalized likelihood based model selection criteria can be shown. Under some regularity conditions, the model selection criterion selects the simplest correct model almost surely when the penalty term increases with model dimension and has an order higher than $O(\log\log n)$ but lower than $O(n)$. Simulation studies are implemented to verify the selection consistency of BIC.

Statistics Theory, Probability
