Restricting exchangeable nonparametric distributions

Published by: Sinead Williamson

Publication date: 2012

Research field: Mathematical statistics

Language: English

Authors: Sinead Williamson, Zoubin Ghahramani, Steven N. MacEachern

Methodology, Machine Learning

Abstract

Distributions over exchangeable matrices with infinitely many columns, such as the Indian buffet process, are useful in constructing nonparametric latent variable models. However, the distribution implied by such models over the number of features exhibited by each data point may be poorly suited for many modeling tasks. In this paper, we propose a class of exchangeable nonparametric priors obtained by restricting the domain of existing models. Such models allow us to specify the distribution over the number of features per data point, and can achieve better performance on data sets where the number of features is not well modeled by the original distribution.
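As a rough illustration of what restricting the domain can mean, the sketch below draws shared feature probabilities from a finite Beta-Bernoulli approximation to the Indian buffet process, then keeps only rows with exactly a chosen number of active features. The finite truncation, the per-row rejection step, and all names (`sample_restricted_rows`, `target_count`, `max_tries`) are assumptions made for illustration; the abstract does not describe the authors' construction or inference algorithm at this level of detail.

```python
import random

def sample_restricted_rows(n_rows, n_features, alpha, target_count, rng,
                           max_tries=100_000):
    """Draw a binary matrix whose rows each have exactly `target_count` ones.

    Assumption: a finite Beta-Bernoulli model with
    pi_k ~ Beta(alpha / n_features, 1) stands in for the IBP.
    """
    # Shared feature probabilities; rows are independent given these.
    pi = [rng.betavariate(alpha / n_features, 1.0) for _ in range(n_features)]
    rows = []
    for _ in range(n_rows):
        for _ in range(max_tries):
            row = [1 if rng.random() < p else 0 for p in pi]
            # Restriction step: reject rows outside the allowed domain.
            if sum(row) == target_count:
                rows.append(row)
                break
        else:
            raise RuntimeError("no row with the target feature count was found")
    return rows

rng = random.Random(0)
Z = sample_restricted_rows(n_rows=5, n_features=30, alpha=5.0,
                           target_count=2, rng=rng)
for row in Z:
    print(sum(row))  # each row has exactly target_count active features
```

Fixing a single count is the simplest case. Drawing `target_count` per row from a user-chosen distribution would let the number of features per data point follow that distribution instead.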

84 - Alden Green, Cosma Rohilla Shalizi 2017

We introduce two new bootstraps for exchangeable random graphs. One, the empirical graphon, is based purely on resampling, while the other, the histogram stochastic block model, is a model-based sieve bootstrap. We show that both of them accurately approximate the sampling distributions of motif densities, i.e., of the normalized counts of the number of times fixed subgraphs appear in the network. These densities characterize the distribution of (infinite) exchangeable networks. Our bootstraps therefore give, for the first time, a valid quantification of uncertainty in inferences about fundamental network statistics, and so of parameters identifiable from them.

Methodology
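A resampling bootstrap of this kind can be sketched as follows, assuming the usual reading in which vertices are drawn with replacement and the resampled graph inherits edges from the source vertices. The abstract gives no algorithmic detail, so the function names and the toy graph are hypothetical, and edge density stands in for the simplest motif density.

```python
import random

def empirical_graphon_resample(adj, rng):
    """Resample a graph by drawing its vertices with replacement."""
    n = len(adj)
    picks = [rng.randrange(n) for _ in range(n)]
    # Resampled vertices a and b are linked iff their source vertices were.
    # Two copies of the same source vertex are never linked, as adj[v][v] == 0.
    return [[adj[picks[a]][picks[b]] for b in range(n)] for a in range(n)]

def edge_density(adj):
    """Fraction of vertex pairs joined by an edge."""
    n = len(adj)
    edges = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    return edges / (n * (n - 1) / 2)

# Toy graph (an assumption for illustration): the 4-cycle 0-1-2-3-0.
adj = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
rng = random.Random(1)
densities = [edge_density(empirical_graphon_resample(adj, rng))
             for _ in range(200)]
print(min(densities), max(densities))
```

The spread of `densities` is the bootstrap's estimate of the sampling variability of the edge density. Other motif densities would replace `edge_density` with the corresponding subgraph count.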

An automatic procedure to determine groups of nonparametric regression curves

76 - Nora M. Villanueva, Marta Sestelo, Celestino Ordóñez and Javier Roca-Pardiñas 2020

In many situations it is of interest to ascertain whether nonparametric regression curves can be grouped, especially when confronted with a considerable number of curves. The proposed testing procedure determines the groups and selects their number automatically. A simulation study is presented in order to investigate the finite sample properties of the proposed methods when compared to existing alternative procedures. Finally, the applicability of the procedure to study the geometry of a tunnel by analysing a set of cross-sections is demonstrated. The results obtained show the existence of some heterogeneity in the tunnel geometry.

Methodology, Machine Learning

Sparse-Input Neural Networks for High-dimensional Nonparametric Regression and Classification

259 - Jean Feng, Noah Simon 2017

Neural networks are usually not the tool of choice for nonparametric high-dimensional problems where the number of input features is much larger than the number of observations. Though neural networks can approximate complex multivariate functions, they generally require a large number of training observations to obtain reasonable fits, unless one can learn the appropriate network structure. In this manuscript, we show that neural networks can be applied successfully to high-dimensional settings if the true function falls in a low-dimensional subspace and proper regularization is used. We propose fitting a neural network with a sparse group lasso penalty on the first-layer input weights. This results in a neural net that only uses a small subset of the original features. In addition, we characterize the statistical convergence of the penalized empirical risk minimizer to the optimal neural network: we show that the excess risk of this penalized estimator only grows with the logarithm of the number of input features, and we show that the weights of irrelevant features converge to zero. Via simulation studies and data analyses, we show that these sparse-input neural networks outperform existing nonparametric high-dimensional estimation methods when the data have complex higher-order interactions.

Methodology, Machine Learning
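To make the penalty concrete, the sketch below computes a sparse group lasso term in which each input feature's outgoing first-layer weights form one group. It assumes the common convex-combination form, (1 - alpha) times the group L2 norm plus alpha times the L1 norm; the mixing parameter `alpha`, the scale `lam`, and the function name are illustrative assumptions, since the abstract does not spell out the exact parameterization.

```python
import math

def sparse_group_lasso_penalty(first_layer_weights, lam, alpha):
    """Sparse group lasso penalty over per-feature weight groups.

    first_layer_weights[j] holds the weights leaving input feature j.
    """
    total = 0.0
    for group in first_layer_weights:
        l2 = math.sqrt(sum(w * w for w in group))  # group-level shrinkage
        l1 = sum(abs(w) for w in group)            # within-group sparsity
        total += (1.0 - alpha) * l2 + alpha * l1
    return lam * total

# Three input features, two hidden units (toy values, not from the paper).
weights = [[3.0, -4.0], [0.0, 0.0], [0.5, 0.0]]
penalty = sparse_group_lasso_penalty(weights, lam=1.0, alpha=0.5)

# A feature is unused by the network when its whole group is zero.
active = [j for j, group in enumerate(weights) if any(w != 0.0 for w in group)]
print(penalty, active)
```

The group L2 term is what can drive an entire feature's weights to zero together, which is how such a penalty yields a network that uses only a subset of the inputs.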

Efficient nonparametric statistical inference on population feature importance using Shapley values

203 - Brian D. Williamson, Jean Feng 2020

The true population-level importance of a variable in a prediction task provides useful knowledge about the underlying data-generating mechanism and can help in deciding which measurements to collect in subsequent experiments. Valid statistical inference on this importance is a key component in understanding the population of interest. We present a computationally efficient procedure for estimating and obtaining valid statistical inference on the Shapley Population Variable Importance Measure (SPVIM). Although the computational complexity of the true SPVIM scales exponentially with the number of variables, we propose an estimator based on randomly sampling only $\Theta(n)$ feature subsets given $n$ observations. We prove that our estimator converges at an asymptotically optimal rate. Moreover, by deriving the asymptotic distribution of our estimator, we construct valid confidence intervals and hypothesis tests. Our procedure has good finite-sample performance in simulations, and for an in-hospital mortality prediction task produces similar variable importance estimates when different machine learning algorithms are applied.

Methodology, Machine Learning

A survey of non-exchangeable priors for Bayesian nonparametric models

650 - Nicholas J. Foti, Sinead Williamson 2012

Dependent nonparametric processes extend distributions over measures, such as the Dirichlet process and the beta process, to give distributions over collections of measures, typically indexed by values in some covariate space. Such models are appropriate priors when exchangeability assumptions do not hold, and instead we want our model to vary fluidly with some set of covariates. Since the concept of dependent nonparametric processes was formalized by MacEachern [1], there have been a number of models proposed and used in the statistics and machine learning literatures. Many of these models exhibit underlying similarities, an understanding of which, we hope, will help in selecting an appropriate prior, developing new models, and leveraging inference techniques.

Machine Learning
