
Arbres CART et Forêts aléatoires, Importance et sélection de variables

Published by Robin Genuer
Publication date: 2016
Research field: Mathematical Statistics
Language: English
Author: Robin Genuer





This article covers two algorithms proposed by Leo Breiman: CART (Classification And Regression Trees), introduced in the first half of the 1980s, and random forests, which emerged in the early 2000s. The goal is to provide, for each topic, a presentation, a theoretical guarantee, an example, and some variants and extensions. After a preamble, the introduction recalls the objectives of classification and regression problems before retracing some predecessors of random forests. A section is then devoted to CART trees, after which random forests are presented. Next, a variable selection procedure based on permutation variable importance is proposed. Finally, the adaptation of random forests to the Big Data context is sketched.


Read also

Olivier Marchal, 2017
The goal of this Habilitation à diriger des recherches is to present two different applications of the topological recursion developed by B. Eynard and N. Orantin in 2007: computations of certain partition functions in probability, and applications to integrable systems. Since its creation, the range of applications of the topological recursion has been growing, and many results in different fields have been obtained. The first aspect that I will develop deals with the historical domain of the topological recursion: random matrix integrals. I will review the formalism of the topological recursion as well as how it can be used to obtain asymptotic $\frac{1}{N}$ series expansions of various matrix integrals. In particular, a key feature of the topological recursion is that it can recover all sub-leading orders from the leading order of the asymptotics with elementary computations. This method is particularly well known and fruitful in the case of Hermitian matrix integrals, but I will also show that the general method can be used to cover integrals with hard edges, integrals over unitary matrices and much more. In the end, I will also briefly mention the generalization to $\beta$-ensembles. In a second chapter, I will review the connection between the topological recursion and the study of integrable systems having a Lax pair representation. Most of the results presented there will be illustrated by the case of the famous six Painlevé equations. Though the formalism used in this chapter may look completely disconnected from the previous one, it is well known that the local statistics of eigenvalues in random matrix theory exhibit a universality phenomenon and that the encountered universal systems are precisely driven by some solutions of the Painlevé equations. As I will show, the connection can be made very explicit with the topological recursion formalism.
We describe the behaviour of the rank of the Mordell-Weil group of the Picard variety of the generic fibre of a fibration in terms of local contributions given by averaging traces of Frobenius acting on the fibres. The results give a reinterpretation of Tate's conjecture (for divisors) and generalise previous results of Nagao, Rosen-Silverman and the authors.
We show that the moving frame equations of Bonnet surfaces can be extrapolated to a second order, isomonodromic matrix Lax pair of the sixth Painlevé equation.
In this article, we propose new Bayesian methods for selecting and estimating a sparse coefficient vector for a skewed heteroscedastic response. Our novel Bayesian procedures effectively estimate the median and other quantile functions, accommodate non-local priors for regression effects without compromising ease of implementation via sampling-based tools, and asymptotically select the true set of predictors even when the number of covariates increases at the same order as the sample size. We also extend our method to deal with some observations with very large errors. Via simulation studies and a re-analysis of a medical cost study with a large number of potential predictors, we illustrate the ease of implementation and other practical advantages of our approach compared to existing methods for such studies.