When do stepwise algorithms meet subset selection criteria?

Submitted by Xiaoming Huo
Publication date: 2007
Research field: Mathematical Statistics
Language: English





Recent results in homotopy and solution paths demonstrate that certain well-designed greedy algorithms, with a range of values of the algorithmic parameter, can provide solution paths to a sequence of convex optimization problems. On the other hand, in regression many existing criteria in subset selection (including $C_p$, AIC, BIC, MDL, RIC, etc.) involve optimizing an objective function that contains a counting measure. The two optimization problems are formulated as (P1) and (P0) in the present paper. The latter is generally combinatorial and has been proven to be NP-hard. We study the conditions under which the two optimization problems have common solutions. Hence, in these situations a stepwise algorithm can be used to solve the seemingly unsolvable problem. Our main result is motivated by recent work in sparse representation, while the other two emerge from different angles: a direct analysis of sufficiency and necessity, and a condition on the most correlated covariates. An extreme example connected with least angle regression is of independent interest.
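As a rough illustration (the paper's exact formulation of (P0) and (P1) may differ), in linear regression with response $y$ and design matrix $X$ the two problems are typically written as a penalized least-squares problem with a counting measure and its convex relaxation:

(P0): $\min_\beta \frac{1}{2}\|y - X\beta\|_2^2 + \lambda_0 \|\beta\|_0$

(P1): $\min_\beta \frac{1}{2}\|y - X\beta\|_2^2 + \lambda_1 \|\beta\|_1$

Here $\|\beta\|_0$ counts the nonzero coefficients, which is the counting measure underlying $C_p$, AIC, BIC, MDL and RIC-type criteria, while the $\ell_1$ penalty in (P1) is convex and its solution path can be traced by stepwise/homotopy algorithms such as least angle regression.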




Read also

We give the first single-pass streaming algorithm for Column Subset Selection with respect to the entrywise $\ell_p$-norm with $1 \leq p < 2$. We study the $\ell_p$ norm loss since it is often considered more robust to noise than the standard Frobenius norm. Given an input matrix $A \in \mathbb{R}^{d \times n}$ ($n \gg d$), our algorithm achieves a multiplicative $k^{\frac{1}{p} - \frac{1}{2}}\text{poly}(\log nd)$-approximation to the error with respect to the best possible column subset of size $k$. Furthermore, the space complexity of the streaming algorithm is optimal up to a logarithmic factor. Our streaming algorithm also extends naturally to a 1-round distributed protocol with nearly optimal communication cost. A key ingredient in our algorithms is a reduction to column subset selection in the $\ell_{p,2}$-norm, which corresponds to the $p$-norm of the vector of Euclidean norms of each of the columns of $A$. This enables us to leverage strong coreset constructions for the Euclidean norm, which previously had not been applied in this context. We also give the first provable guarantees for greedy column subset selection in the $\ell_{1,2}$ norm, which can be used as an alternative, practical subroutine in our algorithms. Finally, we show that our algorithms give significant practical advantages on real-world data analysis tasks.
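As a rough, hypothetical illustration of greedy column subset selection under the $\ell_{1,2}$ error (the natural greedy scheme, not necessarily the authors' exact subroutine), one can repeatedly add the column that most reduces the $\ell_{1,2}$ norm of the projection residual:

import numpy as np

def l12_norm(R):
    # l_{1,2} norm: sum of the Euclidean norms of the columns of R
    return np.linalg.norm(R, axis=0).sum()

def greedy_css_l12(A, k):
    """Greedy column subset selection under the l_{1,2} error.

    A minimal sketch (not the paper's algorithm): at each step, pick the
    column whose addition minimizes the l_{1,2} norm of the residual
    A - P_S A, where P_S projects onto the span of the selected columns.
    """
    d, n = A.shape
    selected = []
    for _ in range(k):
        best_j, best_err = None, np.inf
        for j in range(n):
            if j in selected:
                continue
            S = A[:, selected + [j]]
            # Least-squares projection of every column of A onto span(S)
            coeffs, *_ = np.linalg.lstsq(S, A, rcond=None)
            err = l12_norm(A - S @ coeffs)
            if err < best_err:
                best_j, best_err = j, err
        selected.append(best_j)
    return selected

# Example usage on a random matrix
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 100))
print(greedy_css_l12(A, k=5))

This brute-force sketch keeps the full matrix in memory and rescans every column at each step, which is exactly what the single-pass streaming setting of the paper does not allow.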
We derive independence tests by means of dependence measure thresholding in a semiparametric context. Precisely, estimates of $\phi$-mutual informations, associated with $\phi$-divergences between a joint distribution and the product distribution of its margins, are derived through the dual representation of $\phi$-divergences. The asymptotic properties of the proposed estimates are established, including consistency, asymptotic distributions and a large deviations principle. The obtained tests of independence are compared via their relative asymptotic Bahadur efficiency and numerical simulations. It follows that the proposed semiparametric Kullback-Leibler mutual information test is the optimal one. On the other hand, the proposed approach provides a new method for estimating the Kullback-Leibler mutual information in a semiparametric setting, as well as a model selection procedure in a large class of dependency models including semiparametric copulas.
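For reference, the $\phi$-mutual information referred to above is the $\phi$-divergence between the joint distribution and the product of its margins; in standard notation (the paper's conventions may differ),

$I_\phi(X, Y) = D_\phi\left(P_{XY} \,\|\, P_X \otimes P_Y\right) = \int \phi\!\left(\frac{dP_{XY}}{d(P_X \otimes P_Y)}\right) d(P_X \otimes P_Y),$

so the choice $\phi(t) = t \log t$ recovers the Kullback-Leibler mutual information, and the corresponding independence test rejects when an estimate of $I_\phi$ exceeds a threshold.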
It is known that there is a dichotomy in the performance of model selectors. Those that are consistent (having the oracle property) do not achieve the asymptotic minimax rate for prediction error. We look at this phenomenon closely, and argue that the set of parameters on which this dichotomy occurs is extreme, even pathological, and should not be considered when evaluating model selectors. We characterize this set, and show that, when such parameters are dismissed from consideration, consistency and asymptotic minimaxity can be attained simultaneously.
Zhichao Jiang, Peng Ding (2016)
We show that if the exposure and the outcome affect the selection indicator in the same direction and have non-positive interaction on the risk difference, risk ratio or odds ratio scale, then the exposure-outcome odds ratio in the selected population is a lower bound for the true odds ratio.
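To see why such a bound can arise (a sketch in hypothetical notation, with binary exposure $E$, outcome $D$, selection indicator $S$, and $s_{ed} = P(S=1 \mid E=e, D=d)$): conditioning on $S=1$ reweights each cell of the $2 \times 2$ table of $(E, D)$ by $s_{ed}$, so

$\mathrm{OR}_{S=1} = \mathrm{OR} \times \dfrac{s_{11}\, s_{00}}{s_{10}\, s_{01}}.$

For example, non-positive interaction on the risk ratio scale means $\frac{s_{11}}{s_{10}} \leq \frac{s_{01}}{s_{00}}$, i.e. $s_{11} s_{00} \leq s_{10} s_{01}$, so the selection factor is at most one and $\mathrm{OR}_{S=1} \leq \mathrm{OR}$; the paper's result covers the risk difference and odds ratio scales as well.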
Recently it was shown in several papers that backpropagation is able to find the global minimum of the empirical risk on the training data using over-parametrized deep neural networks. In this paper a similar result is shown for deep neural networks with the sigmoidal squasher activation function in a regression setting, and a lower bound is presented which proves that these networks do not generalize well on new data, in the sense that they do not achieve the optimal minimax rate of convergence for estimation of smooth regression functions.