Principled Non-Linear Feature Selection

Published by: Mr Dimitrios Athanasakis

Publication date: 2013

Research field: Informatics Engineering

Paper language: English

Authors: Dimitrios Athanasakis, John Shawe-Taylor, Delmiro Fernandez-Reyes

Machine learning

Recent non-linear feature selection approaches employing greedy optimisation of Centred Kernel Target Alignment (KTA) exhibit strong results in terms of generalisation accuracy and sparsity. However, they are computationally prohibitive for large datasets. We propose randSel, a randomised feature selection algorithm with attractive scaling properties. Our theoretical analysis of randSel provides strong probabilistic guarantees for correct identification of relevant features. randSel's characteristics make it an ideal candidate for identifying informative learned representations. We have conducted experiments to establish the performance of this approach, and present encouraging results, including a third-place result in the recent ICML black box learning challenge as well as competitive results for signal peptide prediction, an important problem in bioinformatics.
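
The alignment score that the abstract refers to can be illustrated with a small sketch. It uses the commonly cited definition of centred alignment, the Frobenius inner product of the centred kernel matrix and the centred label matrix yy^T divided by the product of their Frobenius norms. The linear kernel, the toy data, and all function names are assumptions made for illustration; this is not the randSel algorithm itself, whose details the abstract does not give.

```python
import math

def linear_kernel(X, features):
    """Gram matrix of a linear kernel restricted to the given feature indices."""
    return [[sum(a[f] * b[f] for f in features) for b in X] for a in X]

def centre(K):
    """Centre a Gram matrix: K_c = H K H with H = I - (1/n) 1 1^T."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - col_mean[j] + total_mean
             for j in range(n)] for i in range(n)]

def frobenius(A, B):
    """Frobenius inner product of two equally sized matrices."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def centred_alignment(K, y):
    """Centred alignment between a kernel matrix K and the labels y."""
    Kc = centre(K)
    Yc = centre([[yi * yj for yj in y] for yi in y])
    denom = math.sqrt(frobenius(Kc, Kc) * frobenius(Yc, Yc))
    return frobenius(Kc, Yc) / denom if denom > 0 else 0.0

# Hypothetical toy data: feature 0 tracks the label, feature 1 barely does.
X = [[1.0, 0.3], [2.0, -0.1], [-1.0, 0.2], [-2.0, -0.4]]
y = [1, 1, -1, -1]

a0 = centred_alignment(linear_kernel(X, [0]), y)
a1 = centred_alignment(linear_kernel(X, [1]), y)
print(a0 > a1)  # feature 0 is better aligned with the labels
```

A greedy selector would evaluate a score like this for every candidate feature at every step, which is the cost that motivates a randomised alternative.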

Read also

Learning Non-Linear Feature Maps

Dimitrios Athanasakis, John Shawe-Taylor, Delmiro Fernandez-Reyes (2013)

Feature selection plays a pivotal role in learning, particularly in areas where parsimonious features can provide insight into the underlying process, such as biology. Recent approaches for non-linear feature selection employing greedy optimisation of Centred Kernel Target Alignment (KTA), while exhibiting strong results in terms of generalisation accuracy and sparsity, can become computationally prohibitive for high-dimensional datasets. We propose randSel, a randomised feature selection algorithm with attractive scaling properties. Our theoretical analysis of randSel provides strong probabilistic guarantees for the correct identification of relevant features. Experimental results on real and artificial data show that the method successfully identifies effective features, performing better than a number of competitive approaches.

Machine learning

Effective Discriminative Feature Selection with Non-trivial Solutions

Hong Tao, Chenping Hou, Feiping Nie (2015)

Feature selection and feature transformation, the two main ways to reduce dimensionality, are often presented separately. In this paper, a feature selection method is proposed by combining the popular transformation-based dimensionality reduction method Linear Discriminant Analysis (LDA) and sparsity regularization. We impose row sparsity on the transformation matrix of LDA through $\ell_{2,1}$-norm regularization to achieve feature selection, and the resultant formulation optimizes for selecting the most discriminative features and removing the redundant ones simultaneously. The formulation is extended to the $\ell_{2,p}$-norm regularized case, which is more likely to offer better sparsity when $0<p<1$. Thus the formulation is a better approximation to the feature selection problem. An efficient algorithm is developed to solve the $\ell_{2,p}$-norm based optimization problem, and it is proved that the algorithm converges when $0<p\le 2$. Systematic experiments are conducted to understand how the proposed method works. Promising experimental results on various types of real-world data sets demonstrate the effectiveness of our algorithm.
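
To make the row-sparsity idea concrete, here is a minimal sketch of the $\ell_{2,p}$ quantity, assuming the common definition $(\sum_i \|w_i\|_2^p)^{1/p}$ over the rows $w_i$ of the transformation matrix (some papers work with the $p$-th power instead). The matrix `W` and the helper names are hypothetical, and ranking features by row norm is a standard reading of row sparsity, not necessarily the exact procedure of the paper.

```python
import math

def row_norms(W):
    """Euclidean norm of each row; row i corresponds to input feature i."""
    return [math.sqrt(sum(v * v for v in row)) for row in W]

def l2p_norm(W, p=1.0):
    """(sum_i ||w_i||_2^p)^(1/p); p = 1 gives the l2,1 norm."""
    return sum(r ** p for r in row_norms(W)) ** (1.0 / p)

def select_features(W, k):
    """Indices of the k rows with the largest norm."""
    norms = row_norms(W)
    return sorted(range(len(W)), key=lambda i: norms[i], reverse=True)[:k]

# Hypothetical 3-feature, 2-dimensional transformation matrix.
# The all-zero second row means feature 1 is discarded entirely.
W = [[3.0, 4.0],
     [0.0, 0.0],
     [0.6, 0.8]]

l21 = l2p_norm(W, p=1.0)
top2 = select_features(W, 2)
print(l21, top2)
```

Because the penalty acts on whole rows, driving a row to zero removes that feature from every projected dimension at once, which is what turns the regularizer into a feature selector.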

Machine learning

Non-convex Regularizations for Feature Selection in Ranking With Sparse SVM

Lea Laporte (2015)

Feature selection in learning to rank has recently emerged as a crucial issue. Whereas several preprocessing approaches have been proposed, only a few works have focused on integrating feature selection into the learning process. In this work, we propose a general framework for feature selection in learning to rank using SVM with a sparse regularization term. We investigate both classical convex regularizations such as $\ell_1$ or weighted $\ell_1$ and non-convex regularization terms such as log penalty, Minimax Concave Penalty (MCP) or $\ell_p$ pseudo-norm with $p<1$. Two algorithms are proposed: first, an accelerated proximal approach for solving the convex problems; second, a reweighted $\ell_1$ scheme to address the non-convex regularizations. We conduct intensive experiments on nine datasets from the Letor 3.0 and Letor 4.0 corpora. Numerical results show that the non-convex regularizations we propose lead to more sparsity in the resulting models while prediction performance is preserved. The number of features is decreased by up to a factor of six compared to the $\ell_1$ regularization. In addition, the software is publicly available on the web.
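
As an illustration of how a non-convex penalty can feed a reweighted $\ell_1$ scheme, the sketch below implements the standard textbook form of the Minimax Concave Penalty and its derivative with respect to $|w|$. Using that derivative as the per-coefficient $\ell_1$ weight for the next pass is a common construction and is an assumption here; the abstract does not spell out the paper's exact update. The parameter names `lam` and `gamma` are also illustrative.

```python
def mcp_penalty(w, lam, gamma):
    """Minimax Concave Penalty for a single coefficient w.

    Quadratic-corrected l1 up to |w| = gamma * lam, constant beyond it.
    """
    a = abs(w)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    return gamma * lam * lam / 2.0

def mcp_weight(w, lam, gamma):
    """Derivative of the MCP with respect to |w|.

    In a reweighted l1 pass this acts as the l1 weight of the coefficient:
    large coefficients get weight 0 and are no longer shrunk.
    """
    return max(0.0, lam - abs(w) / gamma)

# Hypothetical current coefficients and penalty parameters.
coefficients = [0.0, 1.0, 3.0]
weights = [mcp_weight(w, lam=1.0, gamma=2.0) for w in coefficients]
print(weights)
```

Coefficients near zero keep the full $\ell_1$ weight while large ones are left unpenalised, which is one way to see why such penalties can yield sparser models than plain $\ell_1$ without shrinking the surviving features.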

Machine learning

Fairness-Aware Unsupervised Feature Selection

Xiaoying Xing, Hongfu Liu, Chen Chen (2021)

Feature selection is a prevalent data preprocessing paradigm for various learning tasks. Due to the high cost of acquiring supervision information, unsupervised feature selection has attracted great interest recently. However, existing unsupervised feature selection algorithms do not take fairness into consideration and suffer from a high risk of amplifying discrimination by selecting features that are over-associated with protected attributes such as gender, race, and ethnicity. In this paper, we make an initial investigation of the fairness-aware unsupervised feature selection problem and develop a principled framework, which leverages kernel alignment to find a subset of high-quality features that can best preserve the information in the original feature space while being minimally correlated with protected attributes. Specifically, unlike the mainstream in-processing debiasing methods, our proposed framework can be regarded as a model-agnostic debiasing strategy that eliminates biases and discrimination before downstream learning algorithms are involved. Experimental results on multiple real-world datasets demonstrate that our framework achieves a good trade-off between utility maximization and fairness promotion.

Machine learning, Artificial intelligence

Feature-Weighted Linear Stacking

Joseph Sill, Gabor Takacs, Lester Mackey (2009)

Ensemble methods, such as stacking, are designed to boost predictive accuracy by blending the predictions of multiple machine learning models. Recent work has shown that the use of meta-features, additional inputs describing each example in a dataset, can boost the performance of ensemble methods, but the greatest reported gains have come from nonlinear procedures requiring significant tuning and training time. Here, we present a linear technique, Feature-Weighted Linear Stacking (FWLS), that incorporates meta-features for improved accuracy while retaining the well-known virtues of linear regression regarding speed, stability, and interpretability. FWLS combines model predictions linearly using coefficients that are themselves linear functions of meta-features. This technique was a key facet of the solution of the second place team in the recently concluded Netflix Prize competition. Significant increases in accuracy over standard linear stacking are demonstrated on the Netflix Prize collaborative filtering dataset.
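
The sentence about coefficients that are linear functions of meta-features can be read as the following minimal sketch: each model's blending weight is a weighted sum of the meta-features, so the blended prediction is a sum over products of model predictions and meta-features. The function name, the coefficient table `v`, and the numbers are hypothetical, and the coefficients are given directly here rather than fitted by regression.

```python
def fwls_predict(model_preds, meta_features, v):
    """Blend predictions as sum_i sum_j v[i][j] * f_i(x) * g_j(x).

    model_preds   : predictions f_i(x) of the base models for one example
    meta_features : meta-features g_j(x) of the same example
    v             : v[i][j] is the coefficient of model i on meta-feature j
    """
    return sum(v[i][j] * f * g
               for i, f in enumerate(model_preds)
               for j, g in enumerate(meta_features))

# Two base models; the first meta-feature is a constant 1 (assumed here so
# that the first column of v plays the role of an ordinary stacking weight).
model_preds = [3.0, 4.0]
meta_features = [1.0, 0.5]
v = [[0.5, 0.2],    # weight of model 0: 0.5 + 0.2 * g_1(x)
     [0.5, -0.2]]   # weight of model 1: 0.5 - 0.2 * g_1(x)

blended = fwls_predict(model_preds, meta_features, v)
print(blended)
```

Since the prediction is linear in the entries of `v`, fitting them reduces to ordinary linear regression on the product features `f_i(x) * g_j(x)`, which is consistent with the speed and interpretability the abstract emphasises.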

Machine learning, Artificial intelligence