ﻻ يوجد ملخص باللغة العربية
Data annotation using visual inspection (supervision) of each training sample can be laborious. Interactive solutions alleviate this by helping experts propagate labels from a few supervised samples to unlabeled ones based solely on the visual analysis of their feature space projection (with no further sample supervision). We present a semi-automatic data annotation approach based on suitable feature space projection and semi-supervised label estimation. We validate our method on the popular MNIST dataset and on images of human intestinal parasites with and without fecal impurities, a large and diverse dataset that makes classification very hard. We evaluate two approaches for semi-supervised learning from the latent and projection spaces, to choose the one that best reduces user annotation effort and also increases classification accuracy on unseen data. Our results demonstrate the added-value of visual analytics tools that combine complementary abilities of humans and machines for more effective machine learning.
Feature crossing captures interactions among categorical features and is useful to enhance learning from tabular data in real-world businesses. In this paper, we present AutoCross, an automatic feature crossing tool provided by 4Paradigm to its custo
In this paper, we propose a new wrapper feature selection approach with partially labeled training examples where unlabeled observations are pseudo-labeled using the predictions of an initial classifier trained on the labeled training set. The wrappe
Feature selection is a core area of data mining with a recent innovation of graph-driven unsupervised feature selection for linked data. In this setting we have a dataset $mathbf{Y}$ consisting of $n$ instances each with $m$ features and a correspond
Deep learning models are shown to be vulnerable to adversarial examples. Though adversarial training can enhance model robustness, typical approaches are computationally expensive. Recent works proposed to transfer the robustness to adversarial attac
Data augmentation is usually used by supervised learning approaches for offline writer identification, but such approaches require extra training data and potentially lead to overfitting errors. In this study, a semi-supervised feature learning pipel