Margin-based classifiers are popular in both machine learning and statistics for classification problems. Since a large number of classifiers are available, a natural question is which type of classifier should be used for a given classification task. We aim to answer this question by investigating the asymptotic performance of a family of large-margin classifiers in situations where the data dimension $p$ and the sample size $n$ are both large. This family covers a broad range of classifiers, including the support vector machine, distance weighted discrimination, penalized logistic regression, and the large-margin unified machine as special cases. The asymptotic results are described by a set of nonlinear equations, and we observe a close match between them and Monte Carlo simulations on finite data samples. Our analytical study sheds new light on how to select the best classifier among various classification methods, as well as on how to choose the optimal tuning parameters for a given method.
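To make the Monte Carlo comparison above concrete, here is a minimal Python sketch, assuming a symmetric two-class Gaussian model with a fixed ratio $p/n$ and using penalized logistic regression as the family member; the data model and all parameter values (n, p, mu, C) are illustrative assumptions, not the paper's exact setup.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Monte Carlo sketch of the generalization error of one member of the
# large-margin family (penalized logistic regression) when p and n are
# both large at a fixed ratio p/n. The Gaussian two-class model and all
# parameter values below are illustrative assumptions.

rng = np.random.default_rng(0)
n, p = 400, 200                        # fixed ratio p/n = 0.5
mu = np.full(p, 1.0 / np.sqrt(p))      # class mean direction, ||mu|| = 1

def sample(m):
    # Draw m labeled points: x = y * mu + standard Gaussian noise.
    y = rng.choice([-1.0, 1.0], size=m)
    X = y[:, None] * mu + rng.standard_normal((m, p))
    return X, y

errors = []
for _ in range(50):                    # Monte Carlo repetitions
    X, y = sample(n)
    clf = LogisticRegression(C=1.0, fit_intercept=False).fit(X, y)
    X_test, y_test = sample(10_000)
    errors.append(np.mean(clf.predict(X_test) != y_test))

print(f"test error: {np.mean(errors):.4f} +/- {np.std(errors):.4f}")

Averaging the test error over repetitions gives the finite-sample estimate that an asymptotic prediction of the generalization error can be checked against.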
Large-margin classifiers are popular methods for classification. We derive the asymptotic expression for the generalization error of a family of large-margin classifiers in the limit of both sample size $n$ and dimension $p$ going to $\infty$ at a fixed ratio $p/n$.
Cross-entropy loss together with softmax is arguably one of the most commonly used supervision components in convolutional neural networks (CNNs). Despite its simplicity, popularity and excellent performance, the component does not explicitly encourage discriminative learning of features.
The features used in many image analysis-based applications are frequently of very high dimension. Feature extraction offers several advantages in high-dimensional cases, and many recent studies have used multi-task feature extraction approaches.
We study two practically important cases of model based clustering using Gaussian Mixture Models: (1) when there is misspecification and (2) on high dimensional data, in the light of recent advances in Gradient Descent (GD) based optimization.
Modern large-scale statistical models require the estimation of thousands to millions of parameters. This is often accomplished by iterative algorithms such as gradient descent, projected gradient descent, or their accelerated variants.