ترغب بنشر مسار تعليمي؟ اضغط هنا

Multi-Task Regularization with Covariance Dictionary for Linear Classifiers

274   0   0.0 ( 0 )
 نشر من قبل Fanyi Xiao
 تاريخ النشر 2013
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

In this paper we propose a multi-task linear classifier learning problem called D-SVM (Dictionary SVM). D-SVM uses a dictionary of parameter covariance shared by all tasks to do multi-task knowledge transfer among different tasks. We formally define the learning problem of D-SVM and show two interpretations of this problem, from both the probabilistic and kernel perspectives. From the probabilistic perspective, we show that our learning formulation is actually a MAP estimation on all optimization variables. We also show its equivalence to a multiple kernel learning problem in which one is trying to find a re-weighting kernel for features from a dictionary of basis (despite the fact that only linear classifiers are learned). Finally, we describe an alternative optimization scheme to minimize the objective function and present empirical studies to valid our algorithm.



قيم البحث

اقرأ أيضاً

86 - Jie Gui , Haizhang Zhang 2021
Multi-task learning is an important trend of machine learning in facing the era of artificial intelligence and big data. Despite a large amount of researches on learning rate estimates of various single-task machine learning algorithms, there is litt le parallel work for multi-task learning. We present mathematical analysis on the learning rate estimate of multi-task learning based on the theory of vector-valued reproducing kernel Hilbert spaces and matrix-valued reproducing kernels. For the typical multi-task regularization networks, an explicit learning rate dependent both on the number of sample data and the number of tasks is obtained. It reveals that the generalization ability of multi-task learning algorithms is indeed affected as the number of tasks increases.
Federated multi-task learning (FMTL) has emerged as a natural choice to capture the statistical diversity among the clients in federated learning. To unleash the potential of FMTL beyond statistical diversity, we formulate a new FMTL problem FedU usi ng Laplacian regularization, which can explicitly leverage relationships among the clients for multi-task learning. We first show that FedU provides a unified framework covering a wide range of problems such as conventional federated learning, personalized federated learning, few-shot learning, and stratified model learning. We then propose algorithms including both communication-centralized and decentralized schemes to learn optimal models of FedU. Theoretically, we show that the convergence rates of both FedUs algorithms achieve linear speedup for strongly convex and sublinear speedup of order $1/2$ for nonconvex objectives. While the analysis of FedU is applicable to both strongly convex and nonconvex loss functions, the conventional FMTL algorithm MOCHA, which is based on CoCoA framework, is only applicable to convex case. Experimentally, we verify that FedU outperforms the vanilla FedAvg, MOCHA, as well as pFedMe and Per-FedAvg in personalized federated learning.
The dynamic ensemble selection of classifiers is an effective approach for processing label-imbalanced data classifications. However, such a technique is prone to overfitting, owing to the lack of regularization methods and the dependence of the afor ementioned technique on local geometry. In this study, focusing on binary imbalanced data classification, a novel dynamic ensemble method, namely adaptive ensemble of classifiers with regularization (AER), is proposed, to overcome the stated limitations. The method solves the overfitting problem through implicit regularization. Specifically, it leverages the properties of stochastic gradient descent to obtain the solution with the minimum norm, thereby achieving regularization; furthermore, it interpolates the ensemble weights by exploiting the global geometry of data to further prevent overfitting. According to our theoretical proofs, the seemingly complicated AER paradigm, in addition to its regularization capabilities, can actually reduce the asymptotic time and memory complexities of several other algorithms. We evaluate the proposed AER method on seven benchmark imbalanced datasets from the UCI machine learning repository and one artificially generated GMM-based dataset with five variations. The results show that the proposed algorithm outperforms the major existing algorithms based on multiple metrics in most cases, and two hypothesis tests (McNemars and Wilcoxon tests) verify the statistical significance further. In addition, the proposed method has other preferred properties such as special advantages in dealing with highly imbalanced data, and it pioneers the research on the regularization for dynamic ensemble methods.
A recent technique of randomized smoothing has shown that the worst-case (adversarial) $ell_2$-robustness can be transformed into the average-case Gaussian-robustness by smoothing a classifier, i.e., by considering the averaged prediction over Gaussi an noise. In this paradigm, one should rethink the notion of adversarial robustness in terms of generalization ability of a classifier under noisy observations. We found that the trade-off between accuracy and certified robustness of smoothed classifiers can be greatly controlled by simply regularizing the prediction consistency over noise. This relationship allows us to design a robust training objective without approximating a non-existing smoothed classifier, e.g., via soft smoothing. Our experiments under various deep neural network architectures and datasets show that the certified $ell_2$-robustness can be dramatically improved with the proposed regularization, even achieving better or comparable results to the state-of-the-art approaches with significantly less training costs and hyperparameters.
Multi-task learning (MTL) is a common paradigm that seeks to improve the generalization performance of task learning by training related tasks simultaneously. However, it is still a challenging problem to search the flexible and accurate architecture that can be shared among multiple tasks. In this paper, we propose a novel deep learning model called Task Adaptive Activation Network (TAAN) that can automatically learn the optimal network architecture for MTL. The main principle of TAAN is to derive flexible activation functions for different tasks from the data with other parameters of the network fully shared. We further propose two functional regularization methods that improve the MTL performance of TAAN. The improved performance of both TAAN and the regularization methods is demonstrated by comprehensive experiments.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا