Diversity, or complementarity, of experts in ensemble pattern recognition and information processing systems is widely observed by researchers to be crucial for achieving performance improvement upon fusion. Understanding the link between ensemble diversity and fusion performance is thus an important research question. However, prior works have theoretically characterized ensemble diversity and linked it to ensemble performance only in very restricted settings. We present a generalized ambiguity decomposition (GAD) theorem as a broad framework for answering these questions. The GAD theorem applies to a generic convex ensemble of experts with any twice-differentiable loss function. It shows that ensemble performance approximately decomposes into a difference between the average expert performance and the diversity of the ensemble. It thus provides a theoretical explanation for the empirically observed benefit of fusing outputs from diverse classifiers and regressors. It also provides a loss-function-dependent, ensemble-dependent, and data-dependent definition of diversity. We present extensions of this decomposition to common regression and classification loss functions, and report a simulation-based analysis of the diversity term and of the accuracy of the decomposition. Finally, we present experiments on standard pattern recognition data sets that demonstrate the accuracy of the decomposition for real-world classification and regression problems.
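For intuition, the squared-loss special case of this decomposition, the classical ambiguity decomposition of Krogh and Vedelsby (1995) that the GAD theorem generalizes, is exact: for a convex ensemble \(\bar{f}(x) = \sum_i w_i f_i(x)\) with \(w_i \ge 0\) and \(\sum_i w_i = 1\),

\[
\big(\bar{f}(x) - y\big)^2 \;=\; \sum_i w_i \big(f_i(x) - y\big)^2 \;-\; \sum_i w_i \big(f_i(x) - \bar{f}(x)\big)^2,
\]

so the ensemble loss equals the average expert loss minus a non-negative diversity (ambiguity) term. The identity follows by expanding \(\sum_i w_i (f_i - y)^2\) around \(\bar{f}\) and noting that the cross term \(2(\bar{f} - y)\sum_i w_i (f_i - \bar{f})\) vanishes because \(\sum_i w_i f_i = \bar{f}\).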
In this paper we propose a generalization of deep neural networks called deep function machines (DFMs). DFMs act on vector spaces of arbitrary (possibly infinite) dimension, and we show that a family of DFMs is invariant to the dimension of the input data.
In this paper we study the training dynamics of gradient flow on over-parametrized tensor decomposition problems. Empirically, such a training process often first fits larger components and then discovers smaller components, which is similar to a tensor deflation process.
Classical learning theory suggests that the optimal generalization performance of a machine learning model should occur at an intermediate model complexity, with simpler models exhibiting high bias and more complex models exhibiting high variance.
Modern biomedical studies often collect multiple types of high-dimensional data on a common set of objects. A popular model for the joint analysis of multi-type datasets decomposes each data matrix into a low-rank common-variation matrix generated by latent factors shared across the data types.
Ensembles of models have been shown empirically to improve predictive performance and to yield robust measures of uncertainty. However, they are expensive in both computation and memory. Therefore, recent research has focused on distilling ensembles into a single model.