By leveraging experience from previous tasks, meta-learning algorithms can adapt quickly and effectively to new tasks. However, it is unclear how well this adaptation generalizes to new tasks. Probably approximately correct (PAC) Bayes bound theory provides a theoretical framework for analyzing the generalization performance of meta-learning. We derive three novel generalization error bounds for meta-learning based on the PAC-Bayes relative entropy bound. Furthermore, using the empirical risk minimization (ERM) method, we develop a PAC-Bayes bound for meta-learning with a data-dependent prior. Experiments illustrate that the three proposed PAC-Bayes bounds provide competitive generalization guarantees, and that the extended bound with a data-dependent prior converges rapidly.
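For context, the single-task PAC-Bayes relative entropy (kl) bound that such meta-level results build on can be stated as follows. This is the standard Maurer/Seeger form for a [0,1]-valued loss, shown here only as background; it is not one of the paper's three meta-learning bounds.

```latex
% Standard single-task PAC-Bayes relative entropy (kl) bound
% (Maurer/Seeger form, loss bounded in [0,1]).
% With probability at least 1 - \delta over an i.i.d. sample S of size n,
% simultaneously for all posteriors Q over hypotheses (P a fixed prior):
\[
  \mathrm{kl}\!\left(\widehat{L}_S(Q) \,\middle\|\, L(Q)\right)
  \;\le\;
  \frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta}}{n},
\]
% where \widehat{L}_S(Q) and L(Q) are the expected empirical and true
% risks under Q, and kl(q || p) denotes the KL divergence between
% Bernoulli distributions with parameters q and p.
```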
We introduce a new and rigorously formulated PAC-Bayes few-shot meta-learning algorithm that implicitly learns a prior distribution over the model of interest. Our proposed method extends the PAC-Bayes framework from a single-task setting to the few-shot learning setting…
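To make the single-task-to-meta extension concrete, the objective that PAC-Bayes meta-learners of this kind typically optimize has the schematic form below. This is a generic sketch (the symbols T, q_i, p_theta, and lambda_i are illustrative), not the paper's exact bound or algorithm.

```latex
% Schematic PAC-Bayes meta-learning objective: a shared, learnable
% prior p_theta is regularized against each task posterior q_i.
% Generic form for illustration; not the paper's exact bound.
\[
  \min_{\theta}\; \frac{1}{T} \sum_{i=1}^{T} \min_{q_i}
  \left[
    \widehat{L}_i(q_i)
    + \frac{\mathrm{KL}(q_i \,\|\, p_\theta) + \ln\frac{1}{\delta}}{\lambda_i}
  \right],
\]
% where T is the number of observed tasks, \widehat{L}_i(q_i) is the
% empirical risk of task i under posterior q_i, and \lambda_i scales
% with the (few) samples available for task i.
```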
Adding domain knowledge to a learning system is known to improve results. In multi-parameter Bayesian frameworks, such knowledge is incorporated as a prior. On the other hand, various model parameters can have different learning rates in real-world problems…
The dominant term in PAC-Bayes bounds is often the Kullback–Leibler divergence between the posterior and the prior. For so-called linear PAC-Bayes risk bounds based on the empirical risk of a fixed posterior kernel, it is possible to minimize the expected…
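As a concrete illustration of why the KL term dominates, one standard linear PAC-Bayes bound (Alquier-style, for a loss bounded in [0,1] and a fixed λ > 0) is given below. It is a generic form shown for orientation and may differ in constants from the exact bound this abstract analyzes.

```latex
% A generic linear PAC-Bayes bound (Alquier-style), stated here only
% to illustrate the dominant KL(Q || P) term; the abstract's exact
% bound may differ in its constants and slack terms.
% For a loss in [0,1], fixed lambda > 0, prior P, and sample size n,
% with probability at least 1 - \delta, for all posteriors Q:
\[
  L(Q) \;\le\; \widehat{L}_S(Q)
  \;+\; \frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{1}{\delta}}{\lambda}
  \;+\; \frac{\lambda}{8n}.
\]
% The right-hand side is linear in the empirical risk \widehat{L}_S(Q),
% and for moderate \lambda the KL(Q || P) term typically dominates
% the remaining slack.
```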
Despite recent advances in its theoretical understanding, there still remains a significant gap in the ability of existing PAC-Bayesian theories on meta-learning to explain performance improvements in the few-shot learning setting, where the number of…
Neural Stochastic Differential Equations model a dynamical environment with neural nets assigned to their drift and diffusion terms. The high expressive power of their nonlinearity comes at the expense of instability in the identification of the large…
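To fix ideas, here is a minimal sketch of a neural SDE, dX_t = f_theta(X_t) dt + g_theta(X_t) dW_t, where both the drift f_theta and the diffusion g_theta are small neural nets and the path is simulated with Euler–Maruyama discretization. All names (mlp, drift_params, diff_params) are illustrative assumptions, not code from any specific paper.

```python
# Minimal neural SDE sketch: dX_t = f_theta(X_t) dt + g_theta(X_t) dW_t,
# with the drift f_theta and diffusion g_theta given by small MLPs.
# Illustrative only; not an implementation of any specific paper's model.
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(in_dim, hidden, out_dim):
    """One-hidden-layer MLP parameters with small random weights."""
    return {
        "W1": 0.1 * rng.standard_normal((hidden, in_dim)),
        "b1": np.zeros(hidden),
        "W2": 0.1 * rng.standard_normal((out_dim, hidden)),
        "b2": np.zeros(out_dim),
    }

def mlp(params, x):
    """Forward pass: tanh hidden layer, linear output."""
    h = np.tanh(params["W1"] @ x + params["b1"])
    return params["W2"] @ h + params["b2"]

def simulate(drift_params, diff_params, x0, dt=0.01, steps=500):
    """Euler-Maruyama: x_{k+1} = x_k + f(x_k) dt + g(x_k) sqrt(dt) eps."""
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    for _ in range(steps):
        eps = rng.standard_normal(x.shape)   # Brownian increment / sqrt(dt)
        drift = mlp(drift_params, x)
        # softplus keeps the (diagonal) diffusion scale positive
        diffusion = np.log1p(np.exp(mlp(diff_params, x)))
        x = x + drift * dt + diffusion * np.sqrt(dt) * eps
        path.append(x.copy())
    return np.stack(path)

dim = 2
drift_params = init_mlp(dim, 32, dim)
diff_params = init_mlp(dim, 32, dim)
path = simulate(drift_params, diff_params, x0=np.ones(dim))
print(path.shape)  # (501, 2)
```

Fitting such a model means identifying the weights of both networks from observed trajectories, and every simulated path depends on all of those weights through the sampled noise, which is where the identification instability the abstract refers to arises.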