ﻻ يوجد ملخص باللغة العربية
In this paper, we establish the ordinary differential equation (ODE) that underlies the training dynamics of Model-Agnostic Meta-Learning (MAML). Our continuous-time limit view of the process eliminates the influence of the manually chosen step size of gradient descent and includes the existing gradient descent training algorithm as a special case that results from a specific discretization. We show that the MAML ODE enjoys a linear convergence rate to an approximate stationary point of the MAML loss function for strongly convex task losses, even when the corresponding MAML loss is non-convex. Moreover, through the analysis of the MAML ODE, we propose a new BI-MAML training algorithm that significantly reduces the computational burden associated with existing MAML training methods. To complement our theoretical findings, we perform empirical experiments to showcase the superiority of our proposed methods with respect to the existing work.
Continuous-time Bayesian Networks (CTBNs) represent a compact yet powerful framework for understanding multivariate time-series data. Given complete data, parameters and structure can be estimated efficiently in closed-form. However, if data is incom
While neural networks are powerful function approximators, they suffer from catastrophic forgetting when the data distribution is not stationary. One particular formalism that studies learning under non-stationary distribution is provided by continua
The inaccessibility of controlled randomized trials due to inherent constraints in many fields of science has been a fundamental issue in causal inference. In this paper, we focus on distinguishing the cause from effect in the bivariate setting under
This paper introduces a new framework for data efficient and versatile learning. Specifically: 1) We develop ML-PIP, a general framework for Meta-Learning approximate Probabilistic Inference for Prediction. ML-PIP extends existing probabilistic inter
Hyperparameter optimization in machine learning (ML) deals with the problem of empirically learning an optimal algorithm configuration from data, usually formulated as a black-box optimization problem. In this work, we propose a zero-shot method to m