The Expectation Maximization (EM) algorithm is of key importance for inference in latent variable models, including mixtures of regressions, mixtures of experts, and models with missing observations. This paper introduces a novel EM algorithm, called \texttt{SPIDER-EM}, for inference from a training set of size $n$, $n \gg 1$. At the core of our algorithm is an estimator of the full conditional expectation in the {\sf E}-step, adapted from the stochastic path-integrated differential estimator ({\tt SPIDER}) technique. We derive finite-time complexity bounds for smooth non-convex likelihoods: we show that for convergence to an $\epsilon$-approximate stationary point, the complexity scales as $K_{\operatorname{Opt}}(n,\epsilon) = {\cal O}(\epsilon^{-1})$ and $K_{\operatorname{CE}}(n,\epsilon) = n + \sqrt{n}\, {\cal O}(\epsilon^{-1})$, where $K_{\operatorname{Opt}}(n,\epsilon)$ and $K_{\operatorname{CE}}(n,\epsilon)$ are respectively the number of {\sf M}-steps and the number of per-sample conditional expectation evaluations. This improves over the state-of-the-art algorithms. Numerical results support our findings.
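To make the mechanism concrete, here is a minimal illustrative sketch of a SPIDER-type variance-reduced {\sf E}-step, written for a two-component Gaussian mixture with unit variances. The model, the batch size, the step size gamma, and the names per_sample_stat and m_step are assumptions made for this example; this is not the paper's reference implementation.

import numpy as np

rng = np.random.default_rng(0)

# Toy data: n observations from a two-component 1-D Gaussian mixture.
n = 5000
y = np.concatenate([rng.normal(-2.0, 1.0, n // 2), rng.normal(3.0, 1.0, n // 2)])
y_mean = y.mean()

def per_sample_stat(y_batch, theta):
    # E-step on a batch: conditional expectation of the complete-data
    # sufficient statistics s_i = (r_i, r_i * y_i), r_i = P(z_i = 1 | y_i; theta).
    w, mu0, mu1 = theta
    log_p0 = np.log(1.0 - w) - 0.5 * (y_batch - mu0) ** 2
    log_p1 = np.log(w) - 0.5 * (y_batch - mu1) ** 2
    r = 1.0 / (1.0 + np.exp(np.clip(log_p0 - log_p1, -50.0, 50.0)))
    return np.stack([r, r * y_batch], axis=1)

def m_step(s_hat):
    # M-step T(s): map the averaged statistic back to (w, mu0, mu1).
    w = np.clip(s_hat[0], 1e-6, 1.0 - 1e-6)
    mu1 = s_hat[1] / w
    mu0 = (y_mean - s_hat[1]) / (1.0 - w)
    return np.array([w, mu0, mu1])

theta = np.array([0.5, -1.0, 1.0])   # initial (w, mu0, mu1)
batch, gamma = 64, 0.05              # assumed batch size and step size
for epoch in range(20):
    # Outer step: refresh the estimator with a full pass over the n samples.
    S = per_sample_stat(y, theta).mean(axis=0)
    s_hat, theta_prev = S.copy(), theta.copy()
    for t in range(n // batch):
        idx = rng.integers(0, n, batch)
        # SPIDER path-integrated difference: correct the running estimate of the
        # full E-step statistic using the same mini-batch at theta_t and theta_{t-1}.
        S = S + (per_sample_stat(y[idx], theta).mean(axis=0)
                 - per_sample_stat(y[idx], theta_prev).mean(axis=0))
        theta_prev = theta.copy()
        s_hat = s_hat + gamma * (S - s_hat)   # stochastic-approximation smoothing
        theta = m_step(s_hat)                 # M-step

print("estimated (w, mu0, mu1):", np.round(theta, 3))

The full pass at the start of each epoch resets the accumulated error of the path-integrated estimator, so the inner loop only pays for mini-batch conditional expectations; this is the source of the $n + \sqrt{n}\,{\cal O}(\epsilon^{-1})$ scaling of $K_{\operatorname{CE}}$ quoted above.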
Many real-world tasks, such as reasoning and physical interaction, require the identification and manipulation of conceptual entities. A first step towards solving these tasks is the automated discovery of distributed symbol-like representations. …
Fast Incremental Expectation Maximization (FIEM) is a version of the EM framework for large datasets. In this paper, we first recast FIEM and other incremental EM-type algorithms in the {\em Stochastic Approximation within EM} framework. …
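For orientation, the Stochastic Approximation within EM template can be written as follows; the notation $\hat{s}_k$, $\mathsf{S}_{k+1}$, $\gamma_{k+1}$ and $\mathsf{T}$ is ours, and the specific choice of the random field $\mathsf{S}_{k+1}$ is what distinguishes FIEM from the other incremental variants:
\[
\hat{s}_{k+1} = \hat{s}_k + \gamma_{k+1}\bigl(\mathsf{S}_{k+1} - \hat{s}_k\bigr), \qquad \theta_{k+1} = \mathsf{T}(\hat{s}_{k+1}),
\]
where $\mathsf{S}_{k+1}$ is a random approximation of the full {\sf E}-step statistic $\bar{s}(\theta_k) = n^{-1}\sum_{i=1}^{n}\bar{s}_i(\theta_k)$, $\gamma_{k+1}$ is a step size, and $\mathsf{T}$ denotes the {\sf M}-step map.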
Common-sense physical reasoning is an essential ingredient for any intelligent agent operating in the real world. For example, it can be used to simulate the environment, or to infer the state of parts of the world that are currently unobserved. …
Any clustering algorithm must simultaneously learn to model the clusters and to allocate data to those clusters in the absence of labels. Mixture-model-based methods model clusters with pre-defined statistical distributions and allocate data to those clusters …
Object detection when provided image-level labels instead of instance-level labels (i.e., bounding boxes) during training is an important problem in computer vision, since large-scale image datasets with instance-level labels are extremely costly to …