Abstract
Expectation-Maximization (EM) is a popular tool for learning latent variable models, but vanilla batch EM does not scale to large data sets because the whole data set is needed at every E-step. Stochastic Expectation Maximization (sEM) reduces the cost of the E-step by stochastic approximation. However, sEM has a slower asymptotic convergence rate than batch EM, and it requires a decreasing sequence of step sizes, which is difficult to tune. In this paper, we propose a variance-reduced stochastic EM (sEM-vr) algorithm inspired by variance-reduced stochastic gradient descent algorithms. We show that sEM-vr has the same exponential asymptotic convergence rate as batch EM. Moreover, sEM-vr requires only a constant step size to achieve this rate, which alleviates the burden of parameter tuning. We compare sEM-vr with batch EM, sEM, and other algorithms on Gaussian mixture models and probabilistic latent semantic analysis; sEM-vr converges significantly faster than these baselines.
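To make the variance-reduction idea concrete, the following is a minimal sketch of an SVRG-style corrected E-step for a one-dimensional Gaussian mixture with fixed unit variances. This is an illustration under our own assumptions, not the paper's implementation: the function names (`suff_stats`, `m_step`), the toy data, and the step size `rho` are all hypothetical.

```python
# Sketch of a variance-reduced stochastic E-step (SVRG-style correction of
# per-point sufficient statistics) for a 1-D Gaussian mixture with unit
# variances. All names and hyperparameters here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def suff_stats(x, weights, means):
    """Per-point expected sufficient statistics [r_k, r_k * x] per component."""
    log_r = np.log(weights) - 0.5 * (x[:, None] - means) ** 2
    r = np.exp(log_r - log_r.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)          # responsibilities, rows sum to 1
    return np.stack([r, r * x[:, None]], axis=-1)  # shape (n, K, 2)

def m_step(s):
    """Recover (weights, means) from averaged sufficient statistics."""
    weights = s[:, 0] / s[:, 0].sum()
    means = s[:, 1] / s[:, 0]
    return weights, means

# Toy data from a two-component mixture.
x = np.concatenate([rng.normal(-2.0, 1.0, 500), rng.normal(2.0, 1.0, 500)])
n, K, rho = len(x), 2, 0.05                    # constant step size

weights, means = np.ones(K) / K, np.array([-1.0, 1.0])
s = suff_stats(x, weights, means).mean(axis=0)

for epoch in range(20):
    # Snapshot: full-batch statistics at the current (anchor) parameters.
    s_full = suff_stats(x, weights, means)     # per-point stats at the anchor
    s_bar = s_full.mean(axis=0)                # full-batch average
    for _ in range(n):
        i = rng.integers(n)
        # Variance-reduced estimate: single-point stats at the current
        # parameters, corrected by the same point's stats at the anchor
        # plus the full-batch average (unbiased, shrinking variance).
        s_hat = suff_stats(x[i:i + 1], weights, means)[0] - s_full[i] + s_bar
        s = (1 - rho) * s + rho * s_hat        # constant-step stochastic update
        weights, means = m_step(s)

print("estimated weights:", np.round(weights, 3))
print("estimated means:", np.round(means, 3))
```

The key line is the corrected estimate `s_hat`, which replaces the plain single-point statistics that sEM would use; because the correction is unbiased and its variance shrinks as the parameters approach the snapshot, a constant step size can suffice, consistent with the abstract's claim.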
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 7967-7977 |
| Number of pages | 11 |
| Journal | Advances in Neural Information Processing Systems |
| Volume | 2018-December |
| Publication status | Published - 2018 |
| Externally published | Yes |
| Event | 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montreal, Canada, 2 Dec 2018 – 8 Dec 2018 |
Bibliographical note
Publisher Copyright: © 2018 Curran Associates Inc. All rights reserved.