Communications in Mathematical Sciences

Volume 16 (2018)

Number 3

Semigroups of stochastic gradient descent and online principal component analysis: properties and diffusion approximations

Pages: 777 – 789

DOI: http://dx.doi.org/10.4310/CMS.2018.v16.n3.a8

Authors

Yuanyuan Feng (Department of Mathematics, Carnegie Mellon University, Pittsburgh, Pennsylvania, U.S.A.)

Lei Li (Department of Mathematics, Duke University, Durham, North Carolina, U.S.A.)

Jian-Guo Liu (Departments of Mathematics and Physics, Duke University, Durham, North Carolina, U.S.A.)

Abstract

We study the Markov semigroups for two important algorithms from machine learning: stochastic gradient descent (SGD) and online principal component analysis (PCA). We investigate the effects of small jumps on the properties of the semigroups. Properties including regularity preservation and $L^{\infty}$ contraction are discussed. These semigroups are the duals of the semigroups for the evolution of probability, while the latter are $L^1$ contracting and positivity preserving. Using these properties, we show that stochastic differential equations (SDEs) in $\mathbb{R}^d$ (on the sphere $\mathbb{S}^{d-1}$) can be used to approximate SGD (online PCA) weakly. These SDEs may be used to provide some insight into the behavior of these algorithms.
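
As a rough illustration of what weak approximation of SGD by an SDE means, the sketch below compares $\mathbb{E}[\varphi(x_n)]$ for SGD iterates with $\mathbb{E}[\varphi(X_{n\eta})]$ for a diffusion. Everything specific in it is an assumption made for illustration and is not taken from the paper: the one-dimensional toy loss $f(x;\xi) = \tfrac12 (x-\xi)^2$ with $\xi \sim N(0,1)$, the step size `eta`, the test function $\varphi(x) = x^2$, and the commonly used diffusion form $dX = -\nabla f(X)\,dt + \sqrt{\eta\,\Sigma}\,dW$, which for this toy loss reduces to $dX = -X\,dt + \sqrt{\eta}\,dW$. The SDE is integrated with a plain Euler-Maruyama scheme.

```python
import math
import random


def sgd_paths(x0, eta, n_steps, n_paths, rng):
    """SGD on the toy loss f(x; xi) = 0.5 * (x - xi)**2, xi ~ N(0, 1).

    Returns the final iterate of each independent run.
    """
    finals = []
    for _ in range(n_paths):
        x = x0
        for _ in range(n_steps):
            xi = rng.gauss(0.0, 1.0)
            x -= eta * (x - xi)  # stochastic gradient of the toy loss is x - xi
        finals.append(x)
    return finals


def sde_paths(x0, eta, n_steps, n_paths, rng, substeps=4):
    """Euler-Maruyama for the assumed SDE dX = -X dt + sqrt(eta) dW.

    Integrates up to time T = n_steps * eta, so one SGD step
    corresponds to a time interval of length eta.
    """
    dt = eta / substeps
    noise_scale = math.sqrt(eta * dt)  # sqrt(eta) * sqrt(dt)
    finals = []
    for _ in range(n_paths):
        x = x0
        for _ in range(n_steps * substeps):
            x += -x * dt + noise_scale * rng.gauss(0.0, 1.0)
        finals.append(x)
    return finals


def mean_of(phi, xs):
    """Monte Carlo estimate of E[phi(X)] from samples xs."""
    return sum(phi(x) for x in xs) / len(xs)


def phi(x):
    """Test function used to compare the two processes weakly."""
    return x * x


# Hypothetical parameters chosen only for this illustration.
rng = random.Random(0)
x0, eta, n_steps, n_paths = 2.0, 0.05, 40, 2000

sgd = sgd_paths(x0, eta, n_steps, n_paths, rng)
sde = sde_paths(x0, eta, n_steps, n_paths, rng)

gap = abs(mean_of(phi, sgd) - mean_of(phi, sde))
print("E[phi] under SGD:", mean_of(phi, sgd))
print("E[phi] under SDE:", mean_of(phi, sde))
print("weak-error estimate:", gap)
```

The comparison is between expectations of a test function rather than between individual trajectories, which is the sense in which the approximation is weak. The reported gap mixes the modeling error with Monte Carlo and time-discretization error, so it should be read as indicative only.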

Keywords

semigroup, Markov chain, stochastic gradient descent, online principal component analysis, stochastic differential equations

2010 Mathematics Subject Classification

60J20

The work of J.-G. Liu is partially supported by KI-Net NSF RNMS11-07444, NSF DMS-1514826, and NSF DMS-1812573. Y. Feng is supported by NSF DMS-1252912.

Received 3 September 2017

Received revised 29 January 2018

Accepted 29 January 2018