作者:li Eta
链接:http://www.zhihu.com/question/28025036/answer/107297334
来源:知乎
著作权归作者所有,转载请联系作者获得授权。
online算法挺多的。
比如基于gradient descent的online算法,举例如下(不止这么多)
1. Online gradient descent: Logarithmic Regret Algorithms for Online Convex Optimization
2. Dual averaging: Dual Averaging Methods for Regularized Stochastic Learning and Online Optimization
2. FTRL: A Unified View of Regularized Dual Averaging
3. Adagrad: Adaptive Subgradient Methods for Online Learning and Stochastic Optimization
这方面有很多论文,还有一本挺不错的综述:Online Learning and Online Convex Optimization http://www.cs.huji.ac.il/~shais/papers/OLsurvey.pdf
比如基于multi-armed bandit问题的online算法
1. Finite-time Analysis of the Multiarmed Bandit Problem
multi-armed bandit的相关paper实在太多了,这里就列出早期经典的。也有一本对应的综述:Regret Analysis of Stochastic and
Nonstochastic Multi-armed
Bandit Problems http://homes.di.unimi.it/~cesabian/Pubblicazioni/banditSurvey.pdf
以上两类是追求regret bound的算法。也有一些称作的online的算法只是取义于『随时间变化而更新』,而不证明regret bound。