Stochastic optimization for learning over networks
Stochastic optimization methods, e.g., stochastic gradient descent (SGD), have recently found wide application in large-scale data analysis, especially in machine learning. These methods are attractive for processing online streaming data because they scan through the dataset only once yet still generate solutions of acceptable accuracy. However, classical SGD methods are known to be ineffective for processing streaming data distributed over multi-agent network systems (e.g., sensor and social networks), mainly because of the high communication costs they incur.
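To make the single-pass property concrete, here is a minimal sketch of classical centralized SGD on a least-squares problem, in which every streamed sample is used exactly once. It is only an illustration of the baseline described above, not the methods presented in the talk; the names single_pass_sgd and make_stream, the synthetic data, and the constant step size are all assumptions chosen for the example.

```python
import random


def single_pass_sgd(stream, dim, step=0.05):
    """Run one pass of SGD for least squares; each sample (a, b) is seen once."""
    x = [0.0] * dim
    for a, b in stream:
        # Residual of the current iterate on this single sample.
        residual = sum(ai * xi for ai, xi in zip(a, x)) - b
        # The stochastic gradient of 0.5 * (a.x - b)^2 is residual * a.
        x = [xi - step * residual * ai for xi, ai in zip(x, a)]
    return x


def make_stream(x_true, n, noise=0.01, seed=0):
    """Hypothetical data source: yields n noisy linear measurements of x_true."""
    rng = random.Random(seed)
    for _ in range(n):
        a = [rng.gauss(0.0, 1.0) for _ in x_true]
        b = sum(ai * xi for ai, xi in zip(a, x_true)) + rng.gauss(0.0, noise)
        yield a, b


x_true = [1.0, -2.0, 0.5]
x_hat = single_pass_sgd(make_stream(x_true, 5000), dim=len(x_true))
```

In a networked setting, the samples would sit on different nodes, and a naive distributed version of this loop would need a round of inter-node communication at every iteration, which is the cost the abstract refers to.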
In this talk, we present a few new classes of SGD methods that significantly reduce these communication costs for distributed or decentralized machine learning. We show that these methods require substantially fewer inter-node communications when performing SGD iterations. Meanwhile, the total number of stochastic (sub)gradient computations they require is comparable to the optimal bounds achieved by classical centralized SGD-type methods.
This talk is based on the following two papers:
1. G. Lan and Y. Zhou, Random gradient extrapolation for distributed and stochastic optimization, SIAM Journal on Optimization, 28(4), 2753-2782, 2018.
2. G. Lan, S. Lee and Y. Zhou, Communication-efficient algorithms for decentralized and stochastic optimization, Mathematical Programming, to appear, 2018.
December 17th, 2018
14:00 ~ 15:00
Guanghui (George) Lan, H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology
Room 102, School of Information Management & Engineering, Shanghai University of Finance & Economics