Beyond SGD: Data Adaptive Methods for Machine Learning

Kfir Levy - CS-Lecture
Thursday, 13.12.2018, 10:30
Room 337, Taub Building
Institute for Machine Learning at ETH Zurich

The tremendous success of the Machine Learning paradigm relies heavily on the development of powerful optimization methods. The canonical algorithm for training learning models is SGD (Stochastic Gradient Descent), yet this method has its limitations. It is often unable to exploit useful statistical/geometric structure, it might degrade upon encountering prevalent non-convex phenomena, and it is hard to parallelize. In this talk I will discuss an ongoing line of research in which we develop alternative methods that resolve some of SGD's limitations. The methods that I describe are as efficient as SGD, and implicitly adapt to the underlying structure of the problem in a data-dependent manner.

In the first part of the talk, I will discuss a method that is able to take advantage of hard/easy training samples. In the second part, I will discuss a method that enables an efficient parallelization of SGD. Finally, I will briefly describe a method that implicitly adapts to the smoothness and noise properties of the learning objective.

Bio:
======
Kfir Levy is a post-doctoral fellow in the Institute for Machine Learning at ETH Zurich, advised by Prof. Andreas Krause. Kfir's research is focused on Machine Learning and Stochastic Optimization, with a special interest in designing universal methods that apply to a wide class of learning scenarios. He is a recipient of the ETH Zurich Postdoctoral fellowship, as well as the Irwin and Joan Jacobs fellowship for excellence in research. Kfir received his degrees from the Technion - Israel Institute of Technology. He was advised by Prof. Elad Hazan during his Ph.D. and by Prof. Nahum Shimkin during his Master's.
