The Taub Faculty of Computer Science Events and Talks
Niv Giladi (Ph.D. Thesis Seminar)
Tuesday, 04.07.2023, 13:30
Advisor: Prof. Daniel Soudry
The training of Deep Neural Networks (DNNs) continues to scale in size and computational footprint, as a result of a growing number of trainable parameters, wider and deeper models, and increasing amounts of training data. As improvements in model quality outpace hardware capabilities, this scale-up translates into a need for an ever-larger number of training devices working in tandem, making distributed training the standard approach for training DNNs at large scale. This seminar delves into distributed training, exploring current solutions and their implications for improving scalability. First, we will examine asynchronous training from a dynamical stability perspective and derive optimal hyperparameter tuning rules. Then, we will look into scalability challenges in synchronous training and suggest a method to improve its robustness. Finally, we will introduce a paradigm that integrates deep learning with physics simulations to improve the scalability of the latter, leading to a theoretical 4096x acceleration in physics simulation.