Events
Events and lectures at the Henry and Marilyn Taub Faculty of Computer Science
Wednesday, 31.01.2024, 10:30
When training large neural networks, there are typically many solutions that perfectly fit the training data. Nevertheless, gradient-based methods tend to reach solutions that generalize well, namely, that also perform well on test data. Thus, the training algorithm seems to be implicitly biased towards certain networks, which exhibit good generalization performance. Understanding this “implicit bias” has been a subject of extensive research recently. Moreover, contrary to conventional wisdom in machine learning theory, trained networks often generalize well even when perfectly fitting noisy training data (i.e., data with label noise), a phenomenon called “benign overfitting”.
In this talk, I will discuss both phenomena. In the first part, I will discuss the implicit bias and its implications. I will show how the implicit bias can lead to good generalization performance, but can also have negative implications for susceptibility to adversarial examples and privacy attacks. In the second part, I will explore benign overfitting and the settings in which it occurs in neural networks.
Short bio:
Gal is a postdoctoral researcher at TTI-Chicago and the Hebrew University, hosted by Nati Srebro and Amit Daniely as part of the NSF/Simons Collaboration on the Theoretical Foundations of Deep Learning. Prior to that, he was a postdoc at the Weizmann Institute, hosted by Ohad Shamir, and a PhD student at the Hebrew University, advised by Orna Kupferman. His research focuses on theoretical machine learning, with an emphasis on deep-learning theory.