Events

The Taub Faculty of Computer Science Events and Talks

Information and Randomness in Online Learning

Idan Mehalel (Ph.D. Thesis Seminar)

Sunday, 13.10.2024, 10:30

Taub 9

Advisor: Prof. Yuval Filmus and Dr. Shay Moran

Suppose that n forecasting experts are providing daily rain/no-rain predictions, and the best among them is mistaken in at most k many days. For how many days will an optimal learner allowed to observe the predictions mis-predict? This is a fundamental problem in online learning, and other important classification problems can be reduced to it. It was studied in the 90’s by Cesa-Bianchi, Freund, Helmbold, and Warmuth who gave fine-grained bounds for the case where the learner may not use randomness. We study this problem without this restriction, and provide fine-grained bounds.
There are many generalizations of this problem where the learner only receives partial feedback on its predictions. One suggested model is “apple tasting feedback”, in which the learner only knows the true outcome when it predicts a rainy day. Another very common model is “bandit feedback”, in which there are L > 2 possible outcomes, and the learner only receives “correct” or “incorrect” feedback on its predictions. We study these models in various settings and provide nearly-tight mistake bounds.
In this talk, we will discuss some of the techniques used to prove those mistake bounds. The mistake bound for the most classical full-information problem is proved via a reduction of the problem to calculating the (average) depth of binary trees, by using novel complexity measures of the set of experts. The bounds for the bandit feedback problem are proved by using minimax duality to obtain a dual variation of the problem, which is easier to handle.

[Back to the index of events]