Characterizing learnability by a combinatorial dimension is a hallmark of machine learning theory, starting from the fundamental characterization of PAC learnability of binary classification by the VC dimension, through the characterization of online mistake-bound learnability by the Littlestone dimension, all the way to recent papers characterizing multi-class learning and differentially private learning.
However, for some basic learning setups no such characterizations are known.
I will focus on one of these setups: unsupervised learning of probability distributions.
I will start by showing that, without prior assumptions, density estimation cannot have any success guarantees.
Next, I will describe a positive learnability result — settling the question of the sample complexity of learning mixtures of Gaussians.
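(For concreteness, the bound established in that joint work, stated up to logarithmic factors, is that $\tilde{\Theta}(kd^2/\varepsilon^2)$ samples are both necessary and sufficient to learn a mixture of $k$ Gaussians over $\mathbb{R}^d$ to within total variation distance $\varepsilon$.)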
Finally, I will turn to the question of characterizing learnable classes of distributions.
I will show that there can be no combinatorial characterization of the family of learnable classes of distributions, along with similar impossibility results for other learning setups (some of these questions going back to the work of Giora Benedek and Alon Itai). This settles problems that have been open for more than 30 years.
The first part is based on joint work with Hassan Ashtiani, Nick Harvey, Chris Liaw, Abbas Mehrabian, and Yaniv Plan, and the second part on joint work with my student Tosca Lechner.