Tal Wagner (Microsoft Research Redmond)
Room 012, Taub Building (Learning Center Auditorium)
Recently, there has been growing interest in harnessing the power of big datasets and modern machine learning for designing new scalable algorithms. This invites us to rethink the role of data in algorithm design: not just as the input to pre-designed algorithms, but also as a factor that enters the algorithm design process itself, driving it in a strong and possibly automated manner. This talk will show how to leverage data and learning for better algorithm design in two fundamental areas: high-dimensional similarity search and efficient linear algebra.
In particular, I will show the following:
1. Using data-dependent compression, we obtain optimal compressed representations of high-dimensional Euclidean distances. This result marks the first improvement over classical data-oblivious compression schemes, which provably cannot match its performance.
2. Using neural networks, we show how to learn to construct better data structures for high-dimensional similarity search.
3. We then show how these techniques also give rise to fast algorithms for low-rank approximation of matrices.
Our algorithms come with both analytical guarantees and empirical validation, showing improvements over previous methods on standard benchmark datasets.
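To make the contrast behind item 1 concrete, here is a toy sketch (not the speaker's method, and far simpler than the talk's compression schemes): when data happens to have low-dimensional structure, a data-dependent projection (here, plain PCA) preserves pairwise Euclidean distances much better than a data-oblivious random projection of the same target dimension. All names and parameters below are illustrative assumptions.

```python
import numpy as np

# Toy illustration only: data-dependent vs. data-oblivious compression
# of Euclidean distances. The points are generated to lie near a
# k-dimensional subspace of a d-dimensional ambient space, so a PCA
# projection (data-dependent) should beat a Gaussian random projection
# (data-oblivious, Johnson-Lindenstrauss style) at the same dimension k.

rng = np.random.default_rng(0)
n, d, k = 200, 256, 8                            # points, ambient dim, target dim
basis = rng.standard_normal((k, d))
X = rng.standard_normal((n, k)) @ basis          # low-dimensional structure
X += 0.01 * rng.standard_normal((n, d))          # small ambient noise

def dist_error(Y):
    """Mean relative error of pairwise Euclidean distances after compression."""
    orig = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    comp = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)
    mask = orig > 0
    return float(np.mean(np.abs(comp[mask] - orig[mask]) / orig[mask]))

# Data-oblivious: random Gaussian projection to k dimensions.
R = rng.standard_normal((d, k)) / np.sqrt(k)
err_oblivious = dist_error(X @ R)

# Data-dependent: project onto the top-k principal directions of X itself.
# (Centering does not affect pairwise distances.)
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
err_data_dep = dist_error(Xc @ Vt[:k].T)

print(f"random projection distance error: {err_oblivious:.3f}")
print(f"PCA projection distance error:    {err_data_dep:.3f}")
```

The data-dependent projection wins here precisely because it looks at the data; a random projection must hedge against all possible inputs, which is the intuition behind the lower bounds against data-oblivious schemes mentioned above.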