Data Science & Deep Learning: Optimal Euclidean Metric Compression in High Dimensions

דובר:
טל וגנר (MIT)
תאריך:
יום שני, 30.12.2019, 12:30
מקום:
טאוב 301

In the metric compression problem, we are given n points in a metric space, and the goal is to construct a compact representation (sketch)of the points, such that the distance between every pair can be approximate lyre covered from the sketch, up to a small distortion of (1 +/- epsilon). Such sketches are widely used for fast nearest neighbor search in high-dimensional Euclidean spaces. We give a new algorithm for sketching Euclidean metric spaces. It provably achieves the optimal compression bound, improving over the classical dimension reduction theorem of Johnson and Linden Strauss. In particular, while the latter theorem represents each point by log(n)/epsilon^2coordinates (each containing a multi-bit number), we show that log(n)/epsilon^2bits are both sufficient and necessary. Empirically, our algorithm either matches or improves over state-of-the-art heuristics.

Based on joint works with Piotr Indyk and Ilya Razenshteyn.

Tal is a Ph.D. student in the Theory of Computation group at CSAIL, MIT. His advisor is Piotr Indyk. His research is on algorithms for massive datasets. Tal mostly work on efficient compression of graphs and metric spaces, applications to machine learning, and the interplay between theory and practice.

בחזרה לאינדקס האירועים