The Taub Faculty of Computer Science Events and Talks

Communication-efficient Algorithms for Distributed Stream Mining
Moshe Gabel (Ph.D. Thesis Seminar)
Wednesday, 10.05.2017, 13:00
Taub 601
Advisors: Prof. A. Schuster, Prof. D. Keren
Recent years have seen an explosion in the number of connected devices, which means not only growth in the velocity and volume of data, but also that data sources are increasingly geographically distributed, raising the cost of communication. Data mining algorithms often assume that data is centralized or that communication is inexpensive: the setting is implicitly assumed to be a data center. In settings like wireless sensor networks, however, communication costs battery power. Moreover, most work considers only one-shot computation: computing a result once from a fixed data set. Yet data is increasingly dynamic, and many applications need current results over a recent time window. In this talk, we focus on computing approximations over aggregated distributed data streams with reduced communication. Using a safe zone framework developed in our group (also called geometric monitoring), we'll describe three novel distributed approximations for important non-linear functions: variance, least-squares regression, and Shannon's entropy. Our algorithms provide deterministic user-defined error bounds, while avoiding messages unless needed to maintain those bounds. Compared to the centralized solution, our algorithms reduce communication by up to two orders of magnitude on several real data sets, including machine health monitoring, network monitoring with netflows, traffic monitoring, and others.
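
The general idea of staying silent unless a message is needed to maintain an error bound can be illustrated with a much simpler case than those covered in the talk. The sketch below is not the safe zone method itself and does not handle the non-linear functions named above; it assumes a plain distributed mean, where a per-node slack of epsilon is enough to keep the coordinator's estimate within epsilon of the true mean. The names `Node`, `coordinator_estimate`, and `run_demo`, and the sample streams, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A hypothetical stream source that reports only when its value drifts."""
    epsilon: float          # user-defined error bound
    reported: float = 0.0   # last value sent to the coordinator
    messages: int = 0       # number of messages sent so far

    def observe(self, value: float) -> None:
        # Local condition: stay silent while the current value is within
        # epsilon of the last reported value; otherwise send an update.
        if abs(value - self.reported) > self.epsilon:
            self.reported = value
            self.messages += 1


def coordinator_estimate(nodes):
    """Mean of the last reported values.

    Each node's current value differs from its reported value by at most
    epsilon, so this estimate is within epsilon of the true mean.
    """
    return sum(n.reported for n in nodes) / len(nodes)


def run_demo():
    epsilon = 0.5
    nodes = [Node(epsilon) for _ in range(3)]
    # Made-up readings: one list per node, one entry per time step.
    streams = [
        [1.0, 1.1, 1.2, 1.3],
        [2.0, 2.1, 1.9, 2.0],
        [3.0, 3.2, 3.9, 4.0],
    ]
    worst_error = 0.0
    for t in range(4):
        current = [s[t] for s in streams]
        for node, value in zip(nodes, current):
            node.observe(value)
        true_mean = sum(current) / len(current)
        error = abs(coordinator_estimate(nodes) - true_mean)
        worst_error = max(worst_error, error)
    total_messages = sum(n.messages for n in nodes)
    return worst_error, total_messages
```

In this toy run the nodes send 4 messages in total, where forwarding every reading to a central site would take 12, and the estimate never strays from the true mean by more than epsilon. For non-linear functions such as variance or entropy, a per-node slack of this kind no longer guarantees a bound on the global result, which is the gap that the safe zone approach described in the abstract addresses.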