The Taub Faculty of Computer Science Events and Talks

ceClub: Smart Distributed Storage for the Datacenter
Zsolt Istvan (IMDEA Software Institute in Madrid, Spain)
Wednesday, 14.11.2018, 11:30
Electrical Eng. Building 1061
There is a widening gap in the datacenter between data growth and stagnating CPU performance. This gap limits our ability to solve more complex problems and prompts us to revisit both the architecture of servers and the ways we manage and process data. In my work, I aim to narrow this gap using specialization and HW/SW co-design. As a specific example, I will talk about building energy-efficient distributed storage for large-scale data processing applications.

Most modern data-intensive applications are designed to run on tens or hundreds of nodes, often split between "compute" and "storage" layers. This separation increases scalability but also introduces data movement bottlenecks. We show how these bottlenecks can be lifted without increasing the nodes' power consumption by pushing computation down into storage nodes built using specialized hardware (FPGAs). Our system, Caribou, implements data management, data processing, and replication for fault tolerance in a micro-server footprint. It delivers network-bound throughput and, even though it is based on specialized hardware, it can be efficiently shared by multiple tenants.

Caribou is open-source and acts as a platform for exploring ideas in two areas: distributed data management using specialized hardware, and near-data processing in emerging data science workloads.

Presenter bio:
Zsolt Istvan is an Assistant Research Professor at the IMDEA Software Institute in Madrid, Spain. Before that, he earned his PhD in the Systems Group at ETH Zurich, Switzerland, working with FPGAs and distributed storage. In his research, he explores ideas around specialization as a way of lifting bottlenecks in distributed systems and databases.