Skip to content (access key 's')
Logo of Technion
Logo of CS Department
Logo of CS4People

The Taub Faculty of Computer Science Events and Talks

ceClub:Dynamic Reconfiguration of Primary/Backup Clusters
event speaker icon
Alex Shraer (Yahoo! Research)
event date icon
Wednesday, 16.05.2012, 11:30
event location icon
Room 337-8 Taub Bld.
Dynamically changing (reconfiguring) the membership of a replicated distributed system while preserving data consistency and system availability is a challenging problem. In this talk I will discuss this problem in the context of Primary/Backup clusters and Apache Zookeeper. Zookeeper is an open source system which enables highly reliable distributed coordination. It is widely used in industry, for example in Yahoo!, Facebook,Twitter, VMWare, Box, Cloudera, Mapr, UBS, Goldman Sachs, Nicira, Netflix and many others. A common use-case of Zookeeper is to dynamically maintain membership and other configuration metadata for its users. Zookeeper itself is a replicated distributed system. Unfortunately, the membership and all other configuration parameters of Zookeeper are static - they're loaded during boot and cannot be altered. Operators resort to ''rolling restart'' - a manually intensive and error-prone method of changing the configuration that has caused data loss and inconsistency in production. Automatic reconfiguration functionality has been requested by operators since 2008. Several previous proposals were found incorrect and rejected. We designed and implemented a new reconfiguration protocol in Zookeeper and are currently integrating it into the codebase. It fully automates configuration changes: the set of Zookeeper servers, their roles, addresses, etc. can be changed dynamically, without service interruption and while maintaining data consistency. By leveraging the properties already provided by Zookeeper our protocol is considerably simpler than state of the art in reconfiguration protocols. Our protocol also encompasses the clients -- clients are rebalanced across servers in the new configuration, while keeping the extent of migration proportional to the change in membership.

This is a joint work with Benjamin Reed (Yahoo!), Dahlia Malkhi (MSR) and Flavio Junqueira (Yahoo!). A paper describing this work will appear in the 2012 Usenix ATC conference and in the 2012 Hadoop Summit.