RAID-P: Redundancy and Intra-disk Parity

Speaker:
Eitan Rosenfeld, M.Sc. Thesis Seminar
Date:
Wednesday, 19.11.2014, 15:00
Place:
Taub 601
Advisors:
Dan Tsafrir, Michael Factor

Contemporary storage systems use redundancy (typically either three-way replication or erasure coding) to reduce the risk of permanent data loss due to simultaneous disk failures. Replication greatly reduces usable disk space, thus increasing costs. Erasure coding adds complexity, is not commonly used for mutable data in a distributed setting, and requires high network bandwidth to recover from a failed device. We propose to alleviate these problems with RAID-P, a storage system that maintains only two replicas and utilizes per-disk "add-ons": simple, independent hardware devices that sit between the host and the storage device. The add-ons store local redundancy data to increase the failure tolerance of the system without using the network. RAID-P significantly reduces the risk of data loss due to temporally adjacent disk failures by quickly copying at-risk data from disks to their add-ons. RAID-P can also eliminate this risk entirely by maintaining local parity information for each disk on its add-on, so that each add-on holds the parity of its own disk's data chunks but resides in an independent failure domain. RAID-P may open the door for cloud providers to reduce the number of data replicas they use from three to two.
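
The local parity idea can be illustrated with a minimal sketch. It assumes the parity is a bytewise XOR over equally sized chunks, which the abstract does not specify. The function names `xor_parity` and `recover_chunk` are hypothetical, and the sketch does not model where the surviving chunks are read from.

```python
def xor_parity(chunks):
    """Return the bytewise XOR of a list of equally sized chunks.

    Assumption: XOR parity; the abstract only says "parity".
    """
    if not chunks:
        raise ValueError("at least one chunk is required")
    size = len(chunks[0])
    if any(len(chunk) != size for chunk in chunks):
        raise ValueError("all chunks must have the same length")
    parity = bytearray(size)
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            parity[i] ^= byte
    return bytes(parity)


def recover_chunk(surviving_chunks, parity):
    """Rebuild the single missing chunk from the other chunks and the parity.

    XOR-ing the parity with every surviving chunk cancels them out,
    leaving only the missing chunk.
    """
    return xor_parity(list(surviving_chunks) + [parity])


# Hypothetical disk holding three data chunks; its add-on stores the parity.
disk_chunks = [b"abcd", b"efgh", b"ijkl"]
addon_parity = xor_parity(disk_chunks)

# Suppose the second chunk is no longer available from any replica.
rebuilt = recover_chunk([disk_chunks[0], disk_chunks[2]], addon_parity)
print(rebuilt == disk_chunks[1])
```

Under this assumption a single parity block can rebuild at most one missing chunk per parity group, so the other chunks must still be readable from somewhere.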
