The Taub Faculty of Computer Science Events and Talks

Fault-Tolerant Operating System for Many-Core Processors
Speaker: Amit Fuchs (M.Sc. Thesis Seminar)
Date: Wednesday, 06.12.2017, 10:30
Location: Taub 601
Advisor: Prof. A. Mendelson
This seminar presents a fault-tolerant distributed operating system designed to harness the massive parallelism of many-core (1,000-10,000+ cores) distributed shared memory processors. To scale efficiently and reliably as core counts rapidly increase while their reliability decreases, the new operating system provides fault-tolerant task-level parallelism using coarse-grained data-flow principles. By combining message passing and shared memory, a wait-free decentralized execution engine was created that allows applications to implicitly utilize all cores of future exascale systems-on-chip. The system allows programs to remain oblivious to faults without requiring explicit synchronization or strong consistency guarantees over the shared memory. A prototype implementation of the new operating system was experimentally evaluated on a many-core full-system simulator; the presented results illustrate the characteristics and benefits of the new design.