The True Difference Between Emulation and Paravirtualization of High-Throughput I/O Devices

Speaker: Arthur Kiyanovski, M.Sc. Thesis Seminar
Date: Wednesday, 28.6.2017, 13:00
Place: Taub 701
Advisor: Prof. Dan Tsafrir

Machine virtualization has grown in popularity in recent years with the growth of cloud computing. Virtual machines use virtual I/O devices to perform their I/O. Today, paravirtual I/O devices are the most popular type of virtual I/O device because of their high performance and interposition capabilities. However, paravirtual I/O devices also have disadvantages: users need to install device drivers for paravirtual devices whenever they switch hypervisors, and hypervisor providers need to implement device drivers for all operating systems. Emulated I/O devices also allow interposition, and they do not have these disadvantages, because they are designed to work with the device drivers of the physical devices they emulate. These device drivers come preinstalled in all major operating systems, which makes switching hypervisors much easier for users. And since the device drivers have already been written for the physical devices being emulated, hypervisor providers need not implement device drivers for emulated devices. However, emulated I/O devices achieve substantially lower performance than paravirtual ones, which makes them unusable in many real-world scenarios. Previous work states that the main reason for the performance difference between paravirtual and emulated I/O devices is the larger number of exits caused by the latter. To test this claim, we created a model that estimates the maximum throughput achievable by QEMU’s emulated e1000 NIC by taking the throughput of the paravirtual virtio-net NIC and adding the overhead of e1000’s extra exits. This model predicts a throughput difference of only 1.13X in favor of virtio-net, which is very different from the 20X throughput difference observed in practice. This result led us to search for reasons other than exits that could explain the difference.
In this work we present differences between QEMU’s virtio-net and e1000, other than exits, that we found to contribute to the throughput gap between the two. For each difference we propose an improvement to e1000, inspired by virtio-net’s implementation. We then use the sidecore paradigm to eliminate some of the exits caused by e1000 and further improve its throughput. We were able to reduce the throughput gap between virtio-net and e1000 to 1.2X when the guest runs on a single core, and to 1.25X on a dual core with our sidecore. Our results show that emulated I/O devices can achieve performance close to that of paravirtual ones, which might make emulated devices the better choice when flexibility is more important than peak performance.
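
The exit-overhead model mentioned in the abstract can be illustrated with a minimal sketch. It is not the model from the thesis: it assumes the simplest possible form, in which the emulated device spends the same time per packet as the paravirtual one plus a fixed cost for each extra exit. The function names, parameters, and example numbers are all hypothetical placeholders, not measurements from this work.

```python
def predicted_throughput(base_pps, extra_exits_per_packet, exit_cost_s):
    """Estimate emulated-device throughput in packets per second.

    Assumption (hypothetical, for illustration only): the emulated device
    needs the paravirtual per-packet time plus a fixed cost per extra exit.
    """
    base_time_per_packet = 1.0 / base_pps
    total_time_per_packet = base_time_per_packet + extra_exits_per_packet * exit_cost_s
    return 1.0 / total_time_per_packet


def predicted_gap(base_pps, extra_exits_per_packet, exit_cost_s):
    """Ratio of paravirtual throughput to predicted emulated throughput."""
    return base_pps / predicted_throughput(base_pps, extra_exits_per_packet, exit_cost_s)


if __name__ == "__main__":
    # Placeholder inputs, chosen only to exercise the functions.
    gap = predicted_gap(base_pps=1_000_000.0,
                        extra_exits_per_packet=0.1,
                        exit_cost_s=2e-6)
    print(f"predicted throughput gap: {gap:.2f}X")
```

Under a model of this shape, a measured gap far larger than the predicted one indicates that exits alone cannot account for the difference, which is the reasoning the abstract follows.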
