ceClub: From Algorithms to Architectures – Why Branch Predictors Are Not Great for Irregular Algorithms

דובר:
עודד גרין (ג'ורג'יה)
תאריך:
יום חמישי, 9.6.2016, 11:30
מקום:
חדר 1061, בניין מאייר, הפקולטה להנדסת חשמל

Irregular algorithms, such as sorting and graph algorithms, are prevalent in computer science applications. Until recently, it was assumed that the irregular algorithms also greatly benefited from the the Out-of-Order Execution engines found in most modern day CPU processors. Out-of-Order execution, using branch predictors, speculates which branch in the execution-flow should be taken and starts that execution prematurely - allowing for improved performance. This approach typically works great for regular algorithms such as matrix multiplication or linear algebra based operations where the computation is not data-driven. However, when the speculation is wrong, there is a hardware overhead called the branch-misprediction penalty, which resets the computation back to the point of speculation. These penalties can be costly, as much as an L1 cache miss.

In my talk, I will go over several simple irregular algorithms, including the Merge Path algorithm developed at the Technion, and I will show that not only do branch predictors not help improve the performance of these algorithms, they can in fact reduce performance due to the branch-misprediction penalty and low branch-prediction accuracy. As a reference point, branch prediction accuracy for regular algorithms is typically 95%-99% whereas for the branch-prediction accuracy for the data dependent branches can be as low as 50%. This led us to develop the branch-avoiding model where we replace all data-dependent (and hard to predict) branches with a set of equivalent instructions that do not use branches. For ARM processors, we can replace the branches and their respective clauses with conditional instructions. Our branch-avoiding implementations can improve performance, reduce branch-mispredictions, and increase instruction level parallelism. Our study covers a wide range of CPU processors, including several different x86 micro-architectures as well the ARM processor.

This talk will be self-contained.

Bio:
Oded Green is a research scientist at Georgia Tech in the School of Computational Science and Engineering, where he also received is PhD. Oded received his MSc in electrical engineering and his BSc from in computer engineering, both from the Technion. Prior to his return to Georgia Tech, Oded spent some time working as the Chief Operating Officer for ArrayFire where he was responsible for managing the companies daily activities. Oded's research primarily focuses on improving the performance and scalability of large-scale data analytics using a wide range of high performance computation platforms. This also includes designing algorithms and data structure for dynamic graph problems. In his spare time Oded is also interested in improving the performance of other parallel algorithms such as those found in the OpenCV library.

בחזרה לאינדקס האירועים