ceClub: Making Triangle Counting Fast - Introducing Logarithmic Radix Binning & Vectorized Tri-Counting

Oded Green (Georgia Institute of Technology)
Wednesday, 13.6.2018, 11:30
EE Meyer Building 861

Triangle counting is a scalable analytic that benefits from a large number of processors. Similar to many other graph analytics, the control flow for each thread of triangle counting algorithms is determined by the input graph. This data dependency makes both load-balancing and compiler optimization much more challenging. Vectorization of such codes is even more complex. The recent branch-avoiding model shows how to remove such control flow restrictions by replacing branches with an equivalent set of arithmetic operations. In many cases the cost for implementations enabling vectorization increases the number of operations. In my talk, I will present two main contributions: 1) Logarithmic Radix Binning (LRB) and 2) a vectorized triangle counting algorithm. LRB is responsible for two levels of load-balancing: the thread level and the vector level such that each data lane executes a near equal amount of work. We implemented and evaluated our algorithm on several different platforms including Intel's AVX-512 (testing on KNL) and for NVIDIA GPUs. Our new algorithm outperforms several recent highly optimized triangle counting algorithms by anywhere from 1.5X and up to 11X. We outperform several champion algorithms from the 2017 HPEC Graph Challenge such as the KOKKOS framework and GPU based algorithms. Further, we show near linear speedups for several different versions of our algorithms, each of which uses a different set of optimizations.

Oded Green is a currently a research scientist at the Georgia Institute of Technology (Georgia Tech) in Computational Sciences and Engineering, where he also received his PhD. Oded will join NVIDIA's AI infrastructure group later this summer. Oded received both his MSc in electrical engineering and his BSc in computer engineering from Technion – Israel Institute of Technology. Oded's research primarily focuses on improving the performance and scalability of large-scale graph analytic, with an emphasis on dynamic graph, using a wide range of high-performance computational platforms. Oded also focuses on architecture-algorithm codesign.

Back to the index of events