Authors: Kartik Hegde, Hadi Asghari-Moghaddam, Michael Pellauer, Neal Crago, Aamer Jaleel
Keywords:
Description: Generalized tensor algebra is a prime candidate for acceleration via customized ASICs. Modern tensors feature a wide range of data sparsity, with the density of non-zero elements ranging from 10^-6% to 50%. This paper proposes a novel approach to accelerate tensor kernels based on the principle of hierarchical elimination of computation in the presence of sparsity. This approach relies on rapidly finding intersections (situations where both operands of a multiplication are non-zero), enabling new data fetching mechanisms and avoiding the memory latency overheads associated with sparse kernels implemented in software. We propose the ExTensor accelerator, which builds these novel ideas on handling sparsity into hardware to enable better bandwidth utilization and compute throughput. We evaluate ExTensor on several kernels relative to industry libraries (Intel MKL) and state-of-the-art tensor algebra compilers (TACO). When bandwidth normalized, we demonstrate an average speedup of 3.4×, 1.3×, 2.8×, 24.9×, and 2.7× on SpMSpM, SpMM, TTV, TTM, and SDDMM kernels respectively over a server class CPU.
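The intersection idea in the description can be illustrated in software with a small sketch. This is not ExTensor's hardware mechanism; it is a plain two-pointer merge over sorted coordinate lists, assuming each sparse vector is stored as parallel lists of coordinates and values. The function name `intersect_dot` and its parameters are hypothetical, chosen only for this illustration.

```python
def intersect_dot(a_coords, a_vals, b_coords, b_vals):
    """Dot product of two sparse vectors given as sorted coordinate/value lists.

    Only coordinates present in BOTH operands trigger a multiplication;
    every other position is skipped without touching its value.
    Returns (dot_product, number_of_multiplications).
    """
    i = j = 0
    total = 0.0
    multiplies = 0
    while i < len(a_coords) and j < len(b_coords):
        if a_coords[i] == b_coords[j]:
            # Intersection: both operands are non-zero at this coordinate.
            total += a_vals[i] * b_vals[j]
            multiplies += 1
            i += 1
            j += 1
        elif a_coords[i] < b_coords[j]:
            i += 1  # a has a non-zero where b is zero: no work needed
        else:
            j += 1  # b has a non-zero where a is zero: no work needed
    return total, multiplies


# Illustrative inputs: the operands share coordinates 3 and 9 only.
result, work = intersect_dot([0, 3, 7, 9], [1.0, 2.0, 3.0, 4.0],
                             [3, 4, 9], [10.0, 20.0, 30.0])
print(result, work)
```

In this example only two of the seven stored non-zeros pairs line up, so only two multiplications are performed. This is the kind of work elimination the description refers to, shown here at a single level rather than hierarchically.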