|Title||An I/O Bandwidth-Sensitive Sparse Matrix-Vector Multiplication Engine on FPGAs|
|Publication Type||Journal Articles|
|Authors||Sun, S., M. Monga, P. Jones, and J. Zambreno|
|Journal||IEEE Transactions on Circuits and Systems-I (TCAS-I)|
Sparse Matrix-Vector Multiplication (SMVM) is a fundamental core of many high-performance computing applications, including information retrieval, medical imaging, and economic modeling. While the use of reconfigurable computing technology in a high-performance computing environment has shown recent promise in accelerating a wide variety of scientific applications, existing SMVM architectures on FPGA hardware have been limited in that they require either numerous pipeline stalls during computation (due to zero padding) or excessive input preprocessing during run-time. For large-scale sparse matrix scenarios, both of these shortcomings can result in unacceptable performance overheads, limiting the overall value of using FPGAs in a high-performance computing environment. In this paper, we present a scalable and efficient FPGA-based SMVM architecture which can handle arbitrary matrix sparsity patterns without excessive preprocessing or zero padding and can be dynamically expanded based on the available I/O bandwidth. Our experimental results using a commercial FPGA-based acceleration system demonstrate that our reconfigurable SMVM engine is highly efficient, with benchmark-dependent speedups over an optimized software implementation that range from 3.5x to 6.5x in terms of computation time.
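For readers unfamiliar with the kernel, the following is a minimal software sketch of SMVM and of what zero padding costs. It is not the architecture described in the paper: the abstract does not name a storage format, so the common compressed sparse row (CSR) layout is assumed here, and the function names `dense_to_csr`, `csr_matvec`, and `padded_slots` are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def dense_to_csr(
    dense: Sequence[Sequence[float]],
) -> Tuple[List[float], List[int], List[int]]:
    """Convert a dense matrix to CSR arrays (assumed format, see lead-in)."""
    values: List[float] = []   # nonzero entries, row by row
    col_idx: List[int] = []    # column index of each nonzero
    row_ptr: List[int] = [0]   # row i occupies values[row_ptr[i]:row_ptr[i+1]]
    for row in dense:
        for j, a in enumerate(row):
            if a != 0:
                values.append(a)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matvec(
    values: Sequence[float],
    col_idx: Sequence[int],
    row_ptr: Sequence[int],
    x: Sequence[float],
) -> List[float]:
    """Compute y = A @ x, touching only the stored nonzeros of A."""
    y: List[float] = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def padded_slots(row_ptr: Sequence[int]) -> int:
    """Count the zeros needed to pad every row to the longest row's length.

    This models a fixed-width layout in which each padded slot is a wasted
    multiply-accumulate, one simple way to picture the padding overhead.
    """
    lengths = [row_ptr[i + 1] - row_ptr[i] for i in range(len(row_ptr) - 1)]
    width = max(lengths, default=0)
    return sum(width - n for n in lengths)
```

In this sketch, a matrix whose rows hold very different numbers of nonzeros yields a large `padded_slots` count relative to the number of stored values, which is the kind of irregular sparsity pattern where padding-based designs lose efficiency.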