4 October 2023
Accelerating sparse convolutional neural networks with systolic arrays on FPGA
Hemkant Nehete, Gaurav Verma, Shailendra Yadav, Brajesh Kumar Kaushik
Conference Poster
Abstract
Convolutional Neural Networks (CNNs) are widely used in applications such as speech recognition, image recognition, and natural language processing. However, their computational complexity makes deployment on resource-limited edge devices a significant challenge. Sparse CNNs exploit the sparsity of the network's weight matrices to reduce computation while maintaining accuracy. The Compressed Sparse Row (CSR) format compresses a sparse matrix by storing only its nonzero values along with their column indices and row pointers, which lowers the memory requirement and computational complexity of the network. This work presents a novel approach for accelerating sparse CNNs on Field-Programmable Gate Arrays (FPGAs) using the CSR format and systolic arrays. The proposed method uses the parallel processing capabilities of systolic arrays to perform CSR-based sparse convolutions. In addition, an algorithm is presented that optimizes the data layout to maximize data reuse and minimize data movement between the processing elements of the systolic array and external memory. The architecture is evaluated against a state-of-the-art GPU implementation on several benchmark datasets, and it outperforms the GPU-based implementation in throughput and power efficiency by 1.42x and 22.4x, respectively. The presented approach is a promising solution for accelerating sparse CNNs on resource-constrained devices and for deploying these networks in a variety of applications.
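To make the CSR idea in the abstract concrete, the following is a minimal software sketch, not the authors' FPGA design. It encodes a small weight matrix in CSR form and computes a matrix-vector product that visits only the nonzero weights, which is the kind of skipped multiply-accumulate work a sparse accelerator relies on. The function names `dense_to_csr` and `csr_matvec` and the example weights are illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence, Tuple


def dense_to_csr(
    matrix: Sequence[Sequence[float]],
) -> Tuple[List[float], List[int], List[int]]:
    """Encode a dense matrix as CSR: (values, col_indices, row_ptr).

    Hypothetical helper for illustration only.
    """
    values: List[float] = []
    col_indices: List[int] = []
    row_ptr: List[int] = [0]
    for row in matrix:
        for col, weight in enumerate(row):
            if weight != 0:
                values.append(weight)      # keep only nonzero weights
                col_indices.append(col)    # remember each weight's column
        row_ptr.append(len(values))        # end offset of this row
    return values, col_indices, row_ptr


def csr_matvec(
    values: Sequence[float],
    col_indices: Sequence[int],
    row_ptr: Sequence[int],
    x: Sequence[float],
) -> List[float]:
    """Multiply a CSR matrix by a dense vector, touching nonzeros only."""
    out: List[float] = []
    for r in range(len(row_ptr) - 1):
        acc = 0.0
        # Entries row_ptr[r] .. row_ptr[r + 1] - 1 belong to row r.
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[col_indices[k]]
        out.append(acc)
    return out


# Assumed example: a 3x4 weight matrix with 3 nonzeros out of 12 entries.
weights = [
    [0, 2, 0, 0],
    [1, 0, 0, 3],
    [0, 0, 0, 0],
]
values, col_indices, row_ptr = dense_to_csr(weights)
result = csr_matvec(values, col_indices, row_ptr, [1, 2, 3, 4])
```

In this example only three multiply-accumulate operations are performed instead of twelve. A convolution can be lowered to the same operation by flattening each filter into a matrix row and each input patch into a vector; how the paper maps that work onto systolic-array processing elements is not described in the abstract and is not modeled here.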
(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Hemkant Nehete, Gaurav Verma, Shailendra Yadav, and Brajesh Kumar Kaushik "Accelerating sparse convolutional neural networks with systolic arrays on FPGA", Proc. SPIE 12675, Applications of Machine Learning 2023, 126750Y (4 October 2023); https://doi.org/10.1117/12.2676783
KEYWORDS
Matrices
Field programmable gate arrays
Computer hardware
Convolutional neural networks
Computer architecture
Convolution
Quantization