Accelerating DNN Models via Hierarchical N:M Sparsity 


Vol. 51, No. 7, pp. 583-591, Jul. 2024
10.5626/JOK.2024.51.7.583



  Abstract

N:M sparsity pruning is an effective approach for compressing deep neural networks by leveraging NVIDIA’s Sparse Tensor Core technology. Despite its effectiveness, the technique is constrained by hardware limitations: it supports only fixed compression ratios, incurs accesses to unnecessary input data, and does not adequately address the imbalanced distribution of essential parameters. This paper proposes Hierarchical N:M (HiNM) sparsity, in which vector sparsity is applied prior to N:M sparsity to enable various levels of sparsity. We also introduce a novel permutation technique tailored for HiNM sparsity, named 2-axis channel permutation (2CP). Experimental results show that HiNM sparsity achieves a compression ratio twice that of conventional N:M sparsity while reducing latency by an average of 37%.
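To make the two-stage idea concrete, below is a minimal NumPy sketch of a HiNM-style pruning pipeline: vector-level sparsity followed by N:M sparsity. Everything here is an illustrative assumption rather than the paper's actual algorithm: the vector length v, the keep ratio, the vector orientation, and both helper functions (vector_prune, nm_prune) are hypothetical, and the 2CP permutation step that rebalances essential parameters across channels before pruning is omitted.

import numpy as np

def vector_prune(w, v=8, keep_ratio=0.5):
    """Stage 1 (hypothetical): vector-level sparsity.
    Splits each column into length-v vectors and zeroes the
    lowest-magnitude vectors by L2 norm, keeping keep_ratio of them."""
    rows, cols = w.shape
    assert rows % v == 0
    vecs = w.reshape(rows // v, v, cols)           # vecs[i, :, j] is one vector
    norms = np.linalg.norm(vecs, axis=1)           # (rows // v, cols)
    k = max(1, int(norms.size * keep_ratio))
    thresh = np.partition(norms.ravel(), -k)[-k]   # k-th largest vector norm
    mask = (norms >= thresh)[:, None, :]           # broadcast over vector axis
    return (vecs * mask).reshape(rows, cols)

def nm_prune(w, n=2, m=4):
    """Stage 2: N:M sparsity -- keep the n largest-magnitude weights in
    every contiguous group of m along each row, the pattern NVIDIA's
    Sparse Tensor Cores accelerate for 2:4."""
    rows, cols = w.shape
    assert cols % m == 0
    groups = w.reshape(rows, cols // m, m)
    drop = np.argsort(np.abs(groups), axis=-1)[..., : m - n]  # smallest m-n
    mask = np.ones_like(groups, dtype=bool)
    np.put_along_axis(mask, drop, False, axis=-1)
    return (groups * mask).reshape(rows, cols)

# HiNM-style pipeline: vector sparsity first, then N:M on the survivors.
w = np.random.randn(16, 16).astype(np.float32)
w_hinm = nm_prune(vector_prune(w), n=2, m=4)

In a deployable layout, the vectors zeroed in stage 1 would presumably be compacted away before applying N:M, so the overall density becomes keep_ratio × N/M (0.5 × 2/4 = 25% here) versus 50% for plain 2:4, consistent with the doubled compression ratio reported in the abstract; the sketch skips the compaction step for brevity.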




  Cite this article

[IEEE Style]

S. Yu, H. Lee, D. Shin, "Accelerating DNN Models via Hierarchical N:M Sparsity," Journal of KIISE, JOK, vol. 51, no. 7, pp. 583-591, 2024. DOI: 10.5626/JOK.2024.51.7.583.


[ACM Style]

Seungmin Yu, Hayun Lee, and Dongkun Shin. 2024. Accelerating DNN Models via Hierarchical N:M Sparsity. Journal of KIISE, JOK, 51, 7, (2024), 583-591. DOI: 10.5626/JOK.2024.51.7.583.


[KCI Style]

Seungmin Yu, Hayun Lee, Dongkun Shin, "Accelerating DNN Models via Hierarchical N:M Sparsity," Journal of KIISE, vol. 51, no. 7, pp. 583-591, 2024. DOI: 10.5626/JOK.2024.51.7.583.




