Efficient Memory Management Techniques for LLM Inference in Mobile System 


Vol. 52, No. 8, pp. 637-643, Aug. 2025
10.5626/JOK.2025.52.8.637



  Abstract

On-device LLMs have gained attention because cloud-based LLMs raise privacy and network latency concerns. However, the memory management policies of mobile operating systems handle memory inefficiently during LLM inference. In this paper, we propose two techniques: Initial KV Cache Swap, which leverages zRAM for the preallocated KV cache, and Deferred Weight Reclamation, which reduces storage I/O by deferring weight eviction. Together, they improve LLM inference performance. Our approach reduces memory usage by up to 27% compared to the default Linux kernel, which benefits LLM inference in memory-constrained mobile environments. Moreover, the memory savings grow as the number of candidate paths increases in inference techniques such as speculative decoding, demonstrating that the approach supports diverse LLM decoding techniques on mobile devices.
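To give a rough sense of why a preallocated KV cache matters on a memory-constrained device, the following minimal sketch estimates its size with the standard per-token formula (keys plus values, per layer, per KV head). It is not the paper's method. The model dimensions, the function names `kv_cache_bytes` and `untouched_fraction`, and the simplification that every candidate path holds its own full cache are all assumptions made for illustration.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, num_tokens,
                   bytes_per_elem=2, num_paths=1):
    """Bytes needed to store keys and values for num_tokens tokens.

    Assumption: each candidate path keeps its own full cache, which is a
    simplification; real decoders may share a common prefix across paths.
    """
    # Factor 2 accounts for storing both a key and a value per token.
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem
    return per_token * num_tokens * num_paths


def untouched_fraction(max_tokens, used_tokens):
    """Fraction of a preallocated cache that holds no token data yet."""
    return 1.0 - used_tokens / max_tokens


if __name__ == "__main__":
    MIB = 1024 ** 2
    # Hypothetical model: 32 layers, 8 KV heads, head dimension 128, FP16.
    full = kv_cache_bytes(32, 8, 128, num_tokens=4096)
    four_paths = kv_cache_bytes(32, 8, 128, num_tokens=4096, num_paths=4)
    print(f"preallocated, 1 path:  {full // MIB} MiB")
    print(f"preallocated, 4 paths: {four_paths // MIB} MiB")
    print(f"untouched after 256 tokens: {untouched_fraction(4096, 256):.2%}")
```

Under these assumed dimensions, most of the preallocated region holds no token data early in generation, and the preallocated size scales linearly with the number of candidate paths.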




  Cite this article

[IEEE Style]

H. Shim, S. Ko, W. Doh, J. H. Ahn, "Efficient Memory Management Techniques for LLM Inference in Mobile System," Journal of KIISE, JOK, vol. 52, no. 8, pp. 637-643, 2025. DOI: 10.5626/JOK.2025.52.8.637.


[ACM Style]

Hyunjeong Shim, Seoyoung Ko, Wanju Doh, and Jung Ho Ahn. 2025. Efficient Memory Management Techniques for LLM Inference in Mobile System. Journal of KIISE, JOK, 52, 8, (2025), 637-643. DOI: 10.5626/JOK.2025.52.8.637.


[KCI Style]

Hyunjeong Shim, Seoyoung Ko, Wanju Doh, Jung Ho Ahn, "A Study on Memory Management Techniques for Effective LLM Inference in Mobile Environments," Journal of KIISE, Vol. 52, No. 8, pp. 637-643, 2025. DOI: 10.5626/JOK.2025.52.8.637.






Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal
