
A Comparative Study on Server Allocation Optimization Algorithms for Accelerating Parallel Training of Large Language Models

Jinkyu Yim, Yerim Choi, Jinho Lee

http://doi.org/10.5626/JOK.2024.51.9.783

As large language models (LLMs) are increasingly utilized across various fields, demand is growing for models with even higher performance. Training such models requires significant computational power and memory capacity, so researchers use 3D parallelization to train large language models across many GPU-equipped servers. However, 3D parallelization requires frequent large-scale data transfers between servers, and this communication becomes a bottleneck in the overall training time. To address this, prior studies have proposed methods that profile the non-uniform network conditions of a cluster in advance and map servers and GPUs to an optimized parallel configuration. Existing methods of this type rely on the classical optimization algorithm simulated annealing (SA) for the mapping. In this paper, we apply genetic algorithms as well as SAT (satisfiability) solvers to the same problem, and compare and analyze the performance of each algorithm under various experimental environments.
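To illustrate the kind of classical mapping approach the abstract refers to, the following is a minimal sketch (not the paper's implementation) of simulated annealing applied to a toy server-placement problem: stages that exchange the most data are placed on the best-connected server pairs. The bandwidth matrix, traffic volumes, cost model, and annealing parameters below are all illustrative assumptions.

```python
# Minimal sketch, assuming a simple pipeline-style cost model: simulated
# annealing searches over permutations that assign parallel-training stages
# to servers so as to minimize estimated communication time.
import math
import random

# Hypothetical measured inter-server bandwidth (GB/s); higher is better.
bandwidth = [
    [0, 40, 10, 10],
    [40, 0, 10, 10],
    [10, 10, 0, 40],
    [10, 10, 40, 0],
]
# Hypothetical traffic (GB per step) between consecutive stages i and i+1.
traffic = [5.0, 5.0, 5.0]

def comm_time(order):
    """Estimated per-step communication time if stage i runs on server order[i]."""
    return sum(traffic[i] / bandwidth[order[i]][order[i + 1]]
               for i in range(len(traffic)))

def simulated_annealing(n_servers, steps=5000, t0=1.0, cooling=0.999):
    order = list(range(n_servers))
    random.shuffle(order)
    best, best_cost = order[:], comm_time(order)
    cost, t = best_cost, t0
    for _ in range(steps):
        i, j = random.sample(range(n_servers), 2)
        cand = order[:]
        cand[i], cand[j] = cand[j], cand[i]  # swap two stage-to-server slots
        cand_cost = comm_time(cand)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if cand_cost < cost or random.random() < math.exp((cost - cand_cost) / t):
            order, cost = cand, cand_cost
            if cost < best_cost:
                best, best_cost = order[:], cost
        t *= cooling  # geometric cooling schedule
    return best, best_cost

if __name__ == "__main__":
    mapping, est = simulated_annealing(len(bandwidth))
    print("stage -> server mapping:", mapping, "estimated comm time:", round(est, 3))
```

The alternatives the paper studies (genetic algorithms and SAT solvers) would replace the swap-and-accept search loop above while keeping the same objective of minimizing inter-server communication.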


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal
