A Data Imbalance Minimization Strategy for Scalable Deep Learning Training
Sanha Maeng, Euhyun Moon, Sungyong Park
http://doi.org/10.5626/JOK.2023.50.10.836
As deep neural network training is compute-intensive and takes a very long time, distributed training on clusters with multiple graphics processing units (GPUs) has been widely adopted. Distributed training of deep neural networks is severely slowed by stragglers, i.e., the slowest workers, and previous studies have proposed solutions to this straggler problem. However, existing approaches assume that all data samples, such as images, have a constant size; they do not recognize the data imbalance caused by samples of varying sizes, such as video and audio, while solving the straggler problem. In this paper, we propose a data imbalance minimization (DIM) strategy that minimizes data imbalance to solve the straggler problem caused by variable-size data samples. Our evaluation on eight NVIDIA Tesla T4 GPUs shows that DIM achieves up to a 1.77x speedup over state-of-the-art systems with comparable scalability.
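The core idea the abstract describes, balancing variable-size samples across workers so that no single worker becomes a straggler, can be illustrated with a short sketch. The code below is not the authors' DIM implementation; it assumes a simple cost model in which a sample's size (e.g., video length) is a proxy for its compute time, and the function name assign_balanced is hypothetical. It uses greedy longest-processing-time partitioning to keep per-worker load roughly even.

```python
# A minimal sketch of size-aware sample assignment (not the paper's DIM
# algorithm): greedily place the largest remaining sample on the currently
# least-loaded worker so per-worker workload stays balanced.
import heapq

def assign_balanced(sample_sizes, num_workers):
    """Return one list of sample indices per worker, with roughly equal
    total size (greedy longest-processing-time partitioning)."""
    # Min-heap of (current_load, worker_id); the least-loaded worker is on top.
    heap = [(0, w) for w in range(num_workers)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_workers)]
    # Assign the largest samples first so later, smaller samples
    # can even out the remaining load differences.
    for idx in sorted(range(len(sample_sizes)),
                      key=lambda i: -sample_sizes[i]):
        load, w = heapq.heappop(heap)
        assignment[w].append(idx)
        heapq.heappush(heap, (load + sample_sizes[idx], w))
    return assignment

# Example: 8 variable-size samples (e.g., video lengths) across 4 workers.
sizes = [120, 30, 45, 200, 75, 60, 90, 15]
for w, idxs in enumerate(assign_balanced(sizes, 4)):
    total = sum(sizes[i] for i in idxs)
    print(f"worker {w}: samples {idxs}, total size {total}")
```

Running the example prints per-worker totals that are close to one another, whereas a naive round-robin split of the same samples can leave one worker with far more work than the rest, which is exactly the imbalance that produces stragglers.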

Journal of KIISE
- ISSN: 2383-630X (Print)
- ISSN: 2383-6296 (Electronic)
- KCI Accredited Journal