A Study on Development of Technology to Improve Imbalanced Data Problems in Numerical Dataset Using Tomek Links Method combined with Balancing GAN
Hyunsik Na, Sohee Park, Daeseon Choi
http://doi.org/10.5626/JOK.2020.47.10.974
Machine learning performs well and is applied in many fields, such as data classification, voice recognition, and predictive modeling. However, class imbalance in the training dataset degrades classification performance on the minority class. In this paper, we propose a new data augmentation method that combines the Balancing GAN with the Tomek Links method to address the imbalanced data problem and find a clear decision boundary. To verify the proposed method, we evaluated its performance across classification models on five datasets and compared it with data sampling and GAN-based data augmentation techniques. The results showed that classification performance was maintained or improved, by 0.05 to 0.195, in 17 of the 25 performance evaluations. The proposed method shows potential as a new approach to the imbalanced data problem.
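As an illustration of the Tomek Links step named in the abstract, the following is a minimal sketch and not the authors' implementation. A Tomek link is a pair of samples from different classes that are each other's nearest neighbor; removing such pairs (or one member of each) tends to sharpen the decision boundary. The abstract does not say how the step is wired to the Balancing GAN, so the sketch assumes the input already contains any GAN-generated minority samples, uses Euclidean distance, and drops only the majority-class member of each link. The function names `find_tomek_links` and `remove_tomek_links` and the toy data are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def find_tomek_links(X, y):
    """Return index pairs (i, j), i < j, that form Tomek links.

    A pair is a Tomek link when the two samples are mutual nearest
    neighbors and carry different class labels. Needs at least 2 samples.
    """
    n = len(X)
    # nearest[i] is the index of the sample closest to sample i.
    nearest = [
        min((j for j in range(n) if j != i), key=lambda j: dist(X[i], X[j]))
        for i in range(n)
    ]
    return [
        (i, j)
        for i, j in enumerate(nearest)
        if i < j and nearest[j] == i and y[i] != y[j]
    ]


def remove_tomek_links(X, y, majority_label):
    """Drop the majority-class member of every Tomek link.

    Assumption: only majority samples are removed; other variants
    remove both members of each link.
    """
    drop = {
        k
        for pair in find_tomek_links(X, y)
        for k in pair
        if y[k] == majority_label
    }
    keep = [k for k in range(len(X)) if k not in drop]
    return [X[k] for k in keep], [y[k] for k in keep]


# Toy one-dimensional data (hypothetical): class 0 is the majority.
X_demo = [[0.0], [0.1], [1.0], [1.2], [3.0]]
y_demo = [0, 0, 0, 1, 1]

links = find_tomek_links(X_demo, y_demo)
X_clean, y_clean = remove_tomek_links(X_demo, y_demo, majority_label=0)
```

In this toy data, samples 2 and 3 are mutual nearest neighbors with different labels, so they form the only Tomek link and the majority-class sample at index 2 is removed. The brute-force neighbor search is quadratic in the number of samples; a real pipeline would likely use a spatial index or an existing resampling library.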

Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal