
Epoch Score: Dataset Verification using Quantitative Data Quality Assessment

Sungryeol Kim, Taewook Hwang, Sangkeun Jung, Yoonhyung Roh

http://doi.org/10.5626/JOK.2023.50.3.250

It is difficult to determine whether a dataset is suitable for a given model or field, or whether it contains errors. In this paper, we propose the Epoch Score, which expresses the difficulty of each data item as a score computed from the incorrect answers obtained by training several times under the same conditions but with different seeds. Using this score, we verified KLUE's Topic Classification dataset and obtained a performance improvement of about 0.8% by correcting high-scoring data that we judged to contain errors. The Epoch Score can be used for all supervised learning regardless of the data type, such as natural language or images, and the performance of the model can be inferred from the area of the Epoch Score.
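
The abstract does not give the exact formula, so the following is only a minimal sketch of one plausible reading: each sample's score is taken to be the fraction of (seed, epoch) evaluations in which it was answered incorrectly. The function names `epoch_scores` and `flag_suspects`, the `wrong[seed][epoch][sample]` layout, and the 0.8 threshold are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def epoch_scores(wrong: Sequence[Sequence[Sequence[bool]]]) -> List[float]:
    """Assumed score: share of (seed, epoch) evaluations that got each sample wrong.

    wrong[s][e][i] is True when the run with seed index s misclassified
    sample i at epoch e. This layout is an assumption for illustration.
    """
    n_samples = len(wrong[0][0])
    total_evals = sum(len(run) for run in wrong)  # number of (seed, epoch) pairs
    counts = [0] * n_samples
    for run in wrong:
        for epoch in run:
            for i, is_wrong in enumerate(epoch):
                counts[i] += int(is_wrong)
    return [c / total_evals for c in counts]


def flag_suspects(scores: Sequence[float], threshold: float = 0.8) -> List[int]:
    """Return indices of samples whose score reaches the (hypothetical) threshold."""
    return [i for i, s in enumerate(scores) if s >= threshold]


# Toy input: 2 seeds, 2 epochs each, 3 samples.
wrong = [
    [[True, False, True], [True, False, False]],   # seed 0
    [[True, True, False], [True, False, False]],   # seed 1
]
scores = epoch_scores(wrong)
suspects = flag_suspects(scores)
print(scores, suspects)
```

In this toy input, sample 0 is wrong in every evaluation and would be the one passed on for manual label review. Whether such a sample is mislabeled or merely hard is a judgment the score alone does not make.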


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr