
KorSciQA 2.0: Question Answering Dataset for Machine Reading Comprehension of Korean Papers in Science & Technology Domain

Hyesoo Kong, Hwamook Yoon, Mihwan Hyun, Hyejin Lee, Jaewook Seol

http://doi.org/10.5626/JOK.2022.49.9.686

Recently, the performance of Machine Reading Comprehension (MRC) systems has improved through various open-ended Question Answering (QA) tasks, and challenging QA tasks that require comprehensively understanding multiple text paragraphs and making discrete inferences are being released to train more intelligent MRC systems. However, because no QA dataset has existed for complex reasoning over academic information in Korean, MRC research on academic papers has been limited. In this paper, we construct a QA dataset, KorSciQA 2.0, over the full text (including abstracts) of Korean academic papers, and divide the questions into three difficulty levels (general, easy, and hard) to discriminate among MRC systems. We propose a methodology, process, and system for constructing KorSciQA 2.0. In MRC performance evaluation experiments, fine-tuning KorSciBERT, a Korean BERT model for the science and technology domain, achieved the highest performance, with an F1 score of 80.76%.
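
The abstract reports an F1 score but does not say how it is computed. As a rough illustration only, the sketch below assumes the token-overlap F1 commonly used for extractive MRC, with plain whitespace tokenization. Both choices are assumptions, not the authors' evaluation code (Korean benchmarks often score at the character or morpheme level instead), and the function name token_f1 is hypothetical.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer span.

    Assumption: tokens are whitespace-separated; the paper's own
    tokenization and answer normalization are not given in the abstract.
    """
    pred_tokens = prediction.split()
    ref_tokens = reference.split()

    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)

    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

A dataset-level score under this assumption would be the mean of token_f1 over all questions, taking the best-matching reference when a question has several.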


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal
