KorSciQA 2.0: Question Answering Dataset for Machine Reading Comprehension of Korean Papers in Science & Technology Domain 


Vol. 49, No. 9, pp. 686-695, Sep. 2022
10.5626/JOK.2022.49.9.686


  Abstract

Recently, the performance of Machine Reading Comprehension (MRC) systems has improved through various open-ended Question Answering (QA) tasks, and challenging QA tasks that require comprehensively understanding multiple text paragraphs and making discrete inferences are being released to train more intelligent MRC systems. However, because no QA dataset exists for the complex reasoning needed to understand academic information in Korean, MRC research on academic papers has been limited. In this paper, we construct KorSciQA 2.0, a QA dataset covering the full text, including abstracts, of Korean academic papers, and divide it into general, easy, and hard difficulty levels to help discriminate among MRC systems. We also propose a methodology, process, and system for constructing KorSciQA 2.0. In MRC performance evaluation experiments, fine-tuning KorSciBERT, a Korean BERT model for the science and technology domain, achieved the highest performance, with an F1 score of 80.76%.
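
The abstract reports an F1 score but does not say how it is computed. As a rough illustration only, the sketch below shows the token-overlap F1 commonly used for extractive QA evaluation. The function name `token_f1` and the whitespace tokenization are assumptions made for this example, not the authors' evaluation code; Korean MRC evaluations often score at the character or morpheme level instead.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer.

    Assumption: answers are split on whitespace. This is a hypothetical
    helper, not the scoring script used for KorSciQA 2.0.
    """
    pred_tokens = prediction.split()
    ref_tokens = reference.split()

    # If either answer is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)

    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

A dataset-level score under this scheme would be the mean of `token_f1` over all question-answer pairs, expressed as a percentage.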


  Cite this article

[IEEE Style]

H. Kong, H. Yoon, M. Hyun, H. Lee, J. Seol, "KorSciQA 2.0: Question Answering Dataset for Machine Reading Comprehension of Korean Papers in Science & Technology Domain," Journal of KIISE, JOK, vol. 49, no. 9, pp. 686-695, 2022. DOI: 10.5626/JOK.2022.49.9.686.


[ACM Style]

Hyesoo Kong, Hwamook Yoon, Mihwan Hyun, Hyejin Lee, and Jaewook Seol. 2022. KorSciQA 2.0: Question Answering Dataset for Machine Reading Comprehension of Korean Papers in Science & Technology Domain. Journal of KIISE, JOK, 49, 9, (2022), 686-695. DOI: 10.5626/JOK.2022.49.9.686.


[KCI Style]

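Hyesoo Kong, Hwamook Yoon, Mihwan Hyun, Hyejin Lee, Jaewook Seol, "KorSciQA 2.0: Question Answering Dataset for Machine Reading Comprehension of Korean Papers in Science & Technology Domain," Journal of KIISE, Vol. 49, No. 9, pp. 686-695, 2022. DOI: 10.5626/JOK.2022.49.9.686.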


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr