Evaluating of Korean Machine Reading Comprehension Generalization Performance via Cross-, Blind and Open-Domain QA Dataset Assessment 


Vol. 48, No. 3, pp. 275-283, Mar. 2021
10.5626/JOK.2021.48.3.275



  Abstract

Machine reading comprehension (MRC) is the task of identifying the correct answer within a paragraph, given a natural language question and that paragraph. Recently, fine-tuning a pre-trained language model has yielded the best performance. In this study, we evaluated how well MRC methods generalize to question-paragraph pairs that differ from the training set, rather than to pairs that resemble it. To this end, we performed cross-evaluation between datasets and blind evaluation. The results showed a correlation between generalization performance and dataset characteristics such as answer length and the lexical overlap ratio between question and paragraph. In the blind evaluation, evaluation datasets with long answers and low lexical overlap between questions and paragraphs yielded performance below 80%. Finally, we evaluated the generalization performance of the MRC model in an open-domain QA setting and found that performance degraded when the model read retrieved paragraphs. Given the characteristics of the MRC task, difficulty and generalization performance vary with the relationship between the question and the answer, which suggests the need for analysis on diverse evaluation sets.
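The two dataset characteristics named in the abstract, answer length and question-paragraph lexical overlap, can be illustrated with a minimal sketch. The abstract does not give the paper's exact definitions, so everything here is an assumption: whitespace tokenization (Korean text would normally call for a morphological analyzer), overlap measured as the fraction of distinct question tokens found in the paragraph, and the hypothetical function names `tokenize`, `lexical_overlap_ratio`, and `answer_length`.

```python
def tokenize(text):
    """Lowercase and split on whitespace.

    Assumption for illustration only; the paper's tokenizer is not
    stated in the abstract.
    """
    return text.lower().split()


def lexical_overlap_ratio(question, paragraph):
    """Fraction of distinct question tokens that also occur in the paragraph."""
    question_tokens = set(tokenize(question))
    if not question_tokens:
        return 0.0
    paragraph_tokens = set(tokenize(paragraph))
    return len(question_tokens & paragraph_tokens) / len(question_tokens)


def answer_length(answer):
    """Answer length measured in whitespace tokens."""
    return len(tokenize(answer))


# Toy example (made up, not taken from any of the evaluated datasets).
question = "where was the treaty signed"
paragraph = "the treaty was signed in paris in 1898"
answer = "in paris"

print(lexical_overlap_ratio(question, paragraph))  # 4 of 5 question tokens overlap
print(answer_length(answer))
```

Under these assumptions, a low overlap ratio combined with a long answer would correspond to the kind of evaluation set the abstract reports as harder to generalize to.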




  Cite this article

[IEEE Style]

J. Lim and H. Kim, "Evaluating of Korean Machine Reading Comprehension Generalization Performance via Cross-, Blind and Open-Domain QA Dataset Assessment," Journal of KIISE, JOK, vol. 48, no. 3, pp. 275-283, 2021. DOI: 10.5626/JOK.2021.48.3.275.


[ACM Style]

Joon-Ho Lim and Hyun-ki Kim. 2021. Evaluating of Korean Machine Reading Comprehension Generalization Performance via Cross-, Blind and Open-Domain QA Dataset Assessment. Journal of KIISE, JOK, 48, 3, (2021), 275-283. DOI: 10.5626/JOK.2021.48.3.275.


[KCI Style]

Joon-Ho Lim, Hyun-ki Kim, "Evaluating of Korean Machine Reading Comprehension Generalization Performance via Cross-, Blind and Open-Domain QA Dataset Assessment," Journal of KIISE, Vol. 48, No. 3, pp. 275-283, 2021. DOI: 10.5626/JOK.2021.48.3.275.









Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr