Named Entity Tagged Corpus Augmentation Using Automatic Editing 


Vol. 50, No. 1, pp. 11-18, Jan. 2023
10.5626/JOK.2023.50.1.11



  Abstract

A corpus is an essential resource for machine learning and deep learning in natural language processing. For Korean, well-refined named entity corpora are scarce compared with those available in countries with more advanced research, such as the United States, Japan, and China. Most projects that build a named entity corpus proceed manually or semi-automatically and thus require considerable cost and effort. In this paper, we propose a novel method for automatically augmenting a small named entity corpus. The proposed method augments the corpus through automatic editing operations such as substitution, insertion, and deletion. We use probabilistic sampling rather than simple editing so that the augmented corpus is natural and diverse. Our experiments show that the augmented corpus improves the performance of Korean named entity recognition, which suggests that the proposed method is useful in practice.
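The abstract names three editing operations (substitution, insertion, deletion) and probabilistic sampling, but gives no further detail. The following is a minimal Python sketch of that general idea on a BIO-tagged sentence, not the authors' implementation. Everything specific in it is an assumption made for illustration: the token-level BIO representation, the toy lexicons `ENTITY_COUNTS` and `FILLER_COUNTS`, the frequency-proportional sampling, the operation weights, and all function names.

```python
import random
from collections import Counter

# Hypothetical toy lexicons (assumption): a real system would count
# entity mentions and context words from the seed corpus.
ENTITY_COUNTS = {
    "PER": Counter({("Jae-Hoon", "Kim"): 3, ("Minsu",): 1}),
    "LOC": Counter({("Busan",): 2, ("Seoul",): 5}),
}
FILLER_COUNTS = Counter({"yesterday": 2, "again": 1})


def weighted_choice(counter, rng):
    """Sample a key with probability proportional to its count."""
    keys = list(counter.keys())
    weights = [counter[k] for k in keys]
    return rng.choices(keys, weights=weights, k=1)[0]


def spans(tags):
    """Return (start, end, type) for each BIO entity span; end is exclusive."""
    out, start = [], None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        if start is not None and not tag.startswith("I-"):
            out.append((start, i, tags[start][2:]))
            start = None
        if tag.startswith("B-"):
            start = i
    return out


def substitute(tokens, tags, rng):
    """Replace one entity mention with a sampled mention of the same type."""
    found = [s for s in spans(tags) if s[2] in ENTITY_COUNTS]
    if not found:
        return tokens, tags
    start, end, etype = rng.choice(found)
    new = list(weighted_choice(ENTITY_COUNTS[etype], rng))
    new_tags = ["B-" + etype] + ["I-" + etype] * (len(new) - 1)
    return tokens[:start] + new + tokens[end:], tags[:start] + new_tags + tags[end:]


def insert(tokens, tags, rng):
    """Insert a sampled non-entity token without splitting an entity span."""
    positions = [i for i in range(len(tokens) + 1)
                 if i == len(tokens) or not tags[i].startswith("I-")]
    pos = rng.choice(positions)
    word = weighted_choice(FILLER_COUNTS, rng)
    return tokens[:pos] + [word] + tokens[pos:], tags[:pos] + ["O"] + tags[pos:]


def delete(tokens, tags, rng):
    """Delete one non-entity token, keeping at least one token."""
    candidates = [i for i, t in enumerate(tags) if t == "O"]
    if not candidates or len(tokens) <= 1:
        return tokens, tags
    pos = rng.choice(candidates)
    return tokens[:pos] + tokens[pos + 1:], tags[:pos] + tags[pos + 1:]


def augment(tokens, tags, n, seed=0, op_weights=(0.5, 0.25, 0.25)):
    """Produce n edited copies, sampling one operation per copy."""
    rng = random.Random(seed)
    ops = [substitute, insert, delete]
    out = []
    for _ in range(n):
        op = rng.choices(ops, weights=op_weights, k=1)[0]
        out.append(op(list(tokens), list(tags), rng))
    return out


if __name__ == "__main__":
    sentence = ["Minsu", "visited", "Seoul", "today"]
    labels = ["B-PER", "O", "B-LOC", "O"]
    for new_tokens, new_tags in augment(sentence, labels, n=5, seed=1):
        print(list(zip(new_tokens, new_tags)))
```

In this sketch, insertion and deletion touch only tokens tagged `O`, and substitution swaps a whole mention for another of the same type, so every augmented sentence keeps the same number and order of entity types as the original. Whether the paper's method imposes the same constraint is not stated in the abstract.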




  Cite this article

[IEEE Style]

J. Kim and J. Kim, "Named Entity Tagged Corpus Augmentation Using Automatic Editing," Journal of KIISE, JOK, vol. 50, no. 1, pp. 11-18, 2023. DOI: 10.5626/JOK.2023.50.1.11.


[ACM Style]

Jae-kyun Kim and Jae-Hoon Kim. 2023. Named Entity Tagged Corpus Augmentation Using Automatic Editing. Journal of KIISE, JOK, 50, 1, (2023), 11-18. DOI: 10.5626/JOK.2023.50.1.11.


[KCI Style]

김재균, 김재훈, "자동 편집을 이용한 개체명 말뭉치 확장," 한국정보과학회 논문지, 제50권, 제1호, 11~18쪽, 2023. DOI: 10.5626/JOK.2023.50.1.11.









Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr