Document-level Machine Translation Data Augmentation Using a Cluster Algorithm and NSP 


Vol. 50,  No. 5, pp. 401-406, May  2023
10.5626/JOK.2023.50.5.401



  Abstract

Document-level machine translation, which uses the context of an entire document to produce more natural translations, has been actively studied in recent years. Like sentence-level models, document-level models require a large amount of training data, but large document-level parallel corpora are difficult to build. To address this shortage, this paper proposes a data augmentation technique suited to document-level machine translation. In our experiments, applying the technique, which uses a clustering algorithm and next sentence prediction (NSP), to a sentence-level parallel corpus without context improved document-level translation performance by 3.0 S-BLEU and 2.7 D-BLEU over the same setup without augmentation.
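The abstract does not spell out the pipeline, so the following is only a minimal sketch of one way a clustering step and an NSP step could turn context-free sentence pairs into pseudo-documents. Everything in it is an assumption made for illustration: the bag-of-words vectors, the greedy threshold clustering, the word-overlap stand-in for an NSP model, and all function names (`cluster`, `overlap_nsp`, `order_by_nsp`, `build_pseudo_documents`) are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def bow(sentence: str) -> Counter:
    """Bag-of-words vector; a stand-in for a real sentence embedding."""
    return Counter(sentence.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(pairs: List[Pair], threshold: float = 0.2) -> List[List[Pair]]:
    """Greedy single-pass clustering on source-side similarity.

    Assumption: the paper's actual cluster algorithm is not specified here.
    """
    clusters: List[List[Pair]] = []
    centroids: List[Counter] = []
    for pair in pairs:
        vec = bow(pair[0])
        scores = [cosine(vec, c) for c in centroids]
        if scores and max(scores) >= threshold:
            i = scores.index(max(scores))
            clusters[i].append(pair)
            centroids[i] = centroids[i] + vec  # accumulate word counts
        else:
            clusters.append([pair])
            centroids.append(vec)
    return clusters


def overlap_nsp(prev: str, nxt: str) -> float:
    """Placeholder NSP scorer based on word overlap.

    Assumption: a real system would query a trained NSP model instead.
    """
    return cosine(bow(prev), bow(nxt))


def order_by_nsp(group: List[Pair],
                 nsp: Callable[[str, str], float] = overlap_nsp) -> List[Pair]:
    """Start from the first pair, then repeatedly append the best-scoring successor."""
    remaining = list(group)
    doc = [remaining.pop(0)]
    while remaining:
        best = max(remaining, key=lambda p: nsp(doc[-1][0], p[0]))
        remaining.remove(best)
        doc.append(best)
    return doc


def build_pseudo_documents(pairs: List[Pair], threshold: float = 0.2,
                           nsp: Callable[[str, str], float] = overlap_nsp
                           ) -> List[List[Pair]]:
    """Cluster sentence pairs, then order each cluster into a pseudo-document."""
    return [order_by_nsp(group, nsp) for group in cluster(pairs, threshold)]
```

Each returned list is an ordered sequence of sentence pairs that could be concatenated into one document-level training example. Swapping `overlap_nsp` for a call to a trained NSP model, and `bow` for a sentence encoder, would leave the rest of the sketch unchanged.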




  Cite this article

[IEEE Style]

D. Kim and C. Lee, "Document-level Machine Translation Data Augmentation Using a Cluster Algorithm and NSP," Journal of KIISE, JOK, vol. 50, no. 5, pp. 401-406, 2023. DOI: 10.5626/JOK.2023.50.5.401.


[ACM Style]

Dokyoung Kim and Changki Lee. 2023. Document-level Machine Translation Data Augmentation Using a Cluster Algorithm and NSP. Journal of KIISE, JOK, 50, 5, (2023), 401-406. DOI: 10.5626/JOK.2023.50.5.401.


[KCI Style]

Dokyoung Kim, Changki Lee, "Document-level Machine Translation Data Augmentation Using a Cluster Algorithm and NSP," Journal of KIISE, Vol. 50, No. 5, pp. 401-406, 2023. DOI: 10.5626/JOK.2023.50.5.401.


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal
