Number Normalization in Korean Using the Transformer Model

Jaeyoon Chun, Chansong Jo, Jeongpil Lee, Myoung-Wan Koo

http://doi.org/10.5626/JOK.2021.48.5.510

Text normalization is a key component of text-to-speech (TTS) systems. Because numbers in Korean are read differently depending on their context, Korean number normalization is crucial to improving the quality of TTS systems. However, the existing model relies on ad hoc rules that are ill-suited to normalizing non-standard numbers. This study proposes a Korean number normalization model based on the sequence-to-sequence Transformer, with number positional encoding added to handle long numbers. The proposed model achieved an F1 score of 98.80% on the normal test dataset and 90.1% on the non-standard test dataset, which were 2.52% and 19% higher, respectively, than the baseline model. In addition, the proposed model showed a 13% improvement on the longer-number test dataset compared with the other deep learning models.
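
To make the context dependence concrete, the following is a minimal rule-based sketch, not the Transformer model the paper proposes. It assumes the common convention that hours are read with native Korean numerals and minutes with Sino-Korean numerals; the function names (`sino_korean`, `read_clock_time`) and the 1 to 9999 range limit are hypothetical choices made for illustration only.

```python
# Illustrative sketch only: a tiny rule-based reader, NOT the paper's model.
# All names here are hypothetical and chosen for this example.

SINO_DIGITS = ["", "일", "이", "삼", "사", "오", "육", "칠", "팔", "구"]
SINO_UNITS = ["", "십", "백", "천"]  # ones, tens, hundreds, thousands

# Native Korean prenominal forms used before the hour counter "시".
NATIVE_HOURS = ["", "한", "두", "세", "네", "다섯", "여섯",
                "일곱", "여덟", "아홉", "열", "열한", "열두"]


def sino_korean(n: int) -> str:
    """Read an integer in 1..9999 with Sino-Korean numerals."""
    if not 1 <= n <= 9999:
        raise ValueError("this sketch only handles 1..9999")
    text = str(n)
    parts = []
    for i, ch in enumerate(text):
        digit = int(ch)
        if digit == 0:
            continue  # zero digits are silent
        place = len(text) - 1 - i
        # A leading "one" is dropped before 십/백/천 (10 is 십, not 일십).
        name = "" if (digit == 1 and place > 0) else SINO_DIGITS[digit]
        parts.append(name + SINO_UNITS[place])
    return "".join(parts)


def read_clock_time(hour: int, minute: int) -> str:
    """Read a clock time: native numerals for hours, Sino-Korean for minutes."""
    if not 1 <= hour <= 12 or not 1 <= minute <= 59:
        raise ValueError("this sketch expects hour in 1..12, minute in 1..59")
    return f"{NATIVE_HOURS[hour]} 시 {sino_korean(minute)} 분"
```

Under these assumptions, the digit 3 is read as "세" in `read_clock_time(3, 30)` but as "삼" in `sino_korean(3)`. Hand-written rules of this kind cover only the patterns they anticipate, which is the limitation the abstract attributes to the rule-based baseline on non-standard numbers.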


Journal of KIISE

  • ISSN: 2383-630X (Print)
  • ISSN: 2383-6296 (Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr