Syllable-based Probabilistic Models for Korean Morphological Analysis 


Vol. 41, No. 9, pp. 642-651, Sep. 2014



  Abstract

This paper proposes three probabilistic models for syllable-based Korean morphological analysis and reports their performance. The model probabilities are estimated from a POS-tagged corpus. In 10-fold cross-validation experiments, the models achieve an answer inclusion rate of 98.3% when trained on the Sejong POS-tagged corpus of 10 million eojeols. In these models, a POS tag is assigned to each syllable before spelling recovery and morpheme generation, which makes analysis more efficient than in previous probabilistic models, where spelling recovery is performed as the first stage. This efficiency translates into faster analysis: experiments show a throughput of 147K eojeols per second, almost 174 times faster than the previous probabilistic models for Korean morphology.
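The idea of tagging each syllable first and generating morphemes afterwards can be illustrated with a minimal Python sketch. It is not one of the paper's three models: the toy corpus, the B-/I- boundary prefixes, the `B-UNK` fallback tag, and the function names `train`, `tag_syllables`, and `to_morphemes` are all assumptions made for illustration. Probabilities are plain relative frequencies P(tag | syllable), and spelling recovery is omitted.

```python
from collections import Counter, defaultdict

# Toy syllable-tagged corpus (hypothetical data, not taken from the Sejong corpus).
# Each eojeol is a list of (syllable, tag) pairs. The B-/I- prefixes are an
# assumed convention marking the start / continuation of a morpheme.
CORPUS = [
    [("학", "B-NNG"), ("교", "I-NNG"), ("에", "B-JKB")],
    [("학", "B-NNG"), ("교", "I-NNG"), ("가", "B-JKS")],
    [("친", "B-NNG"), ("구", "I-NNG"), ("가", "B-JKS")],
]


def train(corpus):
    """Estimate P(tag | syllable) by relative frequency."""
    counts = defaultdict(Counter)
    for eojeol in corpus:
        for syllable, tag in eojeol:
            counts[syllable][tag] += 1
    return {
        syllable: {tag: n / sum(c.values()) for tag, n in c.items()}
        for syllable, c in counts.items()
    }


def tag_syllables(eojeol, model):
    """Assign the most probable tag to each syllable independently."""
    tagged = []
    for syllable in eojeol:  # assumes precomposed (NFC) Hangul syllables
        dist = model.get(syllable)
        if dist is None:
            tagged.append((syllable, "B-UNK"))  # assumed fallback for unseen syllables
        else:
            tagged.append((syllable, max(dist, key=dist.get)))
    return tagged


def to_morphemes(tagged):
    """Merge syllables into morphemes using the B-/I- prefixes."""
    morphemes = []
    for syllable, tag in tagged:
        prefix, pos = tag.split("-", 1)
        if prefix == "I" and morphemes and morphemes[-1][1] == pos:
            surface, _ = morphemes[-1]
            morphemes[-1] = (surface + syllable, pos)
        else:
            morphemes.append((syllable, pos))
    return morphemes


model = train(CORPUS)
result = to_morphemes(tag_syllables("친구에", model))
print(result)
```

Because tags are chosen per syllable, morpheme generation reduces to a single left-to-right pass over the tagged syllables. A realistic model would also condition on neighboring syllables or tags and would restore spelling changes after tagging.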




  Cite this article

[IEEE Style]

K. Shim, "Syllable-based Probabilistic Models for Korean Morphological Analysis," Journal of KIISE, JOK, vol. 41, no. 9, pp. 642-651, 2014.


[ACM Style]

Kwangseob Shim. 2014. Syllable-based Probabilistic Models for Korean Morphological Analysis. Journal of KIISE, JOK, 41, 9, (2014), 642-651.


[KCI Style]

Kwangseob Shim, "Syllable-based Probabilistic Models for Korean Morphological Analysis," Journal of KIISE, Vol. 41, No. 9, pp. 642-651, 2014.






Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal
