TY  - JOUR
T1  - Syllable-based Probabilistic Models for Korean Morphological Analysis
AU  - Shim, Kwangseob
JO  - Journal of KIISE, JOK
PY  - 2014
DA  - 2014/9/14
DO  - 
KW  - Korean morphology
KW  - morphological analysis
KW  - probabilistic model
KW  - POS-tagged corpus
KW  - machine learning
AB  - This paper proposes three probabilistic models for syllable-based Korean morphological analysis and reports their performance. The model probabilities are estimated from a POS-tagged corpus. In 10-fold cross-validation experiments, the models achieve an answer inclusion rate of 98.3% when trained on the Sejong POS-tagged corpus of 10 million eojeols. In these models, POS tags are assigned to each syllable before spelling recovery and morpheme generation, which makes morphological analysis more efficient than in previous probabilistic models, where spelling recovery is performed in the first stage. This efficiency translates into faster analysis: experiments show a rate of 147K eojeols per second, almost 174 times faster than the previous probabilistic models for Korean morphology.