TY  - JOUR
T1  - Probabilistic Segmentation and Tagging of Unknown Words
AU  - Kim, Bogyum
AU  - Lee, Jae Sung
JO  - Journal of KIISE, JOK
PY  - 2016
DA  - 2016/1/14
DO  - 
KW  - unknown word processing
KW  - word segmentation
KW  - open word class processing
KW  - probabilistic morphological analysis
AB  - Processing unknown words, such as proper nouns and newly coined words, is important for a morphological analyzer that must handle documents from various domains. This study proposes a segmentation and tagging method for unknown Korean words for use in 3-step probabilistic morphological analysis. To guess unknown words, the method uses the rich set of suffixes that attach to open-class words, such as general nouns and proper nouns. We propose a method that learns the suffix patterns from a morpheme-tagged corpus and calculates their probabilities for segmenting and tagging unknown open-class words in the probabilistic morphological analysis model. Experimental results showed that unknown word processing performance improved greatly on documents containing many unregistered words.