Digital Library [Search Results]
Korean Text Summarization using MASS with Relative Position Representation
Youngjun Jung, Hyunsun Hwang, Changki Lee
http://doi.org/10.5626/JOK.2020.47.9.873
In language generation, deep learning models that generate natural language with a Sequence-to-Sequence architecture are being actively studied. In text summarization, research is moving beyond extractive methods, which select only the core sentences from a text, toward abstractive summarization. Recently, transfer learning that fine-tunes models pre-trained on large amounts of monolingual data, such as BERT and MASS, has become the dominant approach in natural language processing. In this paper, MASS was pre-trained for Korean language generation and then applied to Korean text summarization. Experimental results show that the Korean text summarization model using MASS outperformed existing models. In addition, applying the relative position representation method to MASS further improved summarization performance.
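Below is a minimal sketch (PyTorch) of self-attention with relative position representations, the kind of mechanism the abstract describes adding to MASS; the layer structure, names, and hyperparameters are illustrative assumptions, not the paper's implementation.

```python
# Sketch: self-attention whose logits include a learned embedding of the
# clipped relative offset between positions. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelPosSelfAttention(nn.Module):
    def __init__(self, d_model=512, max_rel_dist=16):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.max_rel_dist = max_rel_dist
        # One embedding per clipped relative offset in [-max, +max].
        self.rel_k = nn.Embedding(2 * max_rel_dist + 1, d_model)
        self.scale = d_model ** -0.5

    def forward(self, x):                      # x: (batch, seq, d_model)
        seq = x.size(1)
        q, k, v = self.q(x), self.k(x), self.v(x)
        # Relative offsets j - i, clipped to the modeled window.
        pos = torch.arange(seq, device=x.device)
        rel = (pos[None, :] - pos[:, None]).clamp(-self.max_rel_dist,
                                                  self.max_rel_dist)
        a_k = self.rel_k(rel + self.max_rel_dist)   # (seq, seq, d_model)
        # Content term plus relative-position term in the attention logits.
        logits = torch.matmul(q, k.transpose(-2, -1))
        logits = logits + torch.einsum('bid,ijd->bij', q, a_k)
        attn = F.softmax(logits * self.scale, dim=-1)
        return torch.matmul(attn, v)
```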
English-to-Korean Machine Translation using Image Information
Jangseong Bae, Hyunsun Hwang, Changki Lee
http://doi.org/10.5626/JOK.2019.46.7.690
Machine translation automatically converts text in one language into another language. Conventional machine translation uses only text, so additional information related to the input cannot be exploited. In recent years, multimodal machine translation models have emerged that take images related to the input text as additional input. In this paper, following this line of research, image information is added at decoding time and used for English-to-Korean translation. In addition, we propose a model with a decoding gate that balances textual and image information at decoding time. Experimental results show that the proposed method outperforms the non-gated model.
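The abstract's decoding gate can be pictured as a learned interpolation between the textual attention context and an image feature at each decoder step. The following is a hedged sketch in PyTorch; the shapes, the global image feature, and all names are assumptions rather than the paper's actual design.

```python
# Sketch: a learned gate that balances the textual context vector against
# an image feature at each decoder step. Names and shapes are assumptions.
import torch
import torch.nn as nn

class DecodingGate(nn.Module):
    def __init__(self, d_ctx=512, d_img=512):
        super().__init__()
        self.img_proj = nn.Linear(d_img, d_ctx)  # map image feature to context size
        self.gate = nn.Linear(2 * d_ctx, d_ctx)

    def forward(self, text_ctx, img_feat):
        # text_ctx: (batch, d_ctx) attention context at this decoder step
        # img_feat: (batch, d_img) global image feature (e.g., pooled CNN output)
        img_ctx = torch.tanh(self.img_proj(img_feat))
        g = torch.sigmoid(self.gate(torch.cat([text_ctx, img_ctx], dim=-1)))
        # The gate interpolates between the two modalities element-wise.
        return g * text_ctx + (1.0 - g) * img_ctx
```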
Word Embedding using Relative Position Information between Words
Hyunsun Hwang, Changki Lee, HyunKi Jang, Dongho Kang
http://doi.org/10.5626/JOK.2018.45.9.943
In word embedding, which is how words are fed to deep learning models for natural language processing, each word is represented as a point in a vector space. This reduces dimensionality and places similar words at similar vector values. Word embeddings must be trained on a large-scale corpus to achieve good performance. However, the widely used word2vec model simplifies its architecture to make large-corpus training feasible, essentially learning word co-occurrence rates, and therefore discards the relative position information between words. In this paper, we modified the existing word embedding model so that it learns from relative position information between words. Experimental results show that training with relative position information improves the word-analogy performance of the proposed model.
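One common way to give a word2vec-style model access to relative position information is to keep a separate output embedding table per relative offset (as in structured skip-gram). The sketch below illustrates that idea in PyTorch; the paper's exact modification may differ, and all names are illustrative.

```python
# Sketch: skip-gram variant with one output matrix per relative offset,
# so the model sees *where* a context word occurred. Illustrative only.
import torch
import torch.nn as nn

class PositionalSkipGram(nn.Module):
    def __init__(self, vocab_size, dim=100, window=5):
        super().__init__()
        self.inp = nn.Embedding(vocab_size, dim)
        # One output table per offset in {-window..-1, +1..+window}.
        self.out = nn.ModuleList(
            nn.Embedding(vocab_size, dim) for _ in range(2 * window)
        )
        self.window = window

    def offset_index(self, offset):
        # Map -window..-1 -> 0..window-1 and +1..+window -> window..2*window-1.
        return offset + self.window if offset < 0 else offset + self.window - 1

    def forward(self, center, context, offset):
        # center, context: (batch,) word ids; offset: nonzero Python int
        h = self.inp(center)                              # (batch, dim)
        w = self.out[self.offset_index(offset)](context)  # (batch, dim)
        # Score for the (center, context-at-offset) pair; train with
        # negative sampling in practice.
        return (h * w).sum(-1)
```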
Compression of Korean Phrase Structure Parsing Model using Knowledge Distillation
http://doi.org/10.5626/JOK.2018.45.5.451
A sequence-to-sequence model is an end-to-end model that transforms an input sequence into an output sequence of a different length. However, because it relies on techniques such as the attention mechanism and input-feeding to achieve high performance, the resulting model is difficult to deploy in an actual service. In this paper, we apply sequence-level knowledge distillation, an effective model compression technique, to Korean phrase structure parsing. Experimental results show that when the hidden layer size is reduced from 500 to 50, the F1 score improves by 0.56% and the model runs 60.71 times faster than the baseline model.
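Sequence-level knowledge distillation trains the small student model on the teacher's own decoded outputs instead of the gold targets. The sketch below shows the shape of that training loop in PyTorch; `teacher.beam_search` and `student.nll_loss` are hypothetical placeholders for model-specific methods.

```python
# Sketch of sequence-level knowledge distillation: the student fits the
# teacher's beam-search outputs. Model methods here are placeholders.
import torch

def distill_epoch(teacher, student, optimizer, source_batches, beam=5):
    teacher.eval()
    student.train()
    for src in source_batches:
        with torch.no_grad():
            # The teacher re-labels the training data with its best output.
            pseudo_tgt = teacher.beam_search(src, beam_size=beam)
        optimizer.zero_grad()
        # Standard cross-entropy loss, but against the teacher's sequence
        # rather than the gold parse.
        loss = student.nll_loss(src, pseudo_tgt)
        loss.backward()
        optimizer.step()
```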
Error Correction in Korean Morpheme Recovery using Deep Learning
Korean morphological analysis is a difficult process. Because Korean is an agglutinative language, morpheme recovery is one of its most important steps. Previous approaches used heuristic rules or pre-analyzed partial words, but their performance is limited because they do not use contextual information. In this study, we built a Korean morpheme recovery system based on deep learning that uses word embeddings to exploit contextual information. On recovering the morphemes '들/VV' and '듣/VV', the system achieved 97.97% accuracy, outperforming an SVM (Support Vector Machine), which achieved 96.22%.
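Framing morpheme recovery as context classification, one plausible shape (not necessarily the paper's) is a small feed-forward network over the pretrained word embeddings of the surrounding words, with one output per candidate restoration:

```python
# Sketch: choose the restored morpheme (e.g., 들/VV vs. 듣/VV) from the
# embeddings of the surrounding context. Sizes are illustrative.
import torch
import torch.nn as nn

class MorphemeRecoveryNet(nn.Module):
    def __init__(self, emb_dim=100, context=2, hidden=200, n_candidates=2):
        super().__init__()
        # Concatenate embeddings of `context` words on each side
        # plus the ambiguous word itself.
        in_dim = emb_dim * (2 * context + 1)
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_candidates),  # one logit per restoration
        )

    def forward(self, context_embs):
        # context_embs: (batch, 2*context+1, emb_dim) pretrained word vectors
        return self.net(context_embs.flatten(1))
```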

Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal
Editorial Office
- Tel. +82-2-588-9240
- Fax. +82-2-521-1352
- E-mail. chwoo@kiise.or.kr