Digital Library [Search Result]
Korean End-to-End Coreference Resolution with BERT for Long Document
Kyeongbin Jo, Youngjun Jung, Changki Lee, Jihee Ryu, Joonho Lim
http://doi.org/10.5626/JOK.2023.50.1.32
Coreference resolution is a natural language processing task that detects mentions, identifies the mentions that refer to the same entity, and groups them together. Recent work on coreference resolution has mainly studied end-to-end models that use BERT to derive contextual word representations while performing mention detection and coreference resolution simultaneously. However, BERT's input length limit degrades its performance on long documents. This paper therefore proposes the following model. First, a long document is split into segments of at most 512 tokens, and each segment is passed through an existing (local) BERT to obtain initial contextual word representations. The segments are then recombined, and a global positional embedding computed over the original document is added. Finally, a Global BERT layer computes document-wide contextual representations, on which coreference resolution is performed. In experiments, the proposed model achieved performance similar to the existing model while reducing GPU memory usage by a factor of 1.4 and improving speed by a factor of 2.1.
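The split-encode-recombine pipeline described in this abstract can be sketched in a few lines of PyTorch. This is a minimal illustration under stated assumptions, not the authors' implementation: the module and parameter names, the HuggingFace-style `last_hidden_state` access, and the use of generic transformer encoder layers in place of the paper's Global BERT layer are all assumptions.

```python
import torch
import torch.nn as nn

class LocalGlobalEncoder(nn.Module):
    """Sketch of local-segment encoding followed by global re-encoding.

    All names and sizes here are illustrative assumptions, not the
    authors' implementation.
    """
    def __init__(self, local_bert, hidden=768, max_doc_len=4096, n_global_layers=2):
        super().__init__()
        self.local_bert = local_bert  # any BERT-style encoder (HuggingFace-like API assumed)
        # positional embeddings over positions in the *original* long document
        self.global_pos = nn.Embedding(max_doc_len, hidden)
        layer = nn.TransformerEncoderLayer(hidden, nhead=8, batch_first=True)
        self.global_layers = nn.TransformerEncoder(layer, n_global_layers)

    def forward(self, input_ids, segment_size=512):
        # 1) Split the long document into segments of at most 512 tokens.
        segments = input_ids.split(segment_size, dim=1)
        # 2) Encode each segment independently with the local BERT.
        local_reps = [self.local_bert(seg).last_hidden_state for seg in segments]
        # 3) Recombine segments and add document-level positional embeddings.
        hidden = torch.cat(local_reps, dim=1)
        positions = torch.arange(hidden.size(1), device=hidden.device)
        hidden = hidden + self.global_pos(positions)
        # 4) Compute document-wide context with the global layers
        #    (standing in for the paper's Global BERT layer).
        return self.global_layers(hidden)
```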
Style Transfer for Chat Language using Unsupervised Machine Translation
Youngjun Jung, Changki Lee, Jeongin Hwang, Hyungjong Noh
http://doi.org/10.5626/JOK.2023.50.1.19
Style transfer is the task of generating text in a target style while preserving the content of text written in a source style. In general, style transfer assumes that content is invariant and style is variable. Chat language, however, is not handled well by existing style transfer models. In this paper, we propose a method for transferring chat language into written language using a style transfer model based on unsupervised machine translation. This study shows that the transferred results can be used to construct a word transfer dictionary between styles, which in turn can be used for style transfer. It also shows that the results can be further improved by filtering the transferred pairs so that only well-transferred results are kept, and then training the style transfer model on the filtered pairs with supervised learning.
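The filtering step in the last sentence can be made concrete with a short sketch. The `scorer` callable and the 0.8 threshold below are hypothetical; the paper does not specify its filtering criterion.

```python
def filter_pairs(pairs, scorer, threshold=0.8):
    """Keep only (chat, written) pairs the scorer judges well transferred.

    `pairs` is a list of (source, transferred) sentence pairs produced by
    the unsupervised model; `scorer` is any callable returning a quality
    score in [0, 1]. Both the scorer and the threshold are illustrative
    assumptions.
    """
    return [(src, tgt) for src, tgt in pairs if scorer(src, tgt) >= threshold]

# The surviving pairs can then serve as pseudo-parallel data for training
# the style transfer model with ordinary supervised learning.
```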
Korean Text Summarization using MASS with Copying and Coverage Mechanism and Length Embedding
Youngjun Jung, Changki Lee, Wooyoung Go, Hanjun Yoon
http://doi.org/10.5626/JOK.2022.49.1.25
Text summarization is a technique for generating a summary that contains the important, essential information in a given document; end-to-end abstractive summarization based on sequence-to-sequence models is the main line of study. Recently, transfer learning, in which a model pre-trained on large-scale monolingual data is fine-tuned, has been actively studied in natural language processing. In this paper, we added a copying mechanism to the MASS model, pre-trained it for Korean language generation, and then applied it to Korean text summarization. In addition, a coverage mechanism and length embedding were applied to further improve the summarization model. Experiments showed that the Korean text summarization model combining MASS with the copying and coverage mechanisms outperformed existing models, and that the summary length could be controlled through the length embedding.
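The abstract does not spell out its copying mechanism; a common formulation is the pointer-generator mixture, sketched below as one plausible reading. All shapes and names are assumptions, not the authors' exact formulation.

```python
import torch

def copy_distribution(p_vocab, attn, src_ids, p_gen):
    """Pointer-generator style mixture of generating and copying.

    p_vocab : (batch, vocab_size) decoder's generation distribution
    attn    : (batch, src_len)    attention weights over source tokens
    src_ids : (batch, src_len)    vocabulary ids of the source tokens
    p_gen   : (batch, 1)          probability of generating vs. copying
    """
    # Scatter the attention mass onto the vocabulary ids of the source
    # tokens, turning attention into a copy distribution over the vocab.
    p_copy = torch.zeros_like(p_vocab)
    p_copy.scatter_add_(1, src_ids, attn)
    # Mix the generation and copy distributions.
    return p_gen * p_vocab + (1.0 - p_gen) * p_copy
```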
English-Korean Neural Machine Translation using MASS with Relative Position Representation
Youngjun Jung, Cheoneum Park, Changki Lee, Junseok Kim
http://doi.org/10.5626/JOK.2020.47.11.1038
Neural machine translation has mainly been studied as a sequence-to-sequence model trained with supervised learning. Because supervised learning performs poorly when data are insufficient, transfer learning, in which a model pre-trained on a large amount of monolingual data, such as BERT or MASS, is fine-tuned, has recently become a main line of research in natural language processing. In this paper, MASS, which uses a pre-training method for language generation, was applied to English-Korean machine translation. Experiments showed that the English-Korean machine translation model using MASS outperformed existing models, and that performance improved further when the relative position representation method was applied to MASS.
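The "relative position representation" here presumably refers to the formulation of Shaw et al. (2018), in which self-attention adds learned embeddings of the clipped offset between positions; a sketch of that standard formulation, not necessarily the paper's exact variant:

```latex
e_{ij} = \frac{(x_i W^Q)(x_j W^K + a_{ij}^K)^{\top}}{\sqrt{d_z}},
\qquad
\alpha_{ij} = \frac{\exp e_{ij}}{\sum_{k} \exp e_{ik}},
\qquad
z_i = \sum_{j} \alpha_{ij}\,(x_j W^V + a_{ij}^V),
```

where a_{ij}^K and a_{ij}^V are learned vectors indexed by the clipped relative distance clip(j - i, k), so attention depends on relative rather than absolute positions.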
Korean Text Summarization using MASS with Relative Position Representation
Youngjun Jung, Hyunsun Hwang, Changki Lee
http://doi.org/10.5626/JOK.2020.47.9.873
In language generation tasks, deep-learning models that generate natural language with a sequence-to-sequence model are being actively studied. In text summarization, where previously only the core sentences were extracted from a text, abstractive summarization is now under study. Recently, transfer learning, in which a model pre-trained on a large amount of monolingual data, such as BERT or MASS, is fine-tuned, has been mainly studied in natural language processing. In this paper, MASS was pre-trained for Korean language generation and then applied to Korean text summarization. Experiments showed that the Korean text summarization model using MASS achieved higher performance than existing models. Additionally, performance improved further when the relative position representation method was applied to MASS.
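MASS pre-trains a sequence-to-sequence model by masking a contiguous span on the encoder side and training the decoder to reconstruct exactly that span. A minimal sketch of building one such training pair follows; the 50% mask ratio and the mask token id are illustrative assumptions.

```python
import random

def mass_example(tokens, mask_ratio=0.5, mask_id=103):
    """Build one MASS-style pre-training pair.

    The encoder input is the token sequence with a contiguous span
    replaced by mask tokens; the decoder target is the masked span.
    """
    span_len = max(1, int(len(tokens) * mask_ratio))
    start = random.randrange(0, len(tokens) - span_len + 1)
    enc_input = tokens[:start] + [mask_id] * span_len + tokens[start + span_len:]
    dec_target = tokens[start:start + span_len]
    return enc_input, dec_target
```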