Search : [ author: WonKee Lee ] (4)

Movie Summarization Based on Emotion Dynamics and Multimodal Information

Myungji Lee, Hongseok Kwon, WonKee Lee, Jong-Hyeok Lee

http://doi.org/10.5626/JOK.2022.49.9.735

Movie summarization is the task of summarizing a full-length movie by creating a short video summary containing its most informative scenes. This paper proposes an automatic movie summarization model that comprehensively considers three main elements of a movie: its characters, plot, and video information. To accurately identify major events in the movie plot, we propose a Transformer-based architecture that uses the movie script's dialogue information and the main characters' emotion dynamics as training features, and then combines the script and video information. Experiments show that the proposed method helps increase the accuracy of identifying major events in movies and consequently improves the quality of movie summaries.
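The idea of combining dialogue salience with emotion dynamics can be illustrated with a toy scene-selection sketch. This is not the paper's model (which is Transformer-based and multimodal); all function names, weights, and the scoring scheme below are our own illustrative assumptions.

```python
# Hypothetical sketch, NOT the paper's architecture: score each scene by
# mixing a dialogue-salience score with the magnitude of the main
# characters' emotion change, then keep the top-k scenes as the summary.

def emotion_dynamics(emotions):
    """Absolute emotion change between consecutive scenes."""
    return [abs(b - a) for a, b in zip(emotions, emotions[1:])]

def summarize(scenes, k=2, alpha=0.5):
    """scenes: list of (scene_id, dialogue_salience, emotion_value)."""
    values = [e for _, _, e in scenes]
    deltas = [0.0] + emotion_dynamics(values)  # first scene has no prior change
    scored = [
        (sid, alpha * sal + (1 - alpha) * d)
        for (sid, sal, _), d in zip(scenes, deltas)
    ]
    # keep the k highest-scoring scenes, presented in original order
    top = sorted(sorted(scored, key=lambda x: -x[1])[:k])
    return [sid for sid, _ in top]

scenes = [(0, 0.2, 0.1), (1, 0.9, 0.8), (2, 0.3, 0.2), (3, 0.6, 0.9)]
print(summarize(scenes, k=2))  # → [1, 3]
```

In this toy example, scenes 1 and 3 win because they combine salient dialogue with large emotional swings; the paper's contribution is learning such signals jointly from script and video rather than hand-weighting them.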

Quality Estimation of Machine Translation using Dual-Encoder Architecture

Dam Heo, Wonkee Lee, Jong-Hyeok Lee

http://doi.org/10.5626/JOK.2022.49.7.521

Quality estimation (QE) is the task of estimating the quality of a given machine translation (MT) without a reference translation. A recent research trend in QE is to apply transfer learning to a Transformer-encoder-based pre-trained model using a parallel corpus. In this paper, we propose a dual-encoder architecture in which each encoder learns a monolingual representation of its respective language; cross-attention networks then learn a cross-lingual representation of each language. This overcomes the limitations of a single-encoder architecture in cross-lingual tasks such as QE. We show that the dual-encoder architecture is structurally more advantageous than the single-encoder architecture, and we further improve the performance and stability of the dual-encoder model in QE by applying pre-trained language models to it. Experiments were conducted on the WMT20 QE data for the En-De pair. Our model employs English BERT (Bidirectional Encoder Representations from Transformers) and German BERT as the pre-trained models for the two encoders and achieves the best performance.
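The dual-encoder idea, two monolingual encoders joined by cross-attention, can be sketched with plain NumPy. The dimensions, names, and random encoder outputs below are illustrative assumptions, not the paper's implementation; in the paper each encoder is a pre-trained BERT.

```python
import numpy as np

# Hypothetical sketch of the dual-encoder layout (our names, not the paper's):
# each encoder produces a monolingual representation independently; a
# cross-attention layer then lets each MT position attend over source positions
# to build a cross-lingual representation.

def attention(q, k, v):
    """Scaled dot-product attention with a row-wise softmax."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
d = 8
src = rng.normal(size=(5, d))  # stand-in for source-encoder output (e.g. English BERT)
mt = rng.normal(size=(6, d))   # stand-in for MT-encoder output (e.g. German BERT)

# cross-lingual representation: each MT token attends over the source tokens
cross = attention(mt, src, src)
print(cross.shape)  # → (6, 8)
```

The structural advantage claimed in the abstract is that each encoder can be initialized from a monolingual pre-trained model in its own language, while the cross-attention networks alone carry the cross-lingual burden.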

Alleviation of Generic Responses by Adjusting N-gram Usage in Neural Chit-chat Dialogue Systems

JaeYoung Oh, WonKee Lee, Jeesoo Bang, Jaehun Shin, Jong-Hyeok Lee

http://doi.org/10.5626/JOK.2022.49.1.60

Chit-chat dialogue systems, systems for unstructured conversation between humans and computers, aim to generate meaningful and diverse responses. However, training based on maximum likelihood estimation has been reported to make the model generate too many generic responses, reducing interest in these systems. Recently, a training method using unlikelihood training was proposed to generate diverse responses by penalizing the overuse of each vocabulary token. However, it considers only how often a token is used when penalizing it, not the context in which the token is used. We therefore extend this work and propose penalizing the overuse of each n-gram, which has the advantage of using the surrounding context within the n-gram when penalizing each token.
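An unlikelihood-style penalty over n-grams can be illustrated in a few lines. This is a minimal sketch under our own assumptions (the exact loss formulation, candidate sets, and counting scheme in the paper may differ): a token pays a `-log(1 - p)` penalty when the n-gram it completes has already been generated.

```python
import math
from collections import Counter

# Hypothetical illustration (ours, not the paper's code): an unlikelihood-style
# penalty applied to n-grams rather than single tokens, so a token is penalized
# only when it repeats a previously generated context + token pattern.

def ngram_unlikelihood(tokens, probs, n=2):
    """tokens: generated tokens; probs: model probability of each token.
    Adds -log(1 - p) for each token completing an already-seen n-gram."""
    seen = Counter()
    loss = 0.0
    for i, (tok, p) in enumerate(zip(tokens, probs)):
        if i >= n - 1:
            ngram = tuple(tokens[i - n + 1 : i + 1])
            if seen[ngram] > 0:  # this n-gram was generated before
                loss += -math.log(1.0 - p)
            seen[ngram] += 1
    return loss

tokens = ["i", "do", "not", "know", "i", "do", "not", "know"]
probs = [0.3, 0.6, 0.7, 0.8, 0.3, 0.6, 0.7, 0.8]
print(round(ngram_unlikelihood(tokens, probs, n=2), 3))  # → 3.73
```

Only the second occurrence of the generic phrase is penalized, and the penalty depends on the bigram context, which is exactly the contextual information a token-level penalty discards.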

Quality Estimation of English-Korean Machine Translation using Neural Network based Predictor-Estimator Model

Hyun Kim, Jaehun Shin, Wonkee Lee, Seungwoo Cho, Jong-Hyeok Lee

http://doi.org/10.5626/JOK.2018.45.6.545

Quality Estimation (QE) for machine translation is an automatic method for estimating the quality of machine translation output without the need for reference translations. QE has recently grown in importance in the field of machine translation (MT). Recent studies on QE have mainly focused on European languages, whereas fewer studies have been carried out on QE for Korean. In this paper, we create a new QE dataset for English-to-Korean translation and apply a neural network based Predictor-Estimator model to an English-Korean QE task. Creating a QE dataset requires manually post-edited translations of the MT outputs. Because Korean is a free-word-order language that allows various writing styles for translation, we provide guidelines for creating manually post-edited Korean translations for the English-Korean QE data. We also alleviate the imbalanced-data problem of the QE data. Finally, this paper reports our experimental results on the English-Korean QE task using a Predictor-Estimator model trained on the created English-Korean QE data.
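The two-stage Predictor-Estimator layout can be sketched as follows. All names, dimensions, and the random stand-in features are our own assumptions for illustration; in the actual model the predictor is a word-prediction network trained on parallel data and the estimator is a neural regressor trained on QE data.

```python
import numpy as np

# Hypothetical two-stage sketch (names and sizes are ours, not the paper's):
# a "predictor" emits one quality feature vector per MT word; the "estimator"
# pools those vectors and regresses a sentence-level score (e.g. HTER-like).

rng = np.random.default_rng(42)

def predictor(mt_words, dim=4):
    """Stand-in for word-prediction features; one random vector per word."""
    return rng.normal(size=(len(mt_words), dim))

def estimator(features, w, b):
    """Mean-pool the word features, then a linear layer squashed into (0, 1)."""
    pooled = features.mean(axis=0)
    return 1.0 / (1.0 + np.exp(-(pooled @ w + b)))

w = rng.normal(size=4)
score = estimator(predictor(["이", "문장은", "기계", "번역이다"]), w, b=0.0)
print(0.0 <= score <= 1.0)  # → True: one quality estimate per sentence
```

The appeal of this split is that the predictor can be trained on abundant parallel data, while only the small estimator needs the scarce, manually post-edited QE data, which matters for a new dataset like the English-Korean one introduced here.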


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr