Character-based Subtitle Generation by Learning of Multimodal Concept Hierarchy from Cartoon Videos 


Vol. 42, No. 4, pp. 451-458, Apr. 2015



  Abstract

Previous multimodal learning methods focus on problem-solving tasks, such as image and video search and tagging, rather than on knowledge acquisition via content modeling. In this paper, we propose the Multimodal Concept Hierarchy (MuCH), a content modeling method trained on a cartoon video dataset, together with a method for generating character-based subtitles from the learned model. The MuCH model has a multimodal hypernetwork layer, which represents patterns of words and image patches, and a concept layer, in which each concept variable is represented by a probability distribution over words and image patches. Using a Bayesian learning method, the model learns the characteristics of the characters as concepts from the video subtitles and scene images, and it can generate character-based subtitles from the learned model when given a text query. In our experiment, the MuCH model learned concepts from 268 minutes of ‘Pororo’ cartoon videos and generated character-based subtitles. Finally, we compare the results with those of other multimodal learning models. The experimental results indicate that, given the same text query, our model generates more accurate and more character-specific subtitles than the other models.
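The idea of a concept variable represented as a probability distribution over words can be illustrated with a minimal sketch. This is not the MuCH model: it omits the image patches and the hypernetwork layer entirely, and it substitutes a simple Dirichlet-smoothed word distribution per character with Bayes' rule for inference. The subtitle lines, the function names `learn_concepts` and `character_posterior`, and the smoothing parameter `alpha` are all assumptions made for illustration.

```python
from collections import Counter
import math

# Hypothetical toy subtitles per character; NOT data from the paper.
SUBTITLES = {
    "pororo": ["let us play outside", "i want to fly"],
    "crong": ["crong crong", "crong want cookie"],
}

def learn_concepts(subtitles, alpha=1.0):
    """Estimate one word distribution per character (concept).

    Uses the posterior mean under a symmetric Dirichlet(alpha) prior,
    a stand-in for the paper's Bayesian learning method.
    """
    vocab = sorted({w for lines in subtitles.values()
                    for line in lines for w in line.split()})
    concepts = {}
    for name, lines in subtitles.items():
        counts = Counter(w for line in lines for w in line.split())
        total = sum(counts.values()) + alpha * len(vocab)
        concepts[name] = {w: (counts[w] + alpha) / total for w in vocab}
    return concepts

def character_posterior(concepts, query):
    """Return P(character | query words) with a uniform prior over characters.

    Words outside the learned vocabulary are skipped.
    """
    log_scores = {
        name: sum(math.log(dist[w]) for w in query.split() if w in dist)
        for name, dist in concepts.items()
    }
    norm = math.log(sum(math.exp(s) for s in log_scores.values()))
    return {name: math.exp(s - norm) for name, s in log_scores.items()}

concepts = learn_concepts(SUBTITLES)
posterior = character_posterior(concepts, "want cookie")
```

In this toy setting, a text query is scored against each character's word distribution, and the character with the highest posterior would be the one whose vocabulary a character-specific subtitle should draw from.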




  Cite this article

[IEEE Style]

K. Kim, J. Ha, B. Lee, B. Zhang, "Character-based Subtitle Generation by Learning of Multimodal Concept Hierarchy from Cartoon Videos," Journal of KIISE, JOK, vol. 42, no. 4, pp. 451-458, 2015.


[ACM Style]

Kyung-Min Kim, Jung-Woo Ha, Beom-Jin Lee, and Byoung-Tak Zhang. 2015. Character-based Subtitle Generation by Learning of Multimodal Concept Hierarchy from Cartoon Videos. Journal of KIISE, JOK, 42, 4, (2015), 451-458.


[KCI Style]

Kyung-Min Kim, Jung-Woo Ha, Beom-Jin Lee, Byoung-Tak Zhang, "Character-based Subtitle Generation by Learning of Multimodal Concept Hierarchy from Cartoon Videos," Journal of KIISE, Vol. 42, No. 4, pp. 451-458, 2015.









Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal
