Digital Library[ Search Result ]
Boosting Image Caption Generation with Parts of Speech
Philgoo Kang, Yubin Lim, Hyoungjoo Kim
http://doi.org/10.5626/JOK.2021.48.3.317
With the integration of smart devices and reliance on AI into our daily lives, the ability to generate image caption is becoming increasingly important in various fields such as guidance for visually-impaired individuals, human-computer interaction and so on. In this paper, we propose a novel approach based on parts of speech (POS), such as nouns and verbs extracted from image to enhance the image caption generation. The proposed model exploits multiple CNN encoders, which were specifically trained to identify features related to POS, and feed them into an LSTM decoder to generate image captions. We conducted experiments involving both Flickr30k and MS-COCO datasets using several text metrics and additional human surveys to validate the practical effectiveness of the proposed model.
Search

Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal
Editorial Office
- Tel. +82-2-588-9240
- Fax. +82-2-521-1352
- E-mail. chwoo@kiise.or.kr