Digital Library [Search Result]
An Experimental Study on the Text Generation Capability for Chart Image Descriptions in Korean SLLM
http://doi.org/10.5626/JOK.2025.52.2.132
This study explores the capability of Small Large Language Models (SLLMs) to automatically generate and interpret information from chart images. To this end, we built an instruction dataset for SLLM training by extracting text data from chart images and adding descriptive information. We then conducted instruction tuning on a Korean SLLM and evaluated its ability to generate information from chart images. The experimental results demonstrated that the SLLM fine-tuned on the constructed instruction dataset was capable of generating descriptive text comparable to that of OpenAI's GPT-4o-mini API. This study suggests that, in the future, Korean SLLMs may be used effectively to generate descriptive text and provide information for a broader range of visual data.
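As an illustration of how such an instruction dataset might be organized, the sketch below packs text extracted from a chart image together with a human-written description into a single training record. The field names, prompt wording, and sample chart values are assumptions for illustration, not details taken from the paper.

# A hypothetical sketch of one instruction-tuning record built from text
# extracted from a chart image plus a descriptive sentence. Field names,
# the prompt wording, and the sample chart values are illustrative only.
import json

def build_instruction_record(chart_text: dict, description: str) -> dict:
    """Pack extracted chart text and a written description into one example."""
    return {
        "instruction": "Describe the following chart in Korean based on its extracted data.",
        "input": json.dumps(chart_text, ensure_ascii=False),
        "output": description,
    }

record = build_instruction_record(
    chart_text={"title": "Monthly sales", "x": ["Jan", "Feb"], "y": [120, 150]},
    description="Sales rose from 120 in January to 150 in February.",
)
print(json.dumps(record, ensure_ascii=False, indent=2))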
A VQG Framework for Accurate and Diverse Question Generation
http://doi.org/10.5626/JOK.2025.52.1.62
Visual Question Generation (VQG) aims to generate questions based on a given image, often utilizing additional information such as answers or answer types. A VQG system should be able to generate diverse questions for a single image while maintaining relevance to the image and its additional information. However, models that focus heavily on relevance to the image may overfit to the dataset, leading to limited diversity, while those that emphasize diversity may generate questions that are less related to the input. Balancing these two aspects is therefore crucial in VQG. To address this challenge, we propose BCVQG (BLIP-CVAE VQG), a system that integrates a pre-trained vision-language model (BLIP) with a Conditional Variational AutoEncoder (CVAE). The effectiveness of the proposed method was validated through quantitative and qualitative evaluations on the VQA2.0 dataset.
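The sketch below illustrates the general CVAE mechanism such a system relies on: a latent code is inferred from the reference question and the conditioning features during training, and sampled from the prior at inference to encourage diverse questions. All module shapes and names are assumptions for illustration; this is not the BCVQG implementation.

# Toy CVAE head for question generation. The condition features would come
# from a vision-language encoder (e.g., image + answer type), and the decoder
# output would feed a question decoder. Dimensions are illustrative.
import torch
import torch.nn as nn

class QuestionCVAE(nn.Module):
    def __init__(self, feat_dim: int = 256, latent_dim: int = 32):
        super().__init__()
        self.latent_dim = latent_dim
        # Recognition network: infers the latent code from question + condition.
        self.posterior = nn.Linear(feat_dim * 2, latent_dim * 2)
        # Decoder projection: condition + latent code -> features for generation.
        self.decoder = nn.Linear(feat_dim + latent_dim, feat_dim)

    def forward(self, cond_feat, question_feat):
        mu, logvar = self.posterior(
            torch.cat([cond_feat, question_feat], dim=-1)
        ).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
        return self.decoder(torch.cat([cond_feat, z], dim=-1)), kl

    def sample(self, cond_feat):
        # At inference, draw z from the N(0, I) prior to obtain diverse questions.
        z = torch.randn(cond_feat.size(0), self.latent_dim)
        return self.decoder(torch.cat([cond_feat, z], dim=-1))

cond = torch.randn(4, 256)      # e.g., fused image and answer-type features
question = torch.randn(4, 256)  # encoded reference questions (training time)
decoded, kl = QuestionCVAE()(cond, question)
print(decoded.shape, kl.shape)  # torch.Size([4, 256]) torch.Size([4])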
Knowledge-based Supporting Facts Generation Model for Question and Answer
http://doi.org/10.5626/JOK.2023.50.11.940
In this study, we generate supporting facts from a knowledge base to add information to the question answering process and to present that information in a form that is easy for humans to read. Data related to the supporting documents in HotpotQA were collected by crawling two knowledge bases, DBpedia and Wikidata, and supporting-fact generators were trained on the collected triples. The answer generator was then trained with the generated supporting facts and questions as inputs. Regardless of whether DBpedia or Wikidata was used, the supporting facts generated from the knowledge base improved answer generation performance by providing useful additional information about the questions, and the generated sentences were understandable to humans.
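A minimal sketch of the overall flow is given below: knowledge-base triples are verbalized into supporting-fact sentences and concatenated with the question as input to an answer generator. The template, function names, and sample triple are assumptions for illustration; the paper trains neural generators for both steps rather than using a fixed template.

# Hypothetical verbalization of KB triples into supporting facts and
# construction of the answer-generator input. Illustrative only.
def verbalize_triple(subj: str, pred: str, obj: str) -> str:
    """Turn one (subject, predicate, object) triple into a readable sentence."""
    return f"{subj} {pred.replace('_', ' ')} {obj}."

def build_answer_input(question: str, triples: list) -> str:
    """Concatenate the question with verbalized supporting facts."""
    facts = " ".join(verbalize_triple(*t) for t in triples)
    return f"question: {question} supporting facts: {facts}"

triples = [("Seoul", "is_the_capital_of", "South Korea")]
print(build_answer_input("What is the capital of South Korea?", triples))
# question: What is the capital of South Korea? supporting facts: Seoul is the capital of South Korea.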
PrefixLM for Korean Text Summarization
Kun-Hui Lee, Seung-Hoon Na, Joon-Ho Lim, Tae-Hyeong Kim, Du-Seong Chang
http://doi.org/10.5626/JOK.2022.49.6.475
In this paper, we examine the effectiveness of PrefixLM, which consists of half of the parameters of T5's encoder-decoder architecture, for Korean text generation tasks. Unlike T5, where input and output sequences are provided separately, the transformer block of PrefixLM takes a single sequence that concatenates the input and output sequences. Through its attention mask, PrefixLM performs bi-directional attention on the input sequence and uni-directional attention on the output sequence, enabling a single transformer block to play the roles of both encoder and decoder. Experimental results on Korean abstractive document summarization show that PrefixLM improves Rouge-F1 by 2.17 and 2.78 points over BART and T5, respectively (more than 2 points in each case), implying that PrefixLM is promising for Korean text generation tasks.
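The attention pattern described above can be made concrete with a small sketch: prefix (input) positions attend to each other bidirectionally, while output positions attend causally to the prefix and to earlier output positions. The function below is an illustrative reconstruction of such a mask, not the paper's code.

# A minimal sketch of a PrefixLM attention mask: bidirectional over the
# prefix, causal over the output. True means "may attend".
import numpy as np

def prefix_lm_mask(prefix_len: int, output_len: int) -> np.ndarray:
    total = prefix_len + output_len
    mask = np.zeros((total, total), dtype=bool)
    # Prefix tokens: full bidirectional attention within the prefix.
    mask[:prefix_len, :prefix_len] = True
    # Output tokens: attend to the whole prefix plus earlier output tokens.
    for i in range(prefix_len, total):
        mask[i, : i + 1] = True
    return mask

print(prefix_lm_mask(3, 2).astype(int))
# [[1 1 1 0 0]
#  [1 1 1 0 0]
#  [1 1 1 0 0]
#  [1 1 1 1 0]
#  [1 1 1 1 1]]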