Digital Library [Search Results]
Enhancing LLM-based Zero-Shot Conversational Recommendation via Reasoning Path
Heejin Kook, Seongmin Park, Jongwuk Lee
http://doi.org/10.5626/JOK.2025.52.7.617
Conversational recommender systems provide personalized recommendations through bi-directional interaction with users. Traditional conversational recommender systems rely on external knowledge, such as knowledge graphs, to capture user preferences effectively. While the recent rapid advancement of large language models has enabled zero-shot recommendation, challenges remain in understanding users' implicit preferences and designing optimal reasoning paths. To address these limitations, this study investigates the importance of appropriate reasoning path construction in zero-shot conversational recommender systems and explores a new approach built on this foundation. The proposed framework consists of two stages: (1) comprehensively extracting both explicit and implicit preferences from the conversational context, and (2) constructing reasoning trees to select optimal reasoning paths based on these preferences. Experimental results on the INSPIRED and ReDial benchmark datasets show that the proposed method achieves up to an 11.77% improvement in Recall@10 over existing zero-shot methods and even outperforms some learning-based models.
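The two-stage procedure described in this abstract could be sketched roughly as follows (a minimal sketch, assuming a generic llm callable; the prompts, branching factor, and path-scoring step are illustrative and not taken from the paper):

# Rough sketch of the two-stage zero-shot pipeline; `llm` is any callable that
# maps a prompt string to a text completion. Prompts and scoring are illustrative.
from typing import Callable, List

def extract_preferences(llm: Callable[[str], str], dialogue: str) -> dict:
    """Stage 1: collect explicit and implicit preferences from the dialogue."""
    explicit = llm(f"List the preferences the user states explicitly:\n{dialogue}")
    implicit = llm(f"Infer preferences the user implies but never states:\n{dialogue}")
    return {"explicit": explicit, "implicit": implicit}

def _score(llm: Callable[[str], str], path: str) -> float:
    reply = llm(f"Rate this reasoning path from 0 to 1. Answer with a number only:\n{path}")
    try:
        return float(reply.strip())
    except ValueError:
        return 0.0

def recommend(llm: Callable[[str], str], dialogue: str, branches: int = 3) -> str:
    """Stage 2: branch several candidate reasoning paths and follow the best one."""
    prefs = extract_preferences(llm, dialogue)
    paths: List[str] = [
        llm(f"Preferences: {prefs}\nReason step by step toward ten recommendations "
            f"(candidate path {i + 1}).")
        for i in range(branches)
    ]
    best = max(paths, key=lambda p: _score(llm, p))
    return llm(f"Based on this reasoning, output the final ranked recommendation list:\n{best}")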
Aspect-Based Comparative Summarization with Large Language Model
http://doi.org/10.5626/JOK.2025.52.7.579
This paper proposes an aspect-based comparative summarization method that generates comparative summaries of two items from their reviews, aiming to assist users in making informed decisions. Given the reviews of two items, aspects are dynamically generated from each review using a large language model. To identify common aspects for comparison, the generated aspect lists of both items are merged. The review sentences of each item are classified into the most relevant aspects, and the summarization process removes redundant and unnecessary information. An abstractive summary is then generated for each common aspect to capture the overall content of the reviews. Experiments were conducted in the hotel, electronic device, and furniture domains, comparing human-written summaries with system-generated ones. The proposed method demonstrated superior summarization performance compared to existing comparison models.
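A minimal sketch of such a pipeline, assuming a generic llm callable and approximating the merged common aspects by intersecting the two generated aspect lists (the prompts are illustrative, not the paper's):

# Sketch only: aspect generation, sentence filtering, and per-aspect summarization
# are all delegated to an LLM callable supplied by the caller.
from typing import Callable, Dict, List

def generate_aspects(llm: Callable[[str], str], reviews: List[str]) -> set:
    raw = llm("List the product aspects mentioned in these reviews, comma-separated:\n"
              + "\n".join(reviews))
    return {a.strip().lower() for a in raw.split(",") if a.strip()}

def comparative_summaries(llm: Callable[[str], str],
                          reviews_a: List[str],
                          reviews_b: List[str]) -> Dict[str, str]:
    # Common aspects for comparison, approximated by the intersection of both lists.
    common = generate_aspects(llm, reviews_a) & generate_aspects(llm, reviews_b)
    summaries: Dict[str, str] = {}
    for aspect in common:
        # Keep only the sentences relevant to this aspect, then summarize both sides.
        rel_a = llm(f"Keep only sentences about '{aspect}':\n" + "\n".join(reviews_a))
        rel_b = llm(f"Keep only sentences about '{aspect}':\n" + "\n".join(reviews_b))
        summaries[aspect] = llm(
            f"Write a short comparative summary of item A versus item B on '{aspect}', "
            f"dropping redundant details.\nA: {rel_a}\nB: {rel_b}")
    return summaries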
Pretrained Large Language Model-based Drug-Target Binding Affinity Prediction for Mutated Proteins
Taeung Song, Jin Hyuk Kim, Hyeon Jun Park, Jonghwan Choi
http://doi.org/10.5626/JOK.2025.52.6.539
Drug development is a costly and time-consuming process. Accurately predicting the impact of protein mutations on drug-target binding affinity remains a major challenge. Previous studies have utilized long short-term memory (LSTM) and transformer models for amino acid sequence processing. However, LSTMs suffer from long-sequence dependency issues, while transformers face high computational costs. In contrast, pretrained large language models (pLLMs) excel at handling long sequences, yet prompt-based approaches alone are insufficient for accurate binding affinity prediction. This study proposes a method that leverages pLLMs to analyze protein structural data, transforms it into embedding vectors, and uses a separate machine learning model for numerical binding affinity prediction. Experimental results demonstrated that the proposed approach outperformed conventional LSTM and prompt-based methods, achieving a lower root mean square error (RMSE) and a higher Pearson correlation coefficient (PCC), particularly in mutation-specific predictions. Additionally, a performance analysis of pLLM quantization confirmed that the method maintained sufficient accuracy at reduced computational cost.
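A minimal sketch of the second stage, assuming the pLLM-derived embeddings are already computed upstream (random arrays stand in for them here, and the choice of regressor is illustrative):

# Separate regressor on top of precomputed embedding vectors, evaluated with
# RMSE and PCC as in the abstract.  All data below is synthetic.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 128))                        # stand-in for pLLM embeddings
y = X[:, 0] * 2.0 + rng.normal(scale=0.1, size=500)    # stand-in affinity values

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = GradientBoostingRegressor().fit(X_tr, y_tr)
pred = model.predict(X_te)

rmse = float(np.sqrt(np.mean((pred - y_te) ** 2)))     # lower is better
pcc, _ = pearsonr(pred, y_te)                          # higher is better
print(f"RMSE={rmse:.3f}  PCC={pcc:.3f}")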
Safety Evaluation of Large Language Models Using Risky Humor
JoEun Kang, GaYeon Jung, HanSaem Kim
http://doi.org/10.5626/JOK.2025.52.6.508
This study evaluated the safety of generative language models through the lens of Korean humor that includes socially risky content. Recently, concerns regarding the misuse of generative language models have intensified, as these models can generate plausible responses to inputs and prompts that deviate from social norms, ethical standards, and common sense. In this context, this study aimed to identify and mitigate potential risks associated with artificial intelligence (AI) by analyzing the risks inherent in humor and developing a benchmark for their evaluation. The socially risky humor examined here differs from conventional harmful content in that the playful, entertaining nature of humor can easily obscure unethical or risky elements; this closely resembles the subtle, indirect input patterns that are critical in AI safety assessments. In the experiment, model responses to requests involving unethical humor were classified as safe or unsafe, and each model's overall safety was then rated on a four-level scale. Using this benchmark, the study evaluated prominent generative language models, including GPT-4o, Gemini, and Claude, and found that they exhibit vulnerabilities in ethical judgment when confronted with risky humor.
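The evaluation protocol could be sketched as follows (a hypothetical illustration: the judge callable, the unsafe-rate thresholds, and the level semantics are assumptions, not the paper's benchmark):

# Each model response to a risky-humor prompt is labeled safe/unsafe, and the
# unsafe rate is mapped onto a coarse four-level safety grade.
from typing import Callable, List

def safety_grade(judge: Callable[[str], bool], responses: List[str]) -> str:
    """`judge` returns True when a response is unsafe (human or model annotator)."""
    unsafe_rate = sum(judge(r) for r in responses) / len(responses)
    if unsafe_rate < 0.05:
        return "Level 4 (safe)"
    if unsafe_rate < 0.20:
        return "Level 3"
    if unsafe_rate < 0.50:
        return "Level 2"
    return "Level 1 (unsafe)"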
Enhancing Retrieval-Augmented Generation Through Zero-Shot Sentence-Level Passage Refinement with LLMs
Taeho Hwang, Soyeong Jeong, Sukmin Cho, Jong C. Park
http://doi.org/10.5626/JOK.2025.52.4.304
This study presents a novel methodology designed to enhance the performance and effectiveness of Retrieval-Augmented Generation (RAG) by using Large Language Models (LLMs) to eliminate irrelevant content at the sentence level from retrieved documents. The approach refines passage content exclusively through LLMs, avoiding the need for additional training or data, with the goal of improving performance on knowledge-intensive tasks. The proposed method was tested in an open-domain question answering (QA) setting, where it effectively removed unnecessary content and outperformed traditional RAG methods. Overall, the approach proved effective at enhancing performance over conventional RAG techniques and improved RAG accuracy in a zero-shot setting without requiring additional training data.
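A minimal sketch of sentence-level refinement in a zero-shot RAG loop, assuming a generic llm callable (the prompt wording and the naive sentence splitting are illustrative):

# Keep only the sentences the LLM marks as needed for the question, then answer
# over the refined context; falls back to the full passage if nothing is kept.
from typing import Callable, List

def refine_passage(llm: Callable[[str], str], question: str, passage: str) -> str:
    sentences: List[str] = [s.strip() for s in passage.split(".") if s.strip()]
    numbered = "\n".join(f"{i}: {s}" for i, s in enumerate(sentences))
    reply = llm(
        f"Question: {question}\n"
        "Return the numbers of the sentences needed to answer it, comma-separated.\n"
        f"{numbered}")
    keep = {int(tok) for tok in reply.replace(",", " ").split() if tok.isdigit()}
    kept = [s for i, s in enumerate(sentences) if i in keep]
    return ". ".join(kept) + "." if kept else passage

def answer(llm: Callable[[str], str], question: str, passages: List[str]) -> str:
    context = "\n".join(refine_passage(llm, question, p) for p in passages)
    return llm(f"Context:\n{context}\n\nAnswer the question: {question}")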
SyllaBERT: A Syllable-Based Efficient Robust Transformer Model for Real-World Noise and Typographical Errors
Seongwan Park, Yumin Heo, Youngjoong Ko
http://doi.org/10.5626/JOK.2025.52.3.250
Training a Korean language model necessitates the development of a tokenizer specifically designed for the unique features of the Korean language, making this a crucial step in the modeling process. Most current language models utilize morpheme-based or subword-based tokenization. While these approaches work well with clean Korean text data, they are prone to out-of-vocabulary (OOV) issues due to abbreviations and neologisms frequently encountered in real-world Korean data. Moreover, actual Korean text often contains various typos and non-standard expressions, to which traditional morpheme-based or subword-based tokenizers are not sufficiently robust. To tackle these challenges, this paper introduces the SyllaBERT model, which employs syllable-level tokenization to effectively address the specific characteristics of Korean, even in noisy and non-standard contexts, with minimal resources. A compact syllable-level vocabulary was created, and a syllable-based language model was developed by reducing the embedding and hidden layer sizes of existing models. Experimental results show that, despite having approximately four times fewer parameters than subword-based models, the SyllaBERT model outperforms them in natural language understanding tasks on real-world conversational Korean data that includes noise.
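A minimal sketch of syllable-level tokenization for Korean (vocabulary construction is omitted, and non-Hangul characters are simply kept character by character):

# Every precomposed Hangul syllable block (U+AC00..U+D7A3) becomes its own token,
# so abbreviations, neologisms, and typos never fall out of vocabulary.
def syllable_tokenize(text: str) -> list:
    tokens = []
    for ch in text:
        if "\uac00" <= ch <= "\ud7a3":   # precomposed Hangul syllables
            tokens.append(ch)
        elif not ch.isspace():
            tokens.append(ch)            # digits, Latin letters, punctuation as-is
    return tokens

print(syllable_tokenize("한국어 모델 test 123"))
# -> ['한', '국', '어', '모', '델', 't', 'e', 's', 't', '1', '2', '3']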
Explainable Supporting Facts Generation via Query-Focused Multi-Document Summarization for Open Domain Question Answering Model
http://doi.org/10.5626/JOK.2024.51.11.1020
"Open domain question answering system requires external knowledge not satisfied by knowledge inherent in the language model to answer a given query. It is a technology that is being studied importantly for solving the hallucination problem that occurs in recent large language models. In this paper, we propose a model that utilizes structural information of Query-attentive Semantic Graph (QSG) to summarize information between distant documents based on a query and utilize it as supporting factors for a multi-document-based question answering system. Query-based supporting factors generated by summarizing can improve answer generation performance and show better explainability than extracted supporting factors."
KULLM: Learning to Construct Korean Instruction-Following Large Language Models
Seungjun Lee, Yoonna Jang, Jeongwook Kim, Taemin Lee, Heuiseok Lim
http://doi.org/10.5626/JOK.2024.51.9.817
The emergence of Large Language Models (LLMs) has revolutionized the research paradigm in natural language processing. While instruction-tuning techniques have been pivotal in enhancing LLM performance, the majority of current research has focused predominantly on English. This study addresses the need for multilingual approaches by presenting a method for developing and evaluating Korean instruction-following models. We fine-tuned LLM models using Korean instruction datasets and conducted a comprehensive performance analysis using various dataset combinations. The resulting Korean instruction-following model is made available as an open-source resource, contributing to the advancement of Korean LLM research. Our work aims to bridge the language gap in LLM development and promote more inclusive AI technologies.
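Instruction tuning typically starts by rendering each example into a single training string; the template below is a common Alpaca-style layout adapted to Korean and is an assumption for illustration, not the exact KULLM recipe:

# Format one (instruction, input, output) triple into a training string.
def format_example(instruction: str, inp: str, output: str) -> str:
    prompt = f"### 명령어:\n{instruction}\n"          # instruction
    if inp:
        prompt += f"### 입력:\n{inp}\n"               # optional input
    return prompt + f"### 응답:\n{output}"            # target response

print(format_example("다음 문장을 영어로 번역하세요.", "안녕하세요.", "Hello."))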
SCA: Improving Document Grounded Response Generation based on Supervised Cross-Attention
Hyeongjun Choi, Seung-Hoon Na, Beomseok Hong, Youngsub Han, Byoung-Ki Jeon
http://doi.org/10.5626/JOK.2024.51.4.326
Document-grounded response generation is the task of generating conversational responses by “grounding” them in factual evidence from a task-specific domain, such as consumer consultation or insurance planning, where the evidence is obtained from documents retrieved in response to a user’s question under the current dialogue context. In this study, we propose supervised cross-attention (SCA) to enhance the response generation model’s ability to find and incorporate “response-salient snippets” (i.e., spans or contents), the parts of the retrieved document that should be included and maintained in the generated answer. SCA adds a supervised loss that concentrates cross-attention weights on the response-salient snippets; this attention supervision enables the decoder to generate responses in a “saliency-grounding” manner by attending strongly to the important parts of the retrieved document. Experimental results on MultiDoc2Dial show that SCA, combined with additional performance improvement methods, increases F1 by 1.13 over the existing SOTA, with SCA alone contributing an increase of 0.25 in F1.
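The auxiliary loss could look roughly like the following sketch (the tensor shapes and exact loss form are assumptions for illustration, not the paper's formulation):

# Push decoder cross-attention mass toward a 0/1 mask marking response-salient
# snippet tokens in the retrieved document; added to the generation loss.
import torch

def sca_loss(cross_attn: torch.Tensor, salient_mask: torch.Tensor) -> torch.Tensor:
    """
    cross_attn:   (batch, tgt_len, src_len) attention weights, rows sum to 1.
    salient_mask: (batch, src_len) float mask, 1.0 inside response-salient snippets.
    """
    # Attention mass each target step places on salient source tokens.
    mass_on_salient = (cross_attn * salient_mask.unsqueeze(1)).sum(dim=-1)
    # Encourage that mass toward 1, i.e., attend mostly to the salient spans.
    return (1.0 - mass_on_salient).clamp(min=0).mean()

# total_loss = generation_loss + lambda_sca * sca_loss(attn, mask)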
Voice Phishing Detection Scheme Using a GPT-3.5-based Large Language Model
http://doi.org/10.5626/JOK.2024.51.1.67
In this paper, we introduce a novel approach to voice phishing call detection using text-davinci-003, a recently updated model in the generative pre-trained transformer (GPT)-3.5 language model series. To achieve this, we devised a prompt that asks the language model to respond with an integer from 0 to 10 indicating the likelihood that a given conversation is a voice phishing attempt. For prompt tuning, hyperparameter adjustment, and performance validation, we use a dataset of 105 actual Korean voice phishing transcripts and 704 transcripts of general conversations on various topics. The proposed scheme includes a function that issues a voice phishing alert during a call and a function that makes a final determination of whether the call was voice phishing after it ends. Performance is evaluated in five scenarios using different combinations of training and test data, and the proposed technique achieves an accuracy of 0.95 to 0.97. In particular, when tested with data from sources different from those used in training, the proposed scheme performs better than existing schemes based on bidirectional encoder representations from transformers (BERT).
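A minimal sketch of the 0-to-10 scoring scheme, assuming a generic llm callable (the prompt text and the alert threshold are illustrative, not the paper's tuned prompt):

# Ask the model for an integer likelihood score and raise a mid-call alert when
# the score crosses a threshold.
import re
from typing import Callable

def phishing_score(llm: Callable[[str], str], transcript: str) -> int:
    reply = llm(
        "On a scale from 0 to 10, how likely is the following phone conversation "
        "to be a voice phishing attempt? Answer with a single integer.\n\n"
        + transcript)
    match = re.search(r"\d+", reply)
    return min(int(match.group()), 10) if match else 0

def should_alert(llm: Callable[[str], str], partial_transcript: str, threshold: int = 7) -> bool:
    """Mid-call alarm: warn the user once the score reaches the threshold."""
    return phishing_score(llm, partial_transcript) >= threshold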