Search : [ keyword: Language Model ] (51)

Study on the Evaluation of Embedding Models in the Natural Language Processing

Hanhoon Kang

http://doi.org/10.5626/JOK.2025.52.2.141

This paper applies embedding techniques to key tasks in the field of Natural Language Processing (NLP), including semantic textual search, text classification, question answering, and clustering, and evaluates their performance. Recently, with the advancement of large-scale language models, embedding technologies have played a crucial role in various NLP applications. Several types of embedding models have been publicly released, and this paper assesses the performance of these models. For this evaluation, vector representations generated by embedding models were used as an intermediate step for each selected task. The experiments utilized publicly available Korean and English datasets, and five NLP tasks were defined. Notably, the BGE-M3 model, which demonstrated exceptional performance in multilingual, cross-lingual, and long-document retrieval tasks, was a key focus of this study. The experimental results show that the BGE-M3 model outperforms other models in three of the evaluated NLP tasks. The findings of this research are expected to provide guidance in selecting embedding models for identifying similar sentences or documents in recent Retrieval-Augmented Generation (RAG) applications.
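The retrieval step evaluated here reduces to ranking documents by vector similarity between embeddings. A minimal sketch in plain Python — the toy vectors stand in for embeddings a model such as BGE-M3 would produce, and all names are illustrative:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted by descending cosine similarity."""
    scores = [cosine(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: -scores[i])

# Toy example: document 1 points in the same direction as the query.
docs = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
query = [0.6, 0.8]
print(rank_documents(query, docs))  # [1, 2, 0]
```

In a RAG pipeline, the top-ranked indices would select the passages passed to the generator.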

An Experimental Study on the Text Generation Capability for Chart Image Descriptions in Korean SLLM

Hyojun An, Sungpil Choi

http://doi.org/10.5626/JOK.2025.52.2.132

This study explores the capability of Small Large Language Models (SLLMs) to automatically generate and interpret information from chart images. To achieve this goal, we built an instruction dataset for SLLM training by extracting text data from chart images and adding descriptive information. We conducted instruction tuning on a Korean SLLM and evaluated its ability to generate information from chart images. The experimental results demonstrated that the SLLM fine-tuned with the constructed instruction dataset was capable of generating descriptive text comparable to OpenAI's GPT-4o-mini API. This study suggests that, in the future, Korean SLLMs may be effectively used for generating descriptive text and providing information across a broader range of visual data.
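An instruction-tuning example of the kind described — chart text plus a descriptive target — might look as follows. The field names and content are illustrative assumptions, not the paper's actual schema:

```python
import json

# One hypothetical training record: text extracted from a chart image
# as the input, and a human-written description as the target output.
record = {
    "instruction": "Describe the following chart data in one sentence.",
    "input": "year,sales\n2021,120\n2022,180",
    "output": "Sales increased from 120 in 2021 to 180 in 2022.",
}

# Instruction datasets are commonly stored one JSON record per line.
line = json.dumps(record, ensure_ascii=False)
print(line)
```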

Political Bias in Large Language Models and its Implications on Downstream Tasks

Jeong yeon Seo, Sukmin Cho, Jong C. Park

http://doi.org/10.5626/JOK.2025.52.1.18

Warning: this paper contains examples of political bias that may be offensive. As the performance of Large Language Models (LLMs) improves, direct interaction with users becomes possible, raising ethical issues. In this study, we design two experiments to explore the diverse spectrum of political stances that an LLM exhibits and how these stances affect downstream tasks. We first define the inherent political stances of the LLM as the baseline and compare results from three different inputs (jailbreak, political persona, and jailbreak persona). The results of the experiments show that the political stances of the LLM changed the most under the jailbreak attack, while smaller changes were observed with the other two inputs. Moreover, an experiment involving downstream tasks demonstrated that the distribution of altered inherent political stances can affect the outcome of these tasks. These results suggest that the model generates responses that align more closely with its inherent stance than with the user's intention to personalize responses. We conclude that the intrinsic political bias of the model and its judgments should be explicitly communicated to users.

Chain-of-Thought and Chain-of-Verification Prompting for Grammar-based Test Case Generation

Aditi, Sang-Ki Ko

http://doi.org/10.5626/JOK.2025.52.1.29

Software testing is an essential but cost-intensive part of the software development process. Automatic test case generation tools distinguish between correct and incorrect solutions more effectively than manually written test cases. Many researchers have recently proposed deep learning-based methods to generate test cases automatically from logical specifications of problems or programs. In this work, we propose teaching large language models (LLMs) such as ChatGPT and Google Gemini to generate ‘test case grammars’ from problem specifications, in particular using chain-of-thought (CoT) prompting. Additionally, we have the LLMs verify the generated grammars by providing them with the details of the generalized rules, a technique termed “chain-of-verification” (CoVe). We further evaluate our method on the publicly available DeepMind CodeContests dataset, which consists of programming problems ranging from beginner to advanced level, each accompanied by test cases for verifying the correctness of submitted programs.
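A "test case grammar" of the kind the LLMs are asked to produce can be expanded into concrete test inputs by recursive sampling. A minimal sketch — the grammar below is a made-up example for a problem reading an integer n followed by n integers, not one generated by the paper's prompts:

```python
import random

# Toy grammar: nonterminals in angle brackets map to lists of productions.
GRAMMAR = {
    "<case>": [["<n>", "\n", "<elem>", " ", "<elem>", " ", "<elem>"]],
    "<n>": [["3"]],
    "<elem>": [[str(i)] for i in range(10)],
}

def generate(symbol="<case>", rng=random):
    """Expand a nonterminal by recursively sampling one production."""
    if symbol not in GRAMMAR:
        return symbol  # terminal: emit as-is
    production = rng.choice(GRAMMAR[symbol])
    return "".join(generate(s, rng) for s in production)

print(generate())  # e.g. "3\n7 0 4"
```

Each call yields a fresh random test case that is valid by construction, which is what makes grammars attractive as an intermediate representation for LLM-generated tests.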

Adversarial Training with Contrastive Learning in NLP

Daniela N. Rim, DongNyeong Heo, Heeyoul Choi

http://doi.org/10.5626/JOK.2025.52.1.52

Adversarial training has been extensively studied in natural language processing (NLP) to make models robust, so that semantically similar inputs yield similar outcomes. However, since language has no objective measure of semantic similarity, previous works use an external pre-trained NLP model to ensure this similarity, introducing an extra training stage with huge memory consumption. This work proposes adversarial training with contrastive learning (ATCL) to train a language processing model adversarially using the benefits of contrastive learning. The core idea is to apply linear perturbations in the embedding space of the input via the fast gradient method (FGM) and to train the model to keep the original and perturbed representations close via contrastive learning. We apply ATCL to language modeling and neural machine translation tasks, showing improvements in quantitative (perplexity and BLEU) scores. Furthermore, ATCL achieves good qualitative results at the semantic level for both tasks, as shown through simulation, without using a pre-trained model.
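The two ingredients of ATCL can be sketched numerically: an FGM-style perturbation that steps along the gradient direction, and an InfoNCE-style contrastive loss that keeps the original and perturbed representations close. This is a plain-Python simplification of the idea, not the paper's implementation; the ε and temperature values are illustrative:

```python
from math import exp, log, sqrt

def fgm_perturb(grad, eps=1.0):
    """FGM: step along the gradient direction, scaled to norm eps."""
    norm = sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0] * len(grad)
    return [eps * g / norm for g in grad]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def contrastive_loss(anchor, positive, negatives, temp=0.1):
    """InfoNCE-style loss: pull the perturbed view (positive) toward the
    original (anchor), push it away from other in-batch embeddings."""
    logits = [cosine(anchor, positive) / temp]
    logits += [cosine(anchor, n) / temp for n in negatives]
    return -logits[0] + log(sum(exp(l) for l in logits))
```

Minimizing this loss with the perturbed embedding as the positive is what removes the need for an external pre-trained model to define semantic similarity.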

BERT-based Two-Stage Classification Models and Co-Attention Mechanism for Diagnosing Dementia and Schizophrenia-related Disease

Min-Kyo Jung, Seung-Hoon Na, Ko Woon Kim, Byoung-Soo Shin, Young-Chul Chung

http://doi.org/10.5626/JOK.2022.49.12.1071

Noting the recently increasing number of patients, we present deep learning methods for automatically diagnosing dementia and schizophrenia, exploring a novel two-stage classification and a co-attention mechanism. First, the two-stage classification consists of two steps: a perplexity-based classification and a standard BERT-based classification. The perplexity-based classification prepares two BERTs, a control-specific and a patient-specific BERT, additionally pretrained on transcripts of controls and patients, respectively, and performs a simple threshold-based classification based on the difference between the perplexity values the two BERTs assign to an input test transcript. For ambiguous cases, where the perplexity difference alone does not provide sufficient evidence for classification, the standard BERT-based classification is performed with a fine-tuned BERT. Second, the co-attention mechanism enriches the BERT-based representations of a doctor’s transcript and a client’s transcript by applying cross-attention over them with a shared affinity matrix, and performs the classification based on the enriched co-attentive representations. Experimental results on a large-scale dataset of Korean transcripts show that the proposed two-stage classification outperforms the baseline BERT model on 4 out of 7 subtasks, and the use of the co-attention mechanism achieves the best F1 score on 4 out of 8 subtasks.
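The first stage can be pictured as a threshold rule on the perplexity gap between the two specialized BERTs, deferring ambiguous transcripts to the fine-tuned classifier. A minimal sketch — the margin value and function names are assumptions for illustration:

```python
import math

def perplexity(token_logprobs):
    """Perplexity of a transcript given per-token log-probabilities."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def two_stage_classify(ppl_control_bert, ppl_patient_bert, margin, fallback):
    """Stage 1: threshold the perplexity difference of the two BERTs.
    Stage 2: defer ambiguous cases to a fine-tuned BERT classifier."""
    diff = ppl_control_bert - ppl_patient_bert
    if diff > margin:
        return "patient"   # the patient-specific BERT fits much better
    if diff < -margin:
        return "control"   # the control-specific BERT fits much better
    return fallback()      # ambiguous: standard BERT-based classification

# Toy usage: a clearly patient-like transcript, then an ambiguous one.
print(two_stage_classify(120.0, 40.0, margin=20.0, fallback=lambda: "control"))
print(two_stage_classify(50.0, 45.0, margin=20.0, fallback=lambda: "control"))
```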

Solving Korean Math Word Problems Using the Graph and Tree Structure

Kwang Ho Bae, Sang Yeop Yeo, Yu Chul Jung

http://doi.org/10.5626/JOK.2022.49.11.972

Previous studies have made various efforts to solve math word problems written in English, and many achieved improved performance by introducing structures such as trees and graphs, going beyond Sequence-to-Sequence approaches. However, for math word problems written in Korean, no models have yet used such tree or graph structures. In this paper, we therefore examine the feasibility of solving Korean math word problems with models that combine the tree structure, the graph structure, and pre-trained Korean language models. Our experimental results showed that introducing the graph and tree structures improved accuracy by approximately 20% compared to a Seq2seq model, and that using a pre-trained Korean language model yielded a further accuracy improvement of 4.66%-5.96%.
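Tree-structured solvers typically decode the answer equation in prefix order, which the expression tree then evaluates. A minimal evaluator sketch — the token format is an assumption, not this paper's exact output representation:

```python
# Binary operators an expression-tree decoder might emit.
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
       "*": lambda a, b: a * b, "/": lambda a, b: a / b}

def eval_prefix(tokens):
    """Evaluate a prefix-order equation, e.g. ['+', '3', '*', '2', '4']
    meaning 3 + 2 * 4."""
    it = iter(tokens)
    def node():
        tok = next(it)
        if tok in OPS:
            left, right = node(), node()
            return OPS[tok](left, right)
        return float(tok)  # leaf: a number from the problem text
    return node()

print(eval_prefix(["+", "3", "*", "2", "4"]))  # 11.0
```

Prefix decoding is popular here because the tree structure makes operator precedence explicit, removing a whole class of errors that flat Seq2seq outputs can make.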

Recommendation Technique for Bug Fixers by Fine-tuning Language Models

Dae-Sung Wang, Hoon Seong, Chan-Gun Lee

http://doi.org/10.5626/JOK.2022.49.11.987

The scale and complexity of software continue to increase, contributing to the occurrence of diverse bugs and raising the need for systematic bug management. A few studies have proposed automating the assignment of bug fixers using word-based deep learning models, but their accuracy is unsatisfactory because the context of each word is ignored and the number of classes is excessive. In this paper, we improved top-10 accuracy by about 27 percentage points by fine-tuning pre-trained language models based on BERT, RoBERTa, DeBERTa, and CodeBERT. Experiments confirmed an accuracy of about 70%. These results show that fine-tuned pre-trained language models can be effectively applied to automated bug-fixer assignment.
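The reported top-10 accuracy counts a bug report as correct when the true fixer appears anywhere among the model's ten highest-scored developers. A minimal sketch of that metric — the scores below are toy values, not model outputs:

```python
def top_k_accuracy(score_lists, true_labels, k=10):
    """Fraction of bug reports whose true fixer is among the top-k
    developers ranked by the classifier's scores."""
    hits = 0
    for scores, label in zip(score_lists, true_labels):
        top_k = sorted(range(len(scores)), key=lambda i: -scores[i])[:k]
        hits += label in top_k
    return hits / len(true_labels)

# Two toy reports over four candidate developers.
scores = [[0.1, 0.7, 0.1, 0.1],   # true fixer 1 is ranked first
          [0.4, 0.3, 0.2, 0.1]]   # true fixer 3 is ranked last
print(top_k_accuracy(scores, [1, 3], k=2))  # 0.5
```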

Structuralized External Knowledge and Multi-task Learning for Knowledge Selection

Junhee Cho, Youngjoong Ko

http://doi.org/10.5626/JOK.2022.49.10.884

Typically, task-oriented dialog systems use well-structured knowledge, such as databases, to generate the most appropriate responses to users' questions. However, to generate more appropriate and fluent responses, external knowledge, i.e., unstructured text data such as web data or FAQs, is necessary. In this paper, we propose a novel multi-task learning method with a pre-trained language model and a graph neural network. The proposed method lets the system select external knowledge effectively by not only understanding linguistic information but also grasping the structural information latent in the external knowledge, which is converted into structured graph data using a dependency parser. Experimental results show that our proposed method achieves higher performance than traditional bi-encoder or cross-encoder methods that use pre-trained language models.

Entity Graph Based Dialogue State Tracking Model with Data Collection and Augmentation for Spoken Conversation

Haeun Yu, Youngjoong Ko

http://doi.org/10.5626/JOK.2022.49.10.891

As part of a task-oriented dialogue system, dialogue state tracking is the task of understanding the dialogue and extracting the user's needs in slot-value form. Recently, Dialog System Technology Challenge (DSTC) 10 Track 2 initiated a challenge to measure the robustness of dialogue state tracking models in a spoken conversation setting. The released evaluation dataset has three characteristics: a new multiple-value scenario, three times more entities, and utterances produced by an automatic speech recognition module. In this paper, to ensure robust performance, we introduce an extraction-based dialogue state tracking model with an entity graph, and we propose a data collection and template-based data augmentation method. Evaluation results show that our proposed method improves the performance of the extraction-based dialogue state tracking model by 1.7% in joint goal accuracy (JGA) and 0.57% in slot accuracy compared to the baseline model.
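Template-based augmentation of this kind can be pictured as filling slot placeholders in utterance templates with the newly collected entity names, yielding synthetic training pairs. The templates and slot names below are illustrative assumptions, not the paper's actual templates:

```python
# Hypothetical utterance templates with a {hotel} slot.
TEMPLATES = [
    "i would like to book a room at {hotel}",
    "can you tell me the address of {hotel}",
]

def augment(entities, templates=TEMPLATES):
    """Generate one synthetic utterance per (template, entity) pair,
    each paired with the slot-value label a DST model should extract."""
    examples = []
    for template in templates:
        for entity in entities:
            examples.append((template.format(hotel=entity),
                             {"hotel-name": entity}))
    return examples

for utterance, state in augment(["acorn guest house"]):
    print(utterance, "->", state)
```

Pairing each synthetic utterance with its known slot value gives the tracker supervised coverage of entities unseen in the original training data.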


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr