Journal of KIISE

Search : [ author: 김진 ] (25)

Drug development is a costly and time-consuming process, and accurately predicting the impact of protein mutations on drug-target binding affinity remains a major challenge. Previous studies have used long short-term memory (LSTM) and transformer models to process amino acid sequences. However, LSTMs struggle with long-range dependencies in long sequences, while transformers incur high computational costs. In contrast, pretrained large language models (pLLMs) handle long sequences well, yet prompt-based approaches alone are insufficient for accurate binding affinity prediction. This study proposes a method that uses pLLMs to analyze protein structural data, transforms it into embedding vectors, and uses a separate machine learning model for numerical binding affinity prediction. Experimental results show that the proposed approach outperforms conventional LSTM and prompt-based methods, achieving a lower root mean square error (RMSE) and a higher Pearson correlation coefficient (PCC), particularly in mutation-specific predictions. Additionally, a performance analysis of pLLM quantization confirms that the method maintains sufficient accuracy at reduced computational cost.
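The two evaluation metrics named in this abstract, RMSE and PCC, can be illustrated with a minimal sketch. The function names and the sample values in the usage note are hypothetical and are not taken from the paper; only the standard definitions of the two metrics are assumed.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted affinities."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_t * var_p)
```

For example, `rmse(measured, predicted)` and `pearson(measured, predicted)` could be called on two lists of affinity values for the same drug-target pairs. A lower RMSE means smaller absolute errors, while a higher PCC means the predictions track the ranking and trend of the measurements more closely.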

SPI: Similar Patch Identifier for Automated Program Repair

Sechang Jang, Seongbin Kim, Junhyeok Choi, Jindae Kim, Jaechang Nam

http://doi.org/10.5626/JOK.2025.52.2.152

The primary challenge in Automated Program Repair (APR) techniques is the size of the search space. In this study, we introduce a novel approach called Similar Patch Identifier (SPI), which reduces the search space by leveraging the similarities among bug-introducing changes and suggesting suitable repair operators. We evaluated this approach using the existing context-based APR tool, ConFix, and the Java defect benchmark, Defects4J. Our experiments revealed that, although SPI narrowed the search space to 10 candidate bug-fixing commits for each defect, it successfully generated meaningful patches for four bugs that ConFix was unable to repair.

EnhPred: Deep Learning Model for Precise Prediction of Enhancer Positions

Jinseok Kim, Suyeon Wy, Jaebum Kim

http://doi.org/10.5626/JOK.2025.52.1.35

Enhancers are crucial regulatory elements that control gene expression in living organisms. Therefore, enhancer prediction is essential for a deeper understanding of gene regulation mechanisms. However, precise enhancer prediction is challenging because enhancers vary in length and can lie far from their target genes. Existing artificial intelligence-based enhancer prediction methods often predict enhancers without accurately identifying their boundaries. In this study, we developed a new deep learning-based enhancer prediction method called EnhPred, which consists of Convolutional Neural Networks (CNN) and bidirectional Gated Recurrent Units (GRU). To predict enhancer regions at high resolution, we designed EnhPred to predict the probability of enhancer presence within narrow, segmented genomic regions. When compared with existing machine learning- and deep learning-based methods using data from three human cell lines, EnhPred demonstrated superior performance in terms of both the accuracy of enhancer prediction and the resolution of enhancer boundaries.
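The idea of predicting enhancer probabilities per narrow genomic segment can be illustrated by one possible post-processing step: turning per-segment probabilities into region boundaries. The sketch below is an assumption for illustration only. The abstract does not describe how EnhPred derives boundaries, and the function name, the segment size, and the 0.5 threshold are all hypothetical.

```python
def merge_enhancer_bins(probs, bin_size, threshold=0.5, start=0):
    """Merge consecutive bins whose probability meets the threshold.

    probs     : per-bin enhancer probabilities, in genomic order
    bin_size  : width of each bin in base pairs (hypothetical value)
    threshold : probability cutoff (hypothetical value)
    start     : genomic coordinate of the first bin
    Returns a list of half-open (region_start, region_end) intervals.
    """
    regions = []
    region_start = None  # index of the first bin in the current region
    for i, p in enumerate(probs):
        if p >= threshold:
            if region_start is None:
                region_start = i
        elif region_start is not None:
            # The current region ends at the left edge of this bin.
            regions.append((start + region_start * bin_size, start + i * bin_size))
            region_start = None
    if region_start is not None:
        # Close a region that runs to the end of the input.
        regions.append((start + region_start * bin_size, start + len(probs) * bin_size))
    return regions
```

With this kind of scheme, the boundary resolution is limited by the bin width, which is one reason narrow segments help when boundaries matter.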

Task-Oriented Dialogue System Using a Fusion Module between Knowledge Graphs

Jinyoung Kim, Hyunmook Cha, Youngjoong Ko

http://doi.org/10.5626/JOK.2024.51.10.882

The field of Task-Oriented Dialogue Systems focuses on using natural language processing to help users accomplish specific tasks through conversation. Recently, transformer-based pre-trained language models have been employed to enhance the performance of task-oriented dialogue systems. This paper proposes a response generation model based on Graph Attention Networks (GAT) that integrates external knowledge data into transformer-based language models to produce more specialized responses in dialogue systems. Additionally, we extend this research to leverage information from more than two graphs. We also collected and refined dialogue data based on a music-domain knowledge base to evaluate the proposed model. The collected dialogue dataset consists of 2,076 dialogues and 226,823 triples. In experiments on the proposed dialogue dataset, the proposed model improved on the baseline KoBART model by 13.83 percentage points in ROUGE-1, 8.26 percentage points in ROUGE-2, and 13.5 percentage points in ROUGE-L.
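The ROUGE-1 and ROUGE-2 scores reported above measure n-gram overlap between a generated response and a reference response. The following is a simplified sketch of that idea, assuming whitespace tokenization and an F1-style score. The paper's actual ROUGE implementation and tokenizer are not stated in the abstract, and the function name is hypothetical.

```python
from collections import Counter


def rouge_n_f1(reference, candidate, n=1):
    """Simplified ROUGE-N F1 using whitespace tokens (illustrative only)."""

    def ngrams(text):
        tokens = text.split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref_counts = ngrams(reference)
    cand_counts = ngrams(candidate)
    ref_total = sum(ref_counts.values())
    cand_total = sum(cand_counts.values())
    if ref_total == 0 or cand_total == 0:
        return 0.0
    # Clipped overlap: each n-gram counts at most as often as in the reference.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    recall = overlap / ref_total
    precision = overlap / cand_total
    return 2 * precision * recall / (precision + recall)
```

A gain stated in percentage points is a difference between two such scores expressed as percentages, for instance a rise from 30% to 40% is 10 percentage points. Those two figures are illustrative and are not results from the paper.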

Managing DISCARD Commands in F2FS File System for Improving Lifespan and Performance of SSD Devices

Jinwoong Kim, Donghyun Kang, Young Ik Eom

http://doi.org/10.5626/JOK.2024.51.8.669

The DISCARD command is an interface that helps improve the lifespan and performance of SSDs by informing the SSD about invalid file system blocks. However, in the F2FS file system, the DISCARD command is sent to the SSD only during idle time, which limits the potential for improving lifespan and performance. In this paper, we propose an EPD scheme that efficiently transfers DISCARD commands during short idle times, as well as a segment allocation scheme called PSA, which replaces DISCARD commands with overwrite commands. To evaluate the effectiveness of the proposed schemes, we conducted several experiments using various workloads to verify the lifespan and performance of real SSD devices. The results showed that the proposed schemes can improve the write amplification factor (WAF) by up to 40% and throughput by up to 160% compared to the traditional F2FS file system.
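The write amplification factor mentioned above is conventionally defined as the amount of data physically written to flash divided by the amount of data the host asked to write. The sketch below illustrates that definition and how a relative improvement could be computed. The function names and the byte counts in the usage note are hypothetical, and reading "improve WAF" as a relative reduction is an assumption.

```python
def write_amplification_factor(nand_bytes_written, host_bytes_written):
    """WAF = bytes written to flash / bytes written by the host."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return nand_bytes_written / host_bytes_written


def relative_reduction(baseline, improved):
    """Fractional reduction of a lower-is-better metric versus a baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline
```

As an illustration with made-up numbers, a device that writes 250 units to flash for 100 units of host writes has a WAF of 2.5, and lowering that to 1.5 would be a 40% reduction. A lower WAF means fewer flash program and erase cycles per host write, which is why it relates to SSD lifespan.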

A Survey of Advantages of Self-Supervised Learning Models in Visual Recognition Tasks

Euihyun Yoon, Hyunjong Lee, Donggeon Kim, Joochan Park, Jinkyu Kim, Jaekoo Lee

http://doi.org/10.5626/JOK.2024.51.7.609

Recently, supervised learning-based artificial intelligence (AI) has been advancing rapidly. However, supervised learning relies on labeled datasets, and obtaining those labels can be costly. To address this issue, self-supervised learning, which can learn general image features without labels, is being actively researched. In this paper, we classify various self-supervised learning models by learning method and backbone network, and then compare and analyze their strengths, weaknesses, and performance. Image classification tasks were used for the performance comparison, and fine-grained prediction tasks were used to compare transfer learning performance. As a result, models that used only positive pairs achieved higher performance, by minimizing noise, than models that used both positive and negative pairs. Furthermore, for fine-grained predictions, methods that learn from masked images or use multi-stage models achieved higher performance by additionally learning regional information.

Graph Structure Learning-Based Neural Network for ETF Price Movement Prediction

Hyeonsoo Jo, Jin-gee Kim, Taehun Kim, Kijung Shin

http://doi.org/10.5626/JOK.2024.51.5.473

Exchange-Traded Funds (ETFs) are index funds that mirror particular market indices and are generally attractive to individual investors for their low risk and low expense ratios. Various methods have emerged for accurately predicting ETF price movements, and AI-based techniques have recently been developed. One representative method uses time-series-based neural networks to predict the price movement of ETFs. This approach effectively incorporates the past price information of each ETF when predicting its movement. However, it is limited in that it uses only the historical information of individual ETFs and does not account for the relationships and interactions between different ETFs. To address this issue, we propose a model that captures relationships between ETFs. The proposed model uses graph structure learning to infer a graph representing these relationships, and a graph neural network then predicts ETF price movement based on the inferred graph. The proposed model demonstrates superior performance compared to time-series-based deep learning models that use only individual ETF information.

CSDVirt: An Emulator for Computational Storage Device

Ilkueon Kang, Jaehoon Shim, Jin-Soo Kim

http://doi.org/10.5626/JOK.2024.51.1.1

Since the concept of the Computational Storage Device (CSD) was proposed, various forms of CSDs have been presented in both academia and industry. The standardization of CSD interfaces is under way but is still at a very early stage. As a result, existing CSD proposals lack uniformity in their interfaces and internal device architectures, which has made CSD research require significant engineering effort. In this paper, we propose CSDVirt to facilitate CSD research by providing an environment similar to actual devices. CSDVirt is an emulator that offers CSD functionalities using NVMeVirt. With CSDVirt, the characteristics of various workloads on CSDs can be evaluated easily.

Prediction of Toothbrushing Position Based on Gyro Sensor Data and its Validation Using Unsupervised Learning-based Clustering

DoYoon Kim, MinWook Kwon, SeungJu Baek, HyeRin Yoon, DaeYeon Lim, Eunah Jo, Seungjae Ryu, Young Wook Kim, Jin Hyun Kim

http://doi.org/10.5626/JOK.2023.50.12.1143

Oral health is an important health indicator that is directly related to longevity. For this reason, it has become a key component of public health for all ages, from infants to the elderly. The foundation of good oral health is good brushing habits. However, the recommended brushing method is not easy to adopt, which harms oral health. To help track correct brushing, this paper proposes a method that determines the brushing area from toothbrush posture alone, using only the gyro sensor of a low-cost IMU. We evaluated the accuracy of the brushing zone estimation method using clustering algorithms in machine learning. We showed that relatively inexpensive 6-axis IMU gyro sensor data could be used to estimate the user’s brushing area with an accuracy of 80.6%. In addition, we applied a clustering algorithm to these data and trained a logistic regression model on the clustered data to estimate the brushing area. This yielded an accuracy of 86.7%, showing that both the clustering and the proposed toothbrush posture-based brushing area estimation were effective. In conclusion, we expect that the brushing zone estimation algorithm can be implemented as a function of a relatively low-cost toothbrush and can help maintain oral health by analyzing and improving personal brushing habits.

Change Description Difference Analysis between Human and Code Differencing Techniques

Moojun Kim, Beomcheol Kim, Jindae Kim

http://doi.org/10.5626/JOK.2023.50.2.150

This study investigated the differences between descriptions of code changes produced by source code differencing tools and those produced by humans. We applied two popular source code differencing techniques to a set of collected code changes. We found that these tools often generated different descriptions for the same changes, and only 3% of the changes had the same description from both tools. In contrast, human participants agreed on the change descriptions for 50% of the given changes. Furthermore, many of the differing descriptions were caused by simple mistakes. When differences caused by these mistakes were ignored, human participants described 71% of the changes similarly. We also compared the change types and entity types of the edit scripts generated by humans and by the source code differencing techniques for the same changes. We found that the techniques generated the same description as humans for only 8.20% to 35.65% of the changes, which indicates that these techniques require significant improvement to provide descriptions similar to those of humans.


Journal of KIISE

ISSN : 2383-630X(Print)
ISSN : 2383-6296(Electronic)
KCI Accredited Journal

Editorial Office

Tel. +82-2-588-9240
Fax. +82-2-521-1352
E-mail. chwoo@kiise.or.kr


Pretrained Large Language Model-based Drug-Target Binding Affinity Prediction for Mutated Proteins
