An Effective Comparative Framework for Cross-Project Defect Prediction Based on the Feature Selection Technique

Duksan Ryu, Jongmoon Baik

http://doi.org/10.5626/JOK.2018.45.7.635

Software defect prediction (SDP) can help optimally allocate software testing resources to fault-prone modules. Typically, local data within a company are used to build classifiers, an approach known as Within-Project Defect Prediction (WPDP). In some cases, however, such as pilot projects, no data from historical projects have been collected. Cross-project defect prediction (CPDP), which uses data from other projects, can be employed in such cases. Defect prediction performance may be degraded in the presence of irrelevant or redundant information. To address this issue, various feature selection techniques have been suggested. Until now, there has been no research on identifying effective feature selection techniques for CPDP. We present a comparative framework that uses feature selection to achieve high performance in CPDP. We compare eight existing feature selection techniques, based on feature subset evaluators and feature ranking methods, for three CPDP models and one WPDP model. After the best-performing features are chosen, classifiers are built, tested, and evaluated using statistical significance and effect size tests. Hybrid Instance Selection using Nearest-Neighbor (HISNN) is better than the other CPDP models and comparable to the WPDP model. Results from the comparison show that distribution differences, class imbalance, and feature selection should be considered to obtain a high-performance CPDP model.

Pretrained Large Language Model-based Drug-Target Binding Affinity Prediction for Mutated Proteins

Taeung Song, Jin Hyuk Kim, Hyeon Jun Park, Jonghwan Choi

http://doi.org/10.5626/JOK.2025.52.6.539

Drug development is a costly and time-consuming process. Accurately predicting the impact of protein mutations on drug-target binding affinity remains a major challenge. Previous studies have utilized long short-term memory (LSTM) and transformer models for amino acid sequence processing. However, LSTMs suffer from long-sequence dependency issues, while transformers face high computational costs. In contrast, pretrained large language models (pLLMs) excel in handling long sequences, yet prompt-based approaches alone are insufficient for accurate binding affinity prediction. This study proposes a method that leverages pLLMs to analyze protein structural data, transforms it into embedding vectors, and uses a separate machine learning model for numerical binding affinity prediction. Experimental results demonstrated that the proposed approach outperformed conventional LSTM and prompt-based methods, achieving lower root mean square error (RMSE) and higher Pearson correlation coefficient (PCC), particularly in mutation-specific predictions. Additionally, performance analysis of pLLM quantization confirmed that the method maintained sufficient accuracy with reduced computational cost.
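
For readers unfamiliar with the two metrics named in this abstract, the following is a minimal sketch of how RMSE and PCC are usually computed for a regression task such as binding affinity prediction. The function names and the sample values are illustrative assumptions and are not taken from the paper.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error: lower means predictions are closer to measurements."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def pearson(y_true, y_pred):
    """Pearson correlation coefficient: higher means a stronger linear relationship."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

# Hypothetical measured and predicted affinities (illustrative values only).
measured = [5.0, 6.5, 7.2, 8.1]
predicted = [5.4, 6.1, 7.0, 8.6]
print(rmse(measured, predicted), pearson(measured, predicted))
```

The two metrics are complementary: RMSE penalizes absolute deviation, while PCC only checks whether predictions rise and fall with the measurements.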

A Graph Neural Network Approach for Predicting the Lung Carcinogenicity of Single Molecular Compounds

Yunju Song, Sunyong Yoo

http://doi.org/10.5626/JOK.2025.52.6.482

Cancer is one of the major diseases causing millions of deaths worldwide every year, and lung cancer was recorded as the leading cause of cancer-related deaths in Korea in 2022. Research on lung cancer-causing compounds is therefore essential. This study proposes and evaluates a novel approach that uses graph neural networks to predict the lung cancer-causing potential of compounds, overcoming the limitations of existing machine learning and deep learning methods. Based on SMILES (Simplified Molecular Input Line Entry System) information from the compound carcinogenicity databases CPDB, CCRIS, IRIS, and T3DB, the structure and chemical properties of molecules were converted into graph data for training, and the proposed model showed superior prediction performance compared to other models. This demonstrates the potential of graph neural networks as an effective tool for lung cancer prediction and suggests that they can make important contributions to future cancer research and treatment development.

A Diffusion-based Trajectory Prediction Model for Flight Vehicles Considering Pull-up Maneuvers

Seonggyun Lee, Joonseong Kang, Jeyoon Yeom, Dongwg Hong, Youngmin Kim, Kyungwoo Song

http://doi.org/10.5626/JOK.2025.52.3.241

This paper proposes a new model for processing multivariate time series data aimed at predicting nonlinear trajectories related to aircraft pull-up maneuvers. To achieve this, aircraft trajectories were predicted using CSDI (Conditional Score-based Diffusion Models for Imputation), a state-of-the-art generative AI model. Specifically, because the flight distance and shape of the aircraft trajectory vary significantly depending on the presence of pull-up maneuvers, the data were separated into subsets with and without these maneuvers, and a distinct model was trained and used for prediction on each subset. Experimental results demonstrated that the model predicted trajectories very similar to actual trajectories and achieved superior performance in MAE, RMSE, and CRPS metrics compared to existing deep learning models. This study not only enhances the accuracy of aircraft trajectory prediction but also suggests the potential for more sophisticated predictions through future integration with Classifier Diffusion models.
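
Because CRPS is less familiar than MAE or RMSE, the sketch below shows one common sample-based estimator of CRPS for a single observed value, assuming the generative model returns a set of sampled predictions. This is a generic estimator and may differ from the exact formulation used in the paper; the function name and sample values are illustrative assumptions.

```python
def crps_from_samples(samples, observed):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|.

    `samples` are draws from the predictive distribution for one time step,
    `observed` is the actual value. Lower is better.
    """
    m = len(samples)
    # Average distance between the samples and the observation.
    accuracy_term = sum(abs(s - observed) for s in samples) / m
    # Average pairwise distance between samples (spread of the forecast).
    spread_term = sum(abs(a - b) for a in samples for b in samples) / (m * m)
    return accuracy_term - 0.5 * spread_term

# Hypothetical altitude samples from a generative model (illustrative only).
sampled_altitudes = [1020.0, 1035.0, 1010.0, 1042.0]
print(crps_from_samples(sampled_altitudes, observed=1030.0))
```

When every sample is identical, the spread term vanishes and the estimate reduces to the absolute error, which is why CRPS is often described as a probabilistic generalization of MAE.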

Drug Toxicity Prediction Using Integrated Graph Neural Networks and Attention-Based Random Walk Algorithm

Jong-Hoon Park, Jae-Woo Chu, Young-Rae Cho

http://doi.org/10.5626/JOK.2025.52.3.234

The traditional drug development process is often burdened by high costs and lengthy timelines, leading to increasing interest in AI-based drug development. In particular, the importance of AI models for preemptively evaluating drug toxicity is being emphasized. In this study, we propose a novel drug toxicity prediction model, named Integrated GNNs and Attention Random Walk (IG-ARW). The proposed method integrates various Graph Neural Network (GNN) models and uses attention mechanisms to compute random walk transition probabilities, extracting graph features precisely. The model then conducts random walks to extract node features and graph features, ultimately predicting drug toxicity. IG-ARW was evaluated on three different datasets, demonstrating strong performance with AUC scores of 0.8315, 0.8894, and 0.7476, respectively. Notably, the model proved highly effective not only in toxicity prediction, but also in predicting other drug characteristics.
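
The idea of deriving random walk transition probabilities from attention can be illustrated with a small sketch. It assumes that each directed edge already carries a scalar attention score and normalizes the scores over each node's neighbors with a softmax. The graph, the scores, and the function names are hypothetical, and the actual IG-ARW formulation may differ.

```python
import math
import random

def transition_probs(neighbors, scores):
    """Softmax-normalize attention scores over each node's outgoing edges."""
    probs = {}
    for node, nbrs in neighbors.items():
        if not nbrs:
            probs[node] = {}
            continue
        # Subtract the maximum score for numerical stability.
        top = max(scores[(node, n)] for n in nbrs)
        exps = {n: math.exp(scores[(node, n)] - top) for n in nbrs}
        total = sum(exps.values())
        probs[node] = {n: e / total for n, e in exps.items()}
    return probs

def random_walk(probs, start, length, rng):
    """Walk up to `length` steps, choosing each next node by transition probability."""
    walk = [start]
    for _ in range(length):
        options = probs[walk[-1]]
        if not options:
            break
        nodes = list(options)
        weights = [options[n] for n in nodes]
        walk.append(rng.choices(nodes, weights=weights)[0])
    return walk

# Hypothetical three-atom molecular graph with made-up attention scores.
neighbors = {"C1": ["O", "C2"], "O": ["C1"], "C2": ["C1"]}
scores = {("C1", "O"): 2.0, ("C1", "C2"): 0.0, ("O", "C1"): 1.0, ("C2", "C1"): 1.0}

probs = transition_probs(neighbors, scores)
walk = random_walk(probs, "C1", 5, random.Random(0))
print(probs["C1"], walk)
```

Under this assumption, edges with higher attention are visited more often, so the walks concentrate on the substructures that the attention mechanism rates as most relevant.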

Improved Software Defect Prediction with Gated Tab Transformer

Saranya Manikandan, Duksan Ryu

http://doi.org/10.5626/JOK.2025.52.3.196

Software Defect Prediction (SDP) plays a crucial role in ensuring software quality and reliability. Although traditional machine learning and deep learning models are widely used for SDP, recent advancements in the field of natural language processing have paved the way for applying transformer-based models in software engineering tasks. This paper investigated a transformer-based model as a potential approach to improve SDP model quality, ultimately aiming to enhance software quality and optimize testing resource allocation. Inspired by the Gated Tab Transformer’s (GTT) ability to effectively model relationships among features, we evaluated its effectiveness in SDP. We conducted experiments using 15 software defect datasets and compared the results with other state-of-the-art machine learning and deep learning models. Our experiments showed that GTT outperformed state-of-the-art machine learning models in terms of recall, balance, and AUC (increases of 42.1%, 10.93%, and 7.1%, respectively). Cohen's d confirmed this advantage with large and medium effect sizes for GTT on these metrics. Additionally, an ablation study assessed the impact of hyperparameter variations on performance. Thus, GTT's effectiveness addresses the challenges of SDP, potentially leading to more effective testing resource allocation and improved software quality.
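
As a reading aid for the metrics in this abstract, the sketch below computes recall, the balance measure as it is commonly defined in the defect prediction literature (one minus the normalized distance from the ideal point of recall 1 and false alarm rate 0), and Cohen's d with a pooled standard deviation. The function names and input values are illustrative assumptions; the paper's exact definitions are not given in the abstract.

```python
import math
import statistics

def recall_and_balance(tp, fn, fp, tn):
    """Return (recall, balance) from confusion-matrix counts."""
    pd = tp / (tp + fn)  # probability of detection, i.e. recall
    pf = fp / (fp + tn)  # probability of false alarm
    balance = 1 - math.sqrt((0 - pf) ** 2 + (1 - pd) ** 2) / math.sqrt(2)
    return pd, balance

def cohens_d(group_a, group_b):
    """Effect size: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

# Hypothetical confusion-matrix counts and per-dataset scores (illustrative only).
print(recall_and_balance(tp=8, fn=2, fp=1, tn=9))
print(cohens_d([0.82, 0.79, 0.85], [0.74, 0.77, 0.71]))
```

By a widely used convention, values of d near 0.5 are read as medium effects and values of 0.8 or more as large effects.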


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal