Digital Library [Search Result]
A Pretrained Model-Based Approach to Improve Generalization Performance for ADMET Prediction of Drug Candidates
http://doi.org/10.5626/JOK.2025.52.7.601
Accurate prediction of Absorption, Distribution, Metabolism, Excretion, and Toxicity (ADMET) properties plays an important role in reducing clinical trial failure rates and lowering drug development costs. In this study, we propose a novel method to improve ADMET prediction performance for drug candidate compounds by integrating molecular embeddings from a graph transformer model with pretrained embeddings from a UniMol model. The proposed model captures bond-type information from molecular graph structures, generating chemically refined representations, while leveraging UniMol’s pretrained 3D embeddings to effectively learn spatial molecular characteristics. This design addresses the problem of data scarcity and enhances generalization performance. We conducted prediction experiments on 10 ADMET properties. The experimental results demonstrated that our proposed model outperformed existing methods and that prediction accuracy for ADMET properties could be improved by effectively integrating atomic bond information and 3D structures.
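The fusion step the abstract describes can be sketched minimally as concatenating the two embedding views and scoring with a shared head. All names, dimensions, and the random stand-in embeddings below are illustrative assumptions, not details from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the two embedding sources described above:
# a graph-transformer embedding that encodes bond-type information, and
# a pretrained UniMol-style 3D embedding (random vectors here).
graph_emb = rng.normal(size=(4, 128))   # 4 molecules, 128-dim graph view
unimol_emb = rng.normal(size=(4, 256))  # 4 molecules, 256-dim 3D view

# Fuse the two views by concatenation, then score each molecule with a
# single linear head (weights are random here; trained in practice).
fused = np.concatenate([graph_emb, unimol_emb], axis=1)  # (4, 384)
w = rng.normal(size=(384, 1))
b = np.zeros(1)
admet_pred = fused @ w + b  # one predicted ADMET property per molecule

print(fused.shape)       # (4, 384)
print(admet_pred.shape)  # (4, 1)
```

In practice each of the ten ADMET endpoints would get its own trained head (or a multi-task head), and the fusion could be learned rather than plain concatenation.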
Pretrained Large Language Model-based Drug-Target Binding Affinity Prediction for Mutated Proteins
Taeung Song, Jin Hyuk Kim, Hyeon Jun Park, Jonghwan Choi
http://doi.org/10.5626/JOK.2025.52.6.539
Drug development is a costly and time-consuming process. Accurately predicting the impact of protein mutations on drug-target binding affinity remains a major challenge. Previous studies have utilized long short-term memory (LSTM) and transformer models for amino acid sequence processing. However, LSTMs suffer from long-sequence dependency issues, while transformers face high computational costs. In contrast, pretrained large language models (pLLMs) excel in handling long sequences, yet prompt-based approaches alone are insufficient for accurate binding affinity prediction. This study proposed a method that could leverage pLLMs to analyze protein structural data, transform it into embedding vectors, and use a separate machine learning model for numerical binding affinity prediction. Experimental results demonstrated that the proposed approach outperformed conventional LSTM and prompt-based methods, achieving lower root mean square error (RMSE) and higher Pearson correlation coefficient (PCC), particularly in mutation-specific predictions. Additionally, performance analysis of pLLM quantization confirmed that the method maintained sufficient accuracy with reduced computational cost.
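The two-stage pipeline described above (pLLM embeddings feeding a separate regressor, evaluated by RMSE and PCC) can be sketched with synthetic data. The random vectors below stand in for the pLLM-derived embeddings, and ordinary least squares stands in for the downstream machine learning model; neither detail is claimed to match the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical pLLM-derived embedding vectors for mutated protein-drug
# pairs (random stand-ins for the real model outputs).
X = rng.normal(size=(100, 16))
true_w = rng.normal(size=16)
y = X @ true_w + rng.normal(scale=0.1, size=100)  # synthetic affinities

# A separate downstream regressor (here ordinary least squares) maps
# the embeddings to numerical binding-affinity values.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
pred = X @ w

# The two metrics reported in the abstract.
rmse = float(np.sqrt(np.mean((pred - y) ** 2)))
pcc = float(np.corrcoef(pred, y)[0, 1])
print(round(rmse, 3), round(pcc, 3))
```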
HAGCN: Heterogeneous Attentive GCN for Gene-Disease Association
http://doi.org/10.5626/JOK.2025.52.2.161
Predicting gene-disease associations (GDAs) is essential for understanding molecular mechanisms, diagnosing diseases, and identifying target genes. Validating causal relationships between diseases and genes using experimental methods can be extremely costly and time-consuming. Deep learning, particularly graph neural networks, has shown great promise in this area. However, most models rely on single-source, homogeneous graphs; another limitation is the need for expert knowledge to manually define meta-paths when building multi-source heterogeneous graphs. Recognizing these challenges, the present study introduces the Heterogeneous Attentive Graph Convolution Network (HAGCN). HAGCN processes heterogeneous biological entity association graphs as input. We construct the input graphs using biological association information from curated databases such as Gene Ontology, Disease Ontology, Human Phenotype Ontology, and TBGA. HAGCN learns the relationship heterogeneity between biological entities without meta-paths by using an attention mechanism. HAGCN achieved the best AUC-ROC in a binary classification task predicting gene-disease associations, and also achieved competitive F1 score, MCC, and accuracy against baselines. We believe that HAGCN can accelerate the discovery of disease-associated genes.
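The core idea of attending over a heterogeneous neighborhood without hand-crafted meta-paths can be sketched as a dot-product attention aggregation. The toy features and neighborhood below are illustrative assumptions, not HAGCN's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy heterogeneous neighborhood: one target gene node plus neighbors of
# different entity types (e.g. disease, phenotype, GO-term nodes), each
# with an 8-dim feature vector (random stand-ins).
target = rng.normal(size=8)
neighbors = rng.normal(size=(5, 8))

# Attention scores from dot products between the target and each
# neighbor, normalized with a softmax; no meta-paths are required
# because every neighbor is scored directly.
scores = neighbors @ target
attn = np.exp(scores - scores.max())
attn /= attn.sum()

# Attention-weighted aggregation of the heterogeneous neighborhood
# produces the updated node representation.
updated = attn @ neighbors
print(updated.shape)  # (8,)
```

A real heterogeneous GCN layer would additionally use learned, relation-specific projections before scoring; this sketch shows only the meta-path-free attention step.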
New Transformer Model to Generate Molecules for Drug Discovery
Yu-Bin Hong, Kyungjun Lee, DongNyenog Heo, Heeyoul Choi
http://doi.org/10.5626/JOK.2023.50.11.976
Among various generative models, recurrent neural network (RNN)-based models have achieved state-of-the-art performance in the drug generation task. To overcome the long-term dependency problem that RNNs suffer from, Transformer-based models were proposed for the task. However, the Transformer models showed worse performance than the RNN models in the drug generation task, and we believe this was because the Transformer models were over-parameterized and prone to over-fitting. To avoid this problem, in this paper we propose a new Transformer model that replaces the large decoder with simple feed-forward layers. Experiments confirmed that our proposed model outperformed the previous state-of-the-art baseline on major evaluation metrics while preserving a similar level of performance on other minor metrics. Furthermore, when we applied our model to generate candidate molecules against the SARS-CoV-2 (COVID-19) virus, the generated molecules were more effective than commercially available drugs such as Paxlovid, Molnupiravir, and Remdesivir.
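The proposed simplification, replacing the decoder with plain feed-forward layers for next-token prediction, can be sketched as below. The tiny vocabulary, dimensions, and random weights are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy SMILES-like vocabulary; real models use a much larger token set.
vocab = ["C", "N", "O", "(", ")", "=", "<eos>"]
d_model = 32

# Stand-in for the encoder output for one partially generated molecule
# (in the paper this would come from a Transformer encoder).
hidden = rng.normal(size=(1, d_model))

# The simplification described above: next-token prediction through
# plain feed-forward layers instead of a full Transformer decoder.
w1 = rng.normal(size=(d_model, 64))
w2 = rng.normal(size=(64, len(vocab)))
logits = np.maximum(hidden @ w1, 0.0) @ w2  # ReLU feed-forward head

next_token = vocab[int(np.argmax(logits))]
print(logits.shape, next_token)
```

Fewer decoder parameters mean fewer degrees of freedom to over-fit, which is the intuition behind the reported gain on the small drug-generation datasets.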

Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal
Editorial Office
- Tel. +82-2-588-9240
- Fax. +82-2-521-1352
- E-mail. chwoo@kiise.or.kr