Search : [ keyword: Transformer (트랜스포머) ] (24)

A Pretrained Model-Based Approach to Improve Generalization Performance for ADMET Prediction of Drug Candidates

Yoonju Kim, Sanghyun Park

http://doi.org/10.5626/JOK.2025.52.7.601

Accurate prediction of Absorption, Distribution, Metabolism, Excretion, and Toxicity (ADMET) properties plays an important role in reducing clinical trial failure rates and lowering drug development costs. In this study, we propose a novel method to improve ADMET prediction performance for drug candidate compounds by integrating molecular embeddings from a graph transformer model with pretrained embeddings from a UniMol model. The proposed model captures bond type information from molecular graph structures, generating chemically refined representations, while leveraging UniMol’s pretrained 3D embeddings to effectively learn spatial molecular characteristics. In this way, the model is designed to address data scarcity and enhance generalization performance. We conducted prediction experiments on 10 ADMET properties. The experimental results demonstrated that our proposed model outperformed existing methods and that prediction accuracy for ADMET properties can be improved by effectively integrating atomic bond information and 3D structures.
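
The fusion described above can be sketched minimally as concatenating the two molecular representations and scoring them with a prediction head. This is an illustrative assumption-level sketch, not the authors' implementation; all names, dimensions, and the linear head are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse_embeddings(graph_emb, pretrained_emb):
    """Concatenate a graph-transformer embedding with a pretrained
    3D (UniMol-style) embedding into one molecular representation."""
    return np.concatenate([graph_emb, pretrained_emb], axis=-1)

def predict_property(fused, W, b):
    """A single linear layer standing in for the ADMET prediction head."""
    return fused @ W + b

graph_emb = rng.standard_normal(128)       # from the graph transformer
pretrained_emb = rng.standard_normal(512)  # from the pretrained 3D model
fused = fuse_embeddings(graph_emb, pretrained_emb)

W = rng.standard_normal((640, 1)) * 0.01
b = np.zeros(1)
score = predict_property(fused, W, b)
print(fused.shape, score.shape)
```

In practice the fusion could also be a gated or attention-based combination; concatenation is just the simplest baseline consistent with the abstract.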

Maximizing UAV Data Efficiency in NextG Networks: A Transformer-Based mmWave Beamforming Approach

Avi Deb Raha, Apurba Adhikary, Mrityunjoy Gain, Yu Qiao, Hyeonsu Kim, Jisu Yoon, Choong Seon Hong

http://doi.org/10.5626/JOK.2025.52.2.170

Beamforming is essential in the rapidly evolving field of next generation (NextG) wireless communication, particularly when leveraging terahertz and millimeter-wave (mmWave) frequency bands to achieve ultra-high data rates. However, these bands present challenges, particularly the costs associated with beam training, which can hinder Ultra-Reliable Low-Latency Communication (URLLC) in high-mobility applications such as drone and Unmanned Aerial Vehicle (UAV) communications. This paper proposes a contextual information-based mmWave beamforming approach for UAVs and formulates an optimization problem aimed at maximizing data rates in high-mobility UAV scenarios. To predict optimal beams while ensuring URLLC, we developed a lightweight transformer model. The self-attention mechanism of the transformer allows the model to focus selectively on the most important features of the contextual information, so the lightweight model effectively predicts the best beams and thereby enhances the data rates of UAVs. Simulation results demonstrate the design's effectiveness: the lightweight transformer significantly outperforms baseline methods, achieving up to 17.8% higher Top-1 beam accuracy and reducing average power loss by as much as 96.79%, with improvements ranging from 12.49% to 96.79% relative to the baselines.
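
The self-attention mechanism the abstract relies on can be illustrated with a minimal scaled dot-product sketch over a sequence of contextual feature vectors; all shapes, weights, and names here are assumptions, not the paper's model.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model) contextual features.
    Returns attended features of the same shape plus the weights."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # pairwise relevance
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 16))             # 5 context tokens
Wq, Wk, Wv = (rng.standard_normal((16, 16)) * 0.1 for _ in range(3))
out, weights = self_attention(X, Wq, Wk, Wv)
print(out.shape)
```

In a beam-prediction setting, the attended features would feed a classification head over the beam codebook; that head is omitted here.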

Topic-Aware Cross-Attention for Dialogue Summarization

Suyoung Min, Youngjoong Ko

http://doi.org/10.5626/JOK.2024.51.11.1011

Unlike general document summarization, dialogue summarization frequently involves informal and colloquial language and requires an understanding of the context, flow, and topics of the dialogue. This study proposes a Topic-Aware Cross-Attention mechanism that incorporates awareness of topic distributions into the cross-attention mechanism to reflect these characteristics of dialogue. The mechanism extracts the topic distributions of the dialogue and the summary and applies the similarity of these distributions to the cross-attention mechanism within the BART decoder to perform dialogue summarization. The degree to which topic distribution similarity is applied to the cross-attention mechanism can be adjusted by modifying a topic-ratio. Experimental results on the DialogSum and SAMSum datasets demonstrated the suitability of the method for dialogue summarization.
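
One plausible reading of the topic-ratio mechanism is a convex blend of ordinary cross-attention scores with a topic-distribution similarity before normalization. This is a hedged sketch of the idea, not the authors' exact formulation; the blending scheme and names are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def topic_aware_cross_attention(attn_scores, topic_sim, topic_ratio=0.3):
    """attn_scores: (tgt_len, src_len) raw decoder-to-encoder scores.
    topic_sim: (tgt_len, src_len) similarity of topic distributions.
    Blends the two before normalizing, weighted by topic_ratio."""
    blended = (1.0 - topic_ratio) * attn_scores + topic_ratio * topic_sim
    return softmax(blended, axis=-1)

rng = np.random.default_rng(2)
scores = rng.standard_normal((4, 10))    # 4 summary steps, 10 dialogue tokens
topic_sim = rng.random((4, 10))
weights = topic_aware_cross_attention(scores, topic_sim, topic_ratio=0.3)
print(weights.shape)
```

Setting the topic-ratio to 0 recovers plain cross-attention, matching the abstract's description of an adjustable application degree.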

A Survey of Advantages of Self-Supervised Learning Models in Visual Recognition Tasks

Euihyun Yoon, Hyunjong Lee, Donggeon Kim, Joochan Park, Jinkyu Kim, Jaekoo Lee

http://doi.org/10.5626/JOK.2024.51.7.609

Recently, the field of supervised artificial intelligence (AI) has been rapidly advancing. However, supervised learning relies on datasets with labeled ground-truth answers, and obtaining these labels can be costly. To address this issue, self-supervised learning, which can learn general features of images without labels, is being researched. In this paper, various self-supervised learning models were classified by their learning methods and backbone networks, and their strengths, weaknesses, and performance were compared and analyzed. Image classification tasks were used for the performance comparison, and fine-grained prediction tasks were additionally compared and analyzed to assess transfer learning. As a result, models that used only positive pairs achieved higher performance than models that used both positive and negative pairs, by minimizing noise. Furthermore, for fine-grained prediction, methods that mask images during learning or use multi-stage models achieved higher performance by additionally learning regional information.
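
The "positive pairs only" objective the survey highlights is, in several well-known methods, a negative cosine similarity between the embeddings of two augmented views. The toy sketch below is illustrative only and not tied to any specific model in the survey.

```python
import numpy as np

def cosine_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def positive_pair_loss(z1, z2):
    """Minimized when the two views' embeddings align; no negatives used."""
    return -cosine_sim(z1, z2)

rng = np.random.default_rng(3)
z = rng.standard_normal(32)                          # embedding of one view
aligned = positive_pair_loss(z, z)                   # identical views
noisy = positive_pair_loss(z, z + 0.5 * rng.standard_normal(32))
print(aligned, noisy)
```

Contrastive methods add a repulsion term over negative pairs on top of this attraction; the survey's finding is that the attraction-only variants can be less noisy.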

Efficient Prompt Learning Method in Blurry Class Incremental Learning Environment

Yunseok Oh, Dong-Wan Choi

http://doi.org/10.5626/JOK.2024.51.7.655

Continual learning is the process of continuously integrating new knowledge to maintain performance across a sequence of tasks. While disjoint continual learning assumes no overlap between classes across tasks, blurry continual learning addresses more realistic scenarios where overlaps do exist. Traditionally, most related work has focused on disjoint scenarios, and recent attention has shifted towards prompt-based continual learning. This approach uses a prompt mechanism within a Vision Transformer (ViT) model to improve adaptability. In this study, we analyze the effectiveness of a similarity function designed for blurry class incremental learning, applied within a prompt-based continual learning framework. Our experiments demonstrate the success of this method, particularly its superior ability to learn from and interpret blurry data.
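
In prompt-based continual learning, a common design (e.g., L2P-style prompt pools) is to give each prompt a learnable key and select the top-k prompts whose keys are most similar to a query feature. The sketch below uses cosine similarity as the similarity function; it is an assumption-level illustration, not this paper's code.

```python
import numpy as np

def select_prompts(query, prompt_keys, k=2):
    """query: (d,) image feature; prompt_keys: (pool_size, d).
    Returns indices of the k most similar keys (cosine similarity)."""
    q = query / np.linalg.norm(query)
    K = prompt_keys / np.linalg.norm(prompt_keys, axis=1, keepdims=True)
    sims = K @ q
    return np.argsort(-sims)[:k], sims

rng = np.random.default_rng(4)
pool = rng.standard_normal((10, 64))               # 10 prompt keys
query = pool[7] + 0.01 * rng.standard_normal(64)   # a feature near key 7
idx, sims = select_prompts(query, pool, k=2)
print(idx)
```

The choice of similarity function is exactly the design knob the study analyzes for blurry class boundaries, since it controls which prompts are shared across overlapping classes.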

A GRU-based Time-Series Forecasting Method using Patching

Yunyeong Kim, Sungwon Jung

http://doi.org/10.5626/JOK.2024.51.7.663

Time series forecasting plays a crucial role in decision-making across various fields. Two recent approaches, the patch time series Transformer (PatchTST) and the MLP-based long-term time series forecasting linear model (LTSF-Linear), have shown promising performance in this area. However, PatchTST requires significant time for both model training and inference, while LTSF-Linear has limited capacity due to its simplistic structure. To address these limitations, we propose a new approach called patch time series GRU (PatchTSG). By applying a Gated Recurrent Unit (GRU) to the patched data, PatchTSG reduces the training time while capturing valuable information from the time series. Compared to PatchTST, PatchTSG achieves an impressive reduction in training time (up to 82%) and inference time (up to 46%).
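
The patching step shared by PatchTST and PatchTSG can be sketched as cutting a series of length L into overlapping windows of length P with stride S; the resulting patch sequence is what the recurrent model would consume. Parameter values below are illustrative, and the GRU itself is omitted.

```python
import numpy as np

def make_patches(series, patch_len, stride):
    """series: (L,). Returns an array of shape (num_patches, patch_len),
    where num_patches = (L - patch_len) // stride + 1."""
    n = (len(series) - patch_len) // stride + 1
    return np.stack([series[i * stride : i * stride + patch_len]
                     for i in range(n)])

x = np.arange(24, dtype=float)            # toy series, L = 24
patches = make_patches(x, patch_len=8, stride=4)
print(patches.shape)                      # 5 patches of length 8
```

Feeding patches rather than individual time steps shortens the sequence the GRU unrolls over, which is one plausible source of the reported training-time savings.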

A Token Selection Method for Effective Token Pruning in Vision Transformers

Jaeyeon Lee, Dong-Wan Choi

http://doi.org/10.5626/JOK.2024.51.6.567

Self-attention-based models, vision transformers, have recently been employed in the field of computer vision. While they achieve excellent performance on a variety of tasks, their computation cost increases in proportion to the number of tokens during inference, which degrades inference speed and imposes many limitations when deploying the model in real-world scenarios. To address this issue, we propose a new token importance measure, obtained by modifying the structure of multi-head self-attention in vision transformers. By pruning less important tokens during inference, our method improves inference speed while preserving performance. Furthermore, the proposed method requires no additional parameters, exhibits better robustness without fine-tuning, and can maximize performance when integrated with existing token pruning methods.
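
A common baseline for attention-based token pruning, shown below as a hedged sketch (not the paper's exact measure), scores each token by the average attention it receives across heads and queries and keeps only the top-k tokens.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def prune_tokens(attn, k):
    """attn: (heads, tokens, tokens) attention weights.
    Importance of token j = mean over heads and queries of attn[..., j].
    Returns the indices of the k most important tokens, in original order."""
    importance = attn.mean(axis=(0, 1))            # (tokens,)
    keep = np.sort(np.argsort(-importance)[:k])
    return keep, importance

rng = np.random.default_rng(5)
attn = softmax(rng.standard_normal((8, 16, 16)))   # 8 heads, 16 tokens
keep, imp = prune_tokens(attn, k=8)
print(keep)
```

Like the proposed measure, this baseline adds no parameters; the paper's contribution is a better importance signal derived from a modified multi-head self-attention structure.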

SCA: Improving Document Grounded Response Generation based on Supervised Cross-Attention

Hyeongjun Choi, Seung-Hoon Na, Beomseok Hong, Youngsub Han, Byoung-Ki Jeon

http://doi.org/10.5626/JOK.2024.51.4.326

Document-grounded response generation is the task of generating conversational responses by “grounding” factual evidence in a task-specific domain, such as consumer consultation or insurance planning, where the evidence is obtained from relevant documents retrieved in response to a user’s question in the current dialogue context. In this study, we propose supervised cross-attention (SCA) to enhance the response generation model’s ability to find and incorporate “response-salient snippets” (i.e., spans or contents): the parts of the retrieved document that should be included and maintained in the generated answer. SCA adds a supervised loss that focuses cross-attention weights on the response-salient snippets; this attention supervision enables the decoder to generate responses in a “saliency-grounded” manner by strongly attending to the important parts of the retrieved document. Experimental results on MultiDoc2Dial show that SCA together with additional performance improvement methods increases F1 by 1.13 over the existing SOTA, with SCA alone contributing an increase of 0.25 in F1.
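
One simple way to realize attention supervision of this kind, sketched below under stated assumptions (the exact loss form in the paper may differ), is to penalize the negative log of the attention mass falling on the salient-snippet tokens at each decoding step.

```python
import numpy as np

def saliency_attention_loss(attn_weights, salient_mask, eps=1e-9):
    """attn_weights: (tgt_len, src_len), each row sums to 1.
    salient_mask: (src_len,) with 1 on snippet tokens, 0 elsewhere.
    Loss = mean over target steps of -log(attention mass on the snippet)."""
    mass = (attn_weights * salient_mask).sum(axis=-1)
    return float(-np.log(mass + eps).mean())

attn = np.array([[0.7, 0.1, 0.1, 0.1],
                 [0.1, 0.6, 0.2, 0.1]])
mask = np.array([1.0, 1.0, 0.0, 0.0])   # first two tokens are salient
loss_focused = saliency_attention_loss(attn, mask)
loss_uniform = saliency_attention_loss(np.full((2, 4), 0.25), mask)
print(loss_focused, loss_uniform)
```

This auxiliary term would be added to the usual generation loss, so the decoder is rewarded for concentrating cross-attention on the snippet while still being trained to produce the reference response.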

SASRec vs. BERT4Rec: Performance Analysis of Transformer-based Sequential Recommendation Models

Hye-young Kim, Mincheol Yoon, Jongwuk Lee

http://doi.org/10.5626/JOK.2024.51.4.352

Sequential recommender systems extract interests from user logs and use them to recommend items the user might like next. SASRec and BERT4Rec are widely used as representative sequential recommendation models. Existing studies have utilized these two models as baselines, but their reported performance is inconsistent due to differences in experimental environments. This research compares and analyzes the performance of SASRec and BERT4Rec on six representative sequential recommendation datasets. The experimental results show that the number of user-item interactions has the largest impact on BERT4Rec training, which in turn leads to the performance difference between the two models. Furthermore, this research finds that the two learning methods widely utilized in sequential recommendation can have different effects depending on popularity bias and sequence length, showing that considering dataset characteristics is essential for improving recommendation performance.
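
The core training difference between the two models can be shown in a toy sketch (an assumption-level illustration, not either model's code): SASRec trains autoregressively behind a causal mask, while BERT4Rec trains with a cloze objective that masks random positions and predicts them bidirectionally.

```python
import numpy as np

def causal_mask(seq_len):
    """SASRec-style mask: True where attention is allowed,
    i.e., position i may attend only to positions j <= i."""
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def cloze_masking(items, mask_prob, mask_token, rng):
    """BERT4Rec-style corruption: replace ~mask_prob of the item IDs with a
    mask token; the model is trained to recover the picked positions."""
    items = np.asarray(items)
    picked = rng.random(len(items)) < mask_prob
    inputs = np.where(picked, mask_token, items)
    return inputs, picked

rng = np.random.default_rng(6)
print(causal_mask(4).astype(int))
inputs, picked = cloze_masking([101, 102, 103, 104, 105], 0.4, 0, rng)
print(inputs, picked)
```

The paper's observation that interaction counts matter most for BERT4Rec is plausible in this light: the cloze objective only receives gradient at the masked positions, so it needs more interactions to see enough training signal.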

Model Architecture Analysis and Extension for Improving RF-based Multi-Person Pose Estimation Performance

SeungHwan Shin, Yusung Kim

http://doi.org/10.5626/JOK.2024.51.3.262

An RF-based multi-person pose estimation system can estimate each human posture even when clear visibility is hard to obtain due to obstacles or lighting conditions. Traditionally, a cross-modal teacher-student learning approach has been employed: pseudo-label data are acquired by feeding images captured concurrently with RF signal collection into a pretrained image-based pose estimation model. In a previous study, cross-modal knowledge distillation was applied to mimic the feature maps of image-based learning models, referred to as "visual cues," which enhanced the performance of RF-signal-based pose estimation. In this paper, performance is compared according to the ratio at which the learned visual cues are concatenated, and the impact of segmentation mask learning and multiframe inputs on multi-person pose estimation performance is analyzed. The best performance is achieved when visual cues and multiframe inputs are used in combination.
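
The concatenation ratio studied here can be pictured as deciding how many channels of the fused feature map come from each source. The sketch below is a toy illustration under assumed shapes and an assumed ratio scheme, not the paper's architecture.

```python
import numpy as np

def concat_with_ratio(rf_feat, cue_feat, total_ch, cue_ratio):
    """rf_feat, cue_feat: (C, H, W) feature maps. Take cue_ratio of
    total_ch channels from the visual cues and the rest from the RF
    features, concatenating along the channel axis."""
    n_cue = int(total_ch * cue_ratio)
    n_rf = total_ch - n_cue
    return np.concatenate([rf_feat[:n_rf], cue_feat[:n_cue]], axis=0)

rng = np.random.default_rng(7)
rf = rng.standard_normal((64, 16, 16))     # RF-signal feature map
cues = rng.standard_normal((64, 16, 16))   # distilled visual-cue features
fused = concat_with_ratio(rf, cues, total_ch=64, cue_ratio=0.25)
print(fused.shape)
```

Sweeping `cue_ratio` and measuring pose accuracy corresponds to the ratio comparison the paper reports, with multiframe inputs stacked along an additional axis in the full system.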


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr