Search : [ keyword: Attention ] (55)

Topic-Aware Cross-Attention for Dialogue Summarization

Suyoung Min, Youngjoong Ko

http://doi.org/10.5626/JOK.2024.51.11.1011

Unlike general document summarization, dialogue summarization frequently involves informal, colloquial language and requires an understanding of the context and flow of the conversation as well as of its topics. This study proposes a Topic-Aware Cross-Attention mechanism that incorporates topic-distribution information into cross-attention to reflect these characteristics of dialogue. The mechanism extracts the topic distributions of the dialogue and the summary and applies the similarity of these distributions to the cross-attention in the BART decoder during summarization. The degree to which topic-distribution similarity influences cross-attention can be adjusted through a topic-ratio parameter. Experimental results on the DialogSum and SAMSum datasets demonstrate the suitability of the method for dialogue summarization.
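
Based on the description above, a minimal sketch of how topic-distribution similarity might be blended into cross-attention follows; the function signature, the additive combination, and the `topic_ratio` weighting are assumptions inferred from the abstract, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def topic_aware_cross_attention(q, k, v, topic_sim, topic_ratio=0.5):
    """q: (B, Lt, d) decoder queries; k, v: (B, Ls, d) encoder keys/values;
    topic_sim: (B, Lt, Ls) similarity between summary-side and dialogue-side
    topic distributions; topic_ratio: weight of the topic term (assumed)."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5   # standard attention logits
    scores = scores + topic_ratio * topic_sim     # inject topic similarity
    attn = F.softmax(scores, dim=-1)
    return attn @ v
```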

Task-Oriented Dialogue System Using a Fusion Module between Knowledge Graphs

Jinyoung Kim, Hyunmook Cha, Youngjoong Ko

http://doi.org/10.5626/JOK.2024.51.10.882

Task-oriented dialogue systems use natural language processing to help users accomplish specific tasks through conversation. Recently, transformer-based pre-trained language models have been employed to improve their performance. This paper proposes a response generation model based on Graph Attention Networks (GAT) that integrates external knowledge into transformer-based language models to produce more specialized responses. We further extend the approach to fuse information from two or more knowledge graphs. To evaluate the proposed model, we also collected and refined dialogue data grounded in a music-domain knowledge base; the resulting dataset consists of 2,076 dialogues and 226,823 triples. In experiments on this dataset, the proposed model improved on the baseline KoBART model by 13.83%p in ROUGE-1, 8.26%p in ROUGE-2, and 13.5%p in ROUGE-L.
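
For reference, a minimal single graph-attention layer of the kind the model builds on (Veličković et al., 2018) is sketched below; how its output is fused into the language model is not shown, and all dimensions are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATLayer(nn.Module):
    """One graph-attention layer over knowledge-graph node features."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, h, adj):
        # h: (N, in_dim) node features; adj: (N, N) 0/1 adjacency,
        # which should include self-loops so every row has a neighbor.
        z = self.W(h)                                      # (N, out_dim)
        N = z.size(0)
        pair = torch.cat([z.unsqueeze(1).expand(N, N, -1),
                          z.unsqueeze(0).expand(N, N, -1)], dim=-1)
        e = F.leaky_relu(self.a(pair).squeeze(-1))         # attention logits
        e = e.masked_fill(adj == 0, float('-inf'))         # neighbors only
        alpha = F.softmax(e, dim=-1)
        return alpha @ z                                   # aggregated features
```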

A Token Selection Method for Effective Token Pruning in Vision Transformers

Jaeyeon Lee, Dong-Wan Choi

http://doi.org/10.5626/JOK.2024.51.6.567

Vision transformers, self-attention-based models, have recently been widely adopted in computer vision. While they achieve excellent performance on a variety of tasks, their computation cost during inference grows with the number of tokens, which slows inference and constrains deployment in real-world scenarios. To address this issue, we propose a new token-importance measure obtained by modifying the multi-head self-attention structure of vision transformers. Pruning less important tokens with this measure during inference improves inference speed while preserving accuracy. The proposed method requires no additional parameters, remains robust without fine-tuning, and can be combined with existing token pruning methods to maximize performance.
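
One plausible instance of such an attention-derived importance score is sketched below: the attention each token receives from the CLS token, averaged over heads, with the top-k tokens kept. The exact measurement proposed in the paper may differ.

```python
import torch

def prune_tokens(x, attn, keep_ratio=0.7):
    """x: (B, N, d) tokens with CLS at index 0; attn: (B, H, N, N)
    self-attention weights. Keeps CLS plus the most-attended tokens."""
    cls_attn = attn[:, :, 0, 1:].mean(dim=1)            # (B, N-1) importance
    k = int(cls_attn.size(-1) * keep_ratio)
    idx = cls_attn.topk(k, dim=-1).indices + 1          # +1 to skip CLS
    idx, _ = idx.sort(dim=-1)                           # keep original order
    batch = torch.arange(x.size(0)).unsqueeze(-1)
    return torch.cat([x[:, :1], x[batch, idx]], dim=1)  # CLS + kept tokens
```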

A Hybrid Deep Learning Model for Generating Time-series Fire Data in Underground Utility Tunnel based on Convolutional Attention TimeGAN

Joseph Ahn, Hyo-gun Yoon

http://doi.org/10.5626/JOK.2024.51.6.490

Underground utility tunnels (UUTs) play a crucial role in urban operation and management. Fires are the most common disasters in these facilities, and demand is growing for fire management systems based on artificial intelligence (AI). Because fire data for AI training are difficult to collect, generation models that reproduce the key characteristics of real fires are a practical alternative. In this paper, we propose an approach for generating AI training data with CA-TimeGAN, a fire data generation model. To obtain simulation data for training, we modeled the UUT in Ochang, Chungbuk, in the Fire Dynamics Simulator (FDS) virtual environment. In experiments comparing data generated by TimeGAN and CA-TimeGAN, the discriminative score converged to 0.5 for both models, while the predictive score improved by 66.1% over models trained only on simulated data and by 22.9% over models that also used TimeGAN-generated data. PCA and t-SNE analyses showed that the distribution of the generated data closely matched that of the simulated data.
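
A guess at what the "convolutional attention" building block might look like inside a TimeGAN-style generator is sketched below: a 1D convolution for local temporal patterns followed by self-attention for long-range ones. The layer sizes and the overall arrangement are assumptions, not the published CA-TimeGAN architecture.

```python
import torch
import torch.nn as nn

class ConvAttentionBlock(nn.Module):
    """Local convolution + global self-attention over a time series."""
    def __init__(self, dim, heads=4, kernel=3):
        super().__init__()
        self.conv = nn.Conv1d(dim, dim, kernel, padding=kernel // 2)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):                 # x: (B, T, dim) sensor features
        x = x + self.conv(x.transpose(1, 2)).transpose(1, 2)
        out, _ = self.attn(x, x, x)       # long-range temporal dependencies
        return self.norm(x + out)
```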

SCA: Improving Document Grounded Response Generation based on Supervised Cross-Attention

Hyeongjun Choi, Seung-Hoon Na, Beomseok Hong, Youngsub Han, Byoung-Ki Jeon

http://doi.org/10.5626/JOK.2024.51.4.326

Document-grounded response generation aims to generate conversational responses "grounded" in factual evidence from a task-specific domain, such as consumer consultation or insurance planning, where the evidence comes from documents retrieved in response to the user’s question under the current dialogue context. In this study, we propose supervised cross-attention (SCA) to strengthen the response generation model’s ability to find and incorporate "response-salient snippets" (i.e., spans or contents): the parts of the retrieved document that should be included and maintained in the generated answer. SCA adds a supervised loss that concentrates cross-attention weights on these snippets, and this attention supervision enables the decoder to generate responses in a "saliency-grounding" manner by attending strongly to the important parts of the retrieved document. Experimental results on MultiDoc2Dial show that SCA combined with additional performance-improvement techniques raises F1 by 1.13 over the existing SOTA, with SCA alone contributing an increase of 0.25 F1.
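
A minimal sketch of such an attention-supervision loss follows: it penalizes cross-attention mass that falls outside the annotated salient snippets. The paper's exact formulation may differ from this assumed form.

```python
import torch

def sca_loss(cross_attn, salient_mask, eps=1e-8):
    """cross_attn: (B, Lt, Ls) decoder-to-document cross-attention weights
    (e.g., averaged over layers and heads); salient_mask: (B, Ls) with 1 on
    tokens inside response-salient snippets. Maximizes attention mass on
    the snippets by minimizing its negative log."""
    mass_on_salient = (cross_attn * salient_mask.unsqueeze(1)).sum(-1)  # (B, Lt)
    return -(mass_on_salient + eps).log().mean()
```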

Domain Generalized Fashion Object Detection using Style Augmentation and Attention

Youjin Chung, Jinah Park

http://doi.org/10.5626/JOK.2023.50.10.845

With the combination of fashion and computer vision, fashion object detection using deep learning has attracted much interest. However, owing to the nature of supervised learning, detection performance drops on images whose characteristics differ from the training data. We define a dataset with distinct characteristics as a ‘domain’ and the visual characteristics of a domain as its ‘style’, and propose a new augmentation method that mixes the styles of existing domains to synthesize new styles. We also apply an attention mechanism to extract important features from the images. On a stylized fashion detection dataset, style deepfashion2, we show that the proposed method improves performance across all domains.
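
One common way to realize such style mixing is to blend channel-wise feature statistics between instances, as in MixStyle (Zhou et al., 2021); the sketch below illustrates that idea. Whether the paper mixes statistics at the feature level or pixels at the image level is an assumption here.

```python
import torch

def mix_style(x, alpha=0.1):
    """x: (B, C, H, W) feature maps. Blends per-instance channel statistics
    across the batch to synthesize new 'styles' from existing domains."""
    B = x.size(0)
    mu = x.mean(dim=[2, 3], keepdim=True)
    sig = x.std(dim=[2, 3], keepdim=True) + 1e-6
    lam = torch.distributions.Beta(alpha, alpha).sample((B, 1, 1, 1)).to(x.device)
    perm = torch.randperm(B)
    mu_mix = lam * mu + (1 - lam) * mu[perm]    # blend statistics across instances
    sig_mix = lam * sig + (1 - lam) * sig[perm]
    return sig_mix * (x - mu) / sig + mu_mix    # re-stylize normalized features
```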

TwinAMFNet : Twin Attention-based Multi-modal Fusion Network for 3D Semantic Segmentation

Jaegeun Yoon, Jiyeon Jeon, Kwangho Song

http://doi.org/10.5626/JOK.2023.50.9.784

Recently, as accidents caused by misrecognition in autonomous driving have increased, so has interest in 3D semantic segmentation based on fusion of multi-modal sensors. This study introduces TwinAMFNet, a novel 3D semantic segmentation network that fuses RGB camera and LiDAR inputs. The network comprises a twin architecture that processes an RGB image alongside the point cloud projected onto a 2D coordinate plane, with an attention-based fusion module that fuses the two feature streams stage by stage in both encoder and decoder. The proposed method improves classification of extended objects and boundaries. As a result, the network achieves approximately 68% mIoU in 3D semantic segmentation, about 4.5% higher than results reported in existing studies.
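
A plausible form of such an attention-based fusion module is sketched below: a channel-attention gate computed from the concatenated modalities weights each stream. The actual TwinAMFNet module may be structured differently; this only illustrates the idea.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Gates RGB and LiDAR-projection feature maps with channel attention."""
    def __init__(self, ch):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * ch, ch, 1),   # attention from both modalities
            nn.Sigmoid(),
        )

    def forward(self, rgb_feat, lidar_feat):   # both (B, ch, H, W)
        g = self.gate(torch.cat([rgb_feat, lidar_feat], dim=1))  # (B, ch, 1, 1)
        return g * rgb_feat + (1 - g) * lidar_feat
```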

Non-autoregressive Korean Morphological Analysis with Word Segment Information

Seongmin Cho, Hyun-Je Song

http://doi.org/10.5626/JOK.2023.50.8.653

This paper introduces a non-autoregressive Korean morphological analyzer. The proposed analyzer uses a transformer encoder to encode a given sentence and two non-autoregressive decoders for morphological analysis: one generates the morpheme sequence and the other the corresponding POS tag sequence, which are then combined into the final analysis. The analyzer also leverages word-segment information within the sentence to predict the target sequence length, mitigating the performance degradation caused by incorrect length predictions. Experimental results show that the proposed model outperforms all non-autoregressive baselines and achieves accuracy comparable to an autoregressive Korean morphological analyzer while running nearly 14.76 times faster.
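
A skeleton of the described setup is sketched below: a shared encoder, a target-length predictor, and per-position heads standing in for the two parallel decoders. Dimensions and vocabulary sizes are placeholders, and the real model's decoders are more elaborate than these heads.

```python
import torch
import torch.nn as nn

class NARMorphAnalyzer(nn.Module):
    """Encoder + length predictor + parallel morpheme/POS predictions."""
    def __init__(self, vocab, morph_vocab, pos_vocab, d=256, max_len=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, d)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d, 4, batch_first=True), num_layers=2)
        self.len_head = nn.Linear(d, max_len)     # predicts target length
        self.morph_head = nn.Linear(d, morph_vocab)
        self.pos_head = nn.Linear(d, pos_vocab)

    def forward(self, tokens):
        h = self.encoder(self.embed(tokens))      # (B, L, d)
        length_logits = self.len_head(h.mean(1))  # length from pooled encoding
        # Non-autoregressive: both sequences predicted in parallel per position.
        return length_logits, self.morph_head(h), self.pos_head(h)
```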

A Knowledge Graph Embedding-based Ensemble Model for Link Prediction

Su Jeong Choi, Seyoung Park

http://doi.org/10.5626/JOK.2020.47.5.473

Knowledge bases often suffer from limited applicability because information about their entities and relations is missing. Link prediction has been investigated to complete this missing information and make a knowledge base more useful. Existing approaches to link prediction typically rely on knowledge graph embeddings and exhibit performance trade-offs. In this paper, we propose an ensemble model over knowledge graph embeddings to improve the quality of link prediction. The model combines multiple embeddings with distinct characteristics, allowing it to consider various aspects of the entries in a knowledge base and reducing the variation in accuracy across hyper-parameter settings. Our experiments show that the proposed model outperforms individual knowledge graph embedding methods by 13.5% on the WN18 and FB15K datasets.
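
A minimal sketch of one way to ensemble embedding-based scorers for link prediction follows: a weighted sum of min-max-normalized triple scores. The combination rule and the component models are illustrative choices, not necessarily the paper's exact scheme.

```python
import torch

def ensemble_scores(score_fns, h, r, t, weights=None):
    """Combines triple-plausibility scores from several embedding models.
    h, r, t: (num_candidates, d) embedding lookups for head, relation, tail."""
    weights = weights or [1.0] * len(score_fns)
    total = 0.0
    for w, fn in zip(weights, score_fns):
        s = fn(h, r, t)                                  # (num_candidates,)
        s = (s - s.min()) / (s.max() - s.min() + 1e-8)   # put models on one scale
        total = total + w * s
    return total                                         # rank candidates by this

# Example component scorers with distinct characteristics:
transe = lambda h, r, t: -(h + r - t).norm(p=1, dim=-1)  # translational
distmult = lambda h, r, t: (h * r * t).sum(dim=-1)       # bilinear
```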

Branchpoint Prediction Using Self-Attention Based Deep Neural Networks

Hyeonseok Lee, Sungchan Kim

http://doi.org/10.5626/JOK.2020.47.4.343

Splicing is the ribonucleic acid (RNA) process that produces a messenger RNA (mRNA) to be translated into protein. Branchpoints are RNA sequence elements essential to splicing. This paper proposes a novel method for branchpoint prediction, a task that poses several challenges: branchpoint sites are known to depend on several sequence patterns, called motifs, and the distribution of branchpoints is highly skewed, creating a class-imbalance problem. Existing approaches are limited in that they either rely on handcrafted sequence features or ignore the class imbalance. To address these difficulties, the proposed method incorporates 1) attention mechanisms to learn long-term, position-aware sequence dependencies, and 2) regularization with a triplet loss to alleviate the class imbalance. Our method matches state-of-the-art performance while providing rich interpretability of its decisions.
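
The triplet-loss regularization mentioned above could, for example, take the following standard form, pulling embeddings of same-class sites together and pushing different classes apart; the mining strategy and margin are assumptions.

```python
import torch
import torch.nn.functional as F

def triplet_regularizer(anchor, positive, negative, margin=1.0):
    """anchor/positive: embeddings of sites from the same class
    (branchpoint or non-branchpoint); negative: embeddings from the other
    class. Standard margin-based triplet loss over the embedding space."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()
```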

