Star Partitions of a Simple Polygon

An Hyo Jeong, Kim Hwi, Ahn Hee Kap

http://doi.org/10.5626/JOK.2024.51.12.1037

A polygon is star-shaped if there exists a point in the polygon that sees every point in the polygon. A star partition of a simple polygon is a partition of the polygon into star-shaped subpolygons, obtained by introducing diagonals inside the polygon. Each diagonal connects two distinct vertices of the polygon. Two distinct diagonals may share an endpoint, but they do not intersect at any point other than their endpoints. Given a simple n-gon P and a positive integer k, our algorithm determines whether there exists a star partition S of P with |S| ≤ k. The algorithm runs in O(n^(2k-2)) time using O(n) space. Previously, it was known that a minimum-size star partition of a simple polygon can be computed in O(n^7 log n) time, when k is not given as an input. Thus, our algorithm runs faster than the known algorithm when OPT ≤ 4, where OPT = |S*| for the minimum-size star partition S* of P.
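The defining property above can be checked directly for a candidate point: a point sees the whole polygon exactly when it lies in the polygon's kernel, the intersection of the half-planes bounded by the polygon's edges. The following minimal sketch (not the paper's partition algorithm, just the star-shapedness test it builds on) checks kernel membership with cross products, assuming vertices are given in counterclockwise order:

```python
def sees_everything(p, polygon):
    """Return True if point p lies in the kernel of a simple polygon,
    i.e. p sees every point of the polygon.

    polygon: list of (x, y) vertices in counterclockwise order.
    p is in the kernel iff it lies on or to the left of every directed edge.
    """
    n = len(polygon)
    for i in range(n):
        ax, ay = polygon[i]
        bx, by = polygon[(i + 1) % n]
        # Cross product of edge a->b with a->p; a negative value means p is
        # strictly right of the edge's supporting line, so p is not in the kernel.
        if (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax) < 0:
            return False
    return True

# A square is star-shaped from any interior point; an L-shaped polygon
# is star-shaped only from points in its kernel (the lower-left quadrant here).
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
lshape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
```

A star partition is then a set of subpolygons each of which passes this test from some witness point.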

Prompt Tuning For Korean Aspect-Based Sentiment Analysis

Bong-Su Kim, Seung-Ho Choi, Si-hyun Park, Jun-Ho Wang, Ji-Yoon Kim, Hyun-Kyu Jeon, Jung-Hoon Jang

http://doi.org/10.5626/JOK.2024.51.12.1043

Aspect-based sentiment analysis examines how emotions in text relate to specific aspects, such as product characteristics or service features. This paper presents a comprehensive methodology for applying prompt tuning techniques to multi-task token labeling challenges using aspect-based sentiment analysis data. The methodology includes a pipeline for identifying emotion expression domains, which generalizes the token labeling problem into a sequence labeling problem. It also suggests selecting templates to classify separated sequences based on aspects and emotions, and expanding label words to align with the dataset's characteristics, thus optimizing the model's performance. Finally, the paper provides several experimental results and analyses for the aspect-based sentiment analysis task in a few-shot setting. The constructed data and baseline model are available on AIHUB (www.aihub.or.kr).
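The template-selection and label-word-expansion steps can be illustrated with a toy verbalizer. The template string, label words, and aspect names below are purely illustrative assumptions, not the ones used in the paper; the sketch only shows the mechanics of scoring a class by summing masked-LM scores over its expanded label words:

```python
# Hypothetical template and expanded label words (illustrative only).
TEMPLATE = "{sequence} The {aspect} is [MASK]."
VERBALIZER = {
    "positive": ["good", "great", "satisfying"],
    "negative": ["bad", "poor", "disappointing"],
}

def build_prompt(sequence, aspect):
    """Wrap a separated sequence in a cloze-style template."""
    return TEMPLATE.format(sequence=sequence, aspect=aspect)

def classify(mask_word_scores):
    """Map masked-LM scores over the vocabulary to a sentiment class by
    summing the scores of each class's expanded label words."""
    best_label, best_score = None, float("-inf")
    for label, words in VERBALIZER.items():
        score = sum(mask_word_scores.get(w, 0.0) for w in words)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Expanding each class to several label words, as sketched here, is what lets the verbalizer align with dataset-specific vocabulary in the few-shot setting.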

Korean Dependency Parsing Using Sequence Labeling

Keunha Kim, Youngjoong Ko

http://doi.org/10.5626/JOK.2024.51.12.1053

Dependency parsing is a crucial step in language analysis. It identifies relationships between words within a sentence. Recently, many models based on pre-trained transformers have shown impressive performance in various natural language processing research. They have also been applied to dependency parsing. Generally, traditional approaches to dependency parsing using pre-trained models consist of two main stages: 1) merging token-level embeddings generated by the pre-trained model into word-level embeddings; and 2) analyzing dependency relations by comparing or classifying the merged embeddings. However, due to a large number of parameters and additional layers required for embedding construction, comparison, and classification, these models can be inefficient in terms of time and memory usage. This paper proposes a dependency parsing technique based on sequence labeling to improve the efficiency of training and inference by defining dependency parsing units and simplifying model layers. The proposed model eliminates the necessity of the word-level embedding merging step by utilizing special tokens to define parsing units. It also effectively reduces the number of parameters by simplifying model layers. As a result, the training and inference time is significantly shortened. With these optimizations, the proposed model maintains meaningful performance in dependency parsing.
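The idea of casting dependency parsing as sequence labeling can be sketched with one common labeling scheme, relative head offsets, where each word receives a single label encoding where its head is. This is a generic illustration of the reduction, not the paper's specific unit definition via special tokens:

```python
def encode_heads(heads):
    """Encode a dependency tree as per-word labels.

    heads[i] is the 1-based head index of word i+1 (0 = root).
    Each word's head becomes a relative-offset label such as '+1' or '-2'
    (or 'ROOT'), so parsing reduces to one label prediction per word.
    """
    labels = []
    for i, h in enumerate(heads, start=1):
        labels.append("ROOT" if h == 0 else f"{h - i:+d}")
    return labels

def decode_heads(labels):
    """Recover head indices from offset labels."""
    heads = []
    for i, lab in enumerate(labels, start=1):
        heads.append(0 if lab == "ROOT" else i + int(lab))
    return heads
```

Because every word gets exactly one label, a plain token-classification head suffices, which is what makes the sequence-labeling formulation lighter than biaffine comparison layers.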

Detecting CCTV Traffic Accidents and Automating Emergency Rescue Based on Deep Learning

Changhoon Park, Jihyeon Kim, Inhee Cho, Sunho Jang, Kihag Kwon

http://doi.org/10.5626/JOK.2024.51.12.1061

This paper presents a novel approach to real-time detection of traffic accidents using CCTV footage and provision of immediate information about nearby hospitals. By ensembling DenseNet121 and YOLOv8 models, the proposed system effectively identified the occurrence and type of traffic accidents. Based on accident location, the system searched for the nearest available emergency rooms and confirmed their capacity in real time. This enabled prompt delivery of accident details and hospital information to the user, addressing issues of delayed reporting and inefficient allocation of emergency room resources. This approach aims to reduce initial response time during traffic accidents, thereby maximizing the efficiency of emergency medical services and ultimately minimizing accident-related harm. Specifically, DenseNet121's deep neural network architecture effectively classified accident scenes in the footage, while YOLOv8's object detection algorithm identified accident types in real time.
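The abstract does not specify the ensembling rule, so the following is only a minimal sketch of one standard choice, weighted averaging of the two models' class probabilities; the class names and weight are illustrative assumptions:

```python
def ensemble_probs(cls_probs, det_probs, w=0.5):
    """Weighted average of two models' class-probability dicts
    (e.g. a classification branch and a detection branch).
    Classes missing from one model count as probability 0."""
    classes = set(cls_probs) | set(det_probs)
    return {c: w * cls_probs.get(c, 0.0) + (1 - w) * det_probs.get(c, 0.0)
            for c in classes}

def predict(cls_probs, det_probs, w=0.5):
    """Fused accident-type prediction: the class with the highest
    averaged probability."""
    fused = ensemble_probs(cls_probs, det_probs, w)
    return max(fused, key=fused.get)
```

In a deployed pipeline the two dicts would come from the classification and detection models' softmax/confidence outputs per frame.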

Explainable Video Search System using Token Space-based Representation

Jiyeol Park, Dooyoung Kim, Youngjoong Ko

http://doi.org/10.5626/JOK.2024.51.12.1068

Query-video retrieval is the task of finding the video most relevant to a query entered by the user. For this, existing studies embed the query and video in a suitable latent vector space. However, the relevance between the query and the video is typically computed as a simple dot product of the two vectors, which conveys neither meaning nor explainability. In this paper, we propose a model that converts the query and video into embeddings located in a token-based space, searches the video like a document, and calculates semantic similarity. Experimental results show that the final model proposed in this paper improves Recall@1, Recall@5, and Recall@10 over the baseline on the MSVD dataset. Furthermore, the proposed model is approximately 3.33 times faster than CLIP4Clip. When applying BM25 with minimal modifications, it achieves a speedup of about 208.11 times. Additionally, qualitative evaluations demonstrate that tokens extracted from videos exhibit relevance comparable to subtitles, supporting the model's explainability-based structure.
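Searching a video "like a document" means its extracted tokens can be scored against the query with classical term-based ranking. The sketch below implements the standard Okapi BM25 formula over toy token lists; the tokens are invented examples, and the paper's exact BM25 modifications are not reproduced:

```python
import math

def bm25_scores(query_tokens, docs, k1=1.5, b=0.75):
    """Score each token 'document' (e.g. tokens extracted from a video)
    against the query using the standard Okapi BM25 formula."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # Document frequency of each term.
    df = {}
    for d in docs:
        for t in set(d):
            df[t] = df.get(t, 0) + 1
    scores = []
    for d in docs:
        s = 0.0
        for t in query_tokens:
            f = d.count(t)              # term frequency in this document
            if f == 0:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

# Toy example: three "videos" represented by extracted tokens.
videos = [["dog", "runs", "park"], ["cat", "sleeps"], ["dog", "barks"]]
```

Because scores decompose into per-token contributions, each retrieved video can be explained by which of its tokens matched the query, which is the explainability property the paper highlights.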

Spatio-Temporal Modeling via Adaptive Frequency Filtering for Video Action Recognition

Minji Kim, Taehoon Kim, Jonghyeon Seon, Bohyung Han

http://doi.org/10.5626/JOK.2024.51.12.1078

Modeling long-term spatio-temporal dependencies in video data is challenging, as CNNs often struggle to capture global context through their local receptive fields. To address this problem, we propose an efficient global spatio-temporal modeling method that integrates easily with existing CNN models. Our approach utilizes Discrete Cosine Transform (DCT) to shift information into the frequency domain, where two adaptive filtering paths operate complementarily: one removes redundant frequencies while preserving essential information, and the other enhances important frequencies for spatio-temporal modeling. We introduce DynamicMNIST, a lightweight dataset featuring various digit behaviors like shifting, rotating, and scaling. Our evaluations on three public benchmarks and DynamicMNIST demonstrate that the proposed module enhances activity recognition performance across different CNN models with minimal additional parameters and computational costs.
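The core mechanism, transforming to the frequency domain, reweighting frequency bins, and transforming back, can be shown in one dimension. This is a naive DCT-II/inverse pair with a per-bin mask standing in for the two adaptive filtering paths; the learned, adaptive part of the paper's module is not reproduced here:

```python
import math

def dct(x):
    """Unnormalized DCT-II of a 1-D signal."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi / N * (n + 0.5) * k) for n in range(N))
            for k in range(N)]

def idct(X):
    """Inverse of the DCT-II above (a scaled DCT-III)."""
    N = len(X)
    return [X[0] / N + (2.0 / N) * sum(X[k] * math.cos(math.pi / N * (n + 0.5) * k)
                                       for k in range(1, N))
            for n in range(N)]

def filter_frequencies(x, mask):
    """Suppress or emphasize frequency bins with a per-bin mask, then
    transform back -- the basic operation behind both filtering paths
    (one removing redundant frequencies, one enhancing important ones)."""
    return idct([m * X for m, X in zip(mask, dct(x))])
```

In the paper's setting the mask would be predicted adaptively per input and applied over spatio-temporal frequency bins rather than a 1-D signal; an FFT/DCT library routine would replace the O(N^2) loops above.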

Development of a Software for Synthetic IR Image Generation

Chanuk Kyeong, Junyoung Shim, Youngann Woo, Sewon Kim, Wonsik Lee

http://doi.org/10.5626/JOK.2024.51.12.1088

As the importance of seekers, a critical factor in the hit rate of guided missiles, has become more prominent, interest in IR (infrared) seekers with high object detection performance is increasing. To improve the hit rate of IR seekers, it is essential to acquire IR images in various environments. However, capturing images with real cameras is costly, and collecting images in diverse environments is difficult. To address these issues, methods for generating IR images by calculating and analyzing IR signals have been developed. Recently, demand has grown for domestically developed IR signal analysis software, both to prevent the leakage of technology overseas and to analyze IR signals suited to domestic conditions. In this paper, a domestic software for synthetic IR image generation, which generates IR images with an IR signal analysis algorithm, is proposed. The software's IR signal analysis algorithm and its method for generating synthetic IR images are described in detail, and a parallel processing method to increase the speed of synthetic IR image generation is discussed. Simulation results confirm the synthetic IR images generated by the proposed software.
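The abstract does not disclose the software's signal model, so the following is only an illustrative sketch of the general idea: mapping a temperature field to spectral radiance with Planck's law (standard blackbody physics, not the paper's algorithm) and parallelizing the per-row computation:

```python
import math
from concurrent.futures import ThreadPoolExecutor

H = 6.62607015e-34   # Planck constant (J s)
C = 2.99792458e8     # speed of light (m/s)
KB = 1.380649e-23    # Boltzmann constant (J/K)

def planck_radiance(wavelength_m, temp_k):
    """Blackbody spectral radiance (W / m^2 / sr / m) from Planck's law."""
    a = 2.0 * H * C ** 2 / wavelength_m ** 5
    return a / (math.exp(H * C / (wavelength_m * KB * temp_k)) - 1.0)

def render_ir_image(temp_map, wavelength_m=4e-6, workers=4):
    """Map a 2-D temperature field (K) to a radiance image, computing
    one image row per parallel task."""
    def row_radiance(row):
        return [planck_radiance(wavelength_m, t) for t in row]
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return list(ex.map(row_radiance, temp_map))
```

A real IR signal model would add emissivity, atmospheric transmission, and sensor response; the row-parallel structure, however, is representative of how synthetic image generation is typically sped up.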

A Study on a 3D Convolution-Based Video Recognition System for Driving Aggressiveness Recognition

Sangin Lee, Jihun Park

http://doi.org/10.5626/JOK.2024.51.12.1094

This study aims to develop and test a model for classifying driving styles and recognizing driving aggressiveness using video data collected from a vehicle's front camera. To achieve this, the CARLA simulator was employed to simulate aggressive and cautious driving behaviors across various road environments, while a 3D convolution-based VideoResNet model was utilized for analyzing the video data. The results showed that the trained model achieved high accuracy in classifying driving styles during urban driving scenarios, demonstrating the effectiveness of front camera data in recognizing driving aggressiveness. Furthermore, experiments confirmed the model's capability to classify driving styles in an online manner, highlighting its potential as an on-the-spot tool for recognizing driving aggressiveness. Additionally, this study investigated the effect of road environments and speed variations on aggressiveness scores, demonstrating that the model can effectively consider the interplay between road complexity and speed when making its predictions.
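The distinguishing operation of models like VideoResNet is 3D convolution, which slides a kernel over time as well as space. The loop-based sketch below shows the operation on a raw (T, H, W) volume (as in deep-learning practice it is actually cross-correlation); real models apply it over channels with learned kernels via a framework, not nested Python loops:

```python
def conv3d(video, kernel):
    """Valid-mode 3-D convolution (cross-correlation, deep-learning
    convention) of a (T, H, W) volume with a (kt, kh, kw) kernel."""
    T, H, W = len(video), len(video[0]), len(video[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):          # slide over time
        plane = []
        for i in range(H - kh + 1):      # slide over height
            row = []
            for j in range(W - kw + 1):  # slide over width
                s = sum(video[t + dt][i + di][j + dj] * kernel[dt][di][dj]
                        for dt in range(kt)
                        for di in range(kh)
                        for dj in range(kw))
                row.append(s)
            plane.append(row)
        out.append(plane)
    return out
```

Because the kernel spans consecutive frames, each output value mixes motion information across time, which is what lets such models pick up aggressiveness cues like abrupt scene changes.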

LLMEE: Enhancing Explainability and Evaluation of Large Language Models through Visual Token Attribution

Yunsu Kim, Minchan Kim, Jinwoo Choi, Youngseok Hwang, Hyunwoo Park

http://doi.org/10.5626/JOK.2024.51.12.1104

Large Language Models (LLMs) have made significant advancements in Natural Language Processing (NLP) and generative AI. However, their complex structure poses challenges in terms of interpretability and reliability. To address this issue, this study proposed LLMEE, a tool designed to visually explain and evaluate the prediction process of LLMs. LLMEE visually represents the impact of each input token on the output, enhancing model transparency and providing insights into various NLP tasks such as summarization, question answering, and text generation. Additionally, it integrates evaluation metrics such as ROUGE, BLEU, and BLEURTScore, offering both quantitative and qualitative assessments of LLM outputs. LLMEE is expected to contribute to more reliable evaluation and improvement of LLMs in both academic and industrial contexts by facilitating a better understanding of their complex workings and by providing enhanced output quality assessments.
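One generic way to measure "the impact of each input token on the output" is occlusion-based attribution: mask one token at a time and record the drop in the model's score. The abstract does not state which attribution method LLMEE uses, so this sketch only illustrates the general token-attribution idea with an arbitrary scoring callable:

```python
def token_attributions(tokens, score_fn):
    """Occlusion-based attribution: each token's importance is the drop
    in the model's output score when that token is masked out.

    score_fn: any callable mapping a token list to a scalar score
    (in practice, e.g. the log-probability an LLM assigns to its output).
    """
    base = score_fn(tokens)
    attrs = []
    for i in range(len(tokens)):
        occluded = tokens[:i] + ["[MASK]"] + tokens[i + 1:]
        attrs.append(base - score_fn(occluded))
    return attrs
```

The resulting per-token scores are exactly the kind of values a tool like LLMEE would render as a heat map over the input.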

A Study on Improving the Accuracy of Korean Speech Recognition Texts Using KcBERT

Donguk Min, Seungsoo Nam, Daeseon Choi

http://doi.org/10.5626/JOK.2024.51.12.1115

In the field of speech recognition, models such as Whisper, Wav2Vec2.0, and Google STT are widely utilized. However, Korean speech recognition faces challenges because complex phonological rules and diverse pronunciation variations hinder performance improvements. To address these issues, this study proposed a method that combined the Whisper model with a post-processing approach using KcBERT. By applying KcBERT’s bidirectional contextual learning to text generated by the Whisper model, the proposed method could enhance contextual coherence and refine the text for greater naturalness. Experimental results showed that post-processing reduced the Character Error Rate (CER) from 5.12% to 1.88% in clean environments and from 22.65% to 10.17% in noisy environments. Furthermore, the Word Error Rate (WER) was significantly improved, decreasing from 13.29% to 2.71% in clean settings and from 38.98% to 11.15% in noisy settings. BERTScore also exhibited overall improvement. These results demonstrate that the proposed approach is effective in addressing complex phonological rules and maintaining text coherence within Korean speech recognition.
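The CER and WER figures above are both ratios of Levenshtein edit distance to reference length, computed over characters and words respectively. A minimal reference implementation:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (characters or words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution (0 if equal)
        prev = cur
    return prev[-1]

def cer(ref, hyp):
    """Character Error Rate: character edits / reference length."""
    return edit_distance(list(ref), list(hyp)) / len(ref)

def wer(ref, hyp):
    """Word Error Rate: word edits / reference word count."""
    return edit_distance(ref.split(), hyp.split()) / len(ref.split())
```

For example, lowering CER from 5.12% to 1.88% means the post-processed text needs roughly a third as many character edits to match the reference transcript.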

Efficient Dynamic Graph Processing Based on GPU Accelerated Scheduling and Operation Reduction

Sangho Song, Jihyeon Choi, Donghyeon Cha, Hyeonbyeong Lee, Dojin Choi, Jongtae Lim, Kyoungsoo Bok, Jaesoo Yoo

http://doi.org/10.5626/JOK.2024.51.12.1125

Recent research has focused on utilizing GPUs to process large-scale dynamic graphs. However, processing dynamic graphs often leads to redundant data transmission and processing. This paper proposes an efficient scheme for processing large-scale dynamic graphs in memory-constrained GPU environments. The proposed scheme consists of dynamic scheduling and operation reduction methods. The dynamic scheduling method partitions the dynamic graph and maximizes GPU processing power by scheduling partitions based on active and potentially active vertices. Also, snapshots are utilized to leverage the time-varying characteristics of the graph. The operation reduction method minimizes GPU computation and memory transfer costs by detecting redundant edge and vertex updates in dynamic graphs through snapshots. By avoiding redundant operations on the same edges or vertices, this method improves performance. Through various performance evaluations, the proposed scheme showed 280% and 108% performance improvements on average compared to a static graph processing scheme and a dynamic graph processing scheme, respectively.
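The operation reduction idea can be illustrated with a toy update-stream compactor: between two snapshots, an edge insertion cancelled by a later deletion (or vice versa) has no net effect and need never be transferred to the GPU. This is only an illustrative sketch of that principle, assuming updates are applied relative to the previous snapshot, not the paper's actual mechanism:

```python
def reduce_updates(updates):
    """Collapse a stream of (op, edge) updates between two snapshots so
    that only each edge's net effect remains.

    op is "add" or "del". An "add" followed by a "del" of the same edge
    (or the reverse) cancels out; repeated identical updates are kept once.
    """
    last = {}
    for op, edge in updates:
        if edge in last and last[edge] != op:
            del last[edge]        # opposite operations cancel to a no-op
        else:
            last[edge] = op       # first or repeated identical update
    return [(op, edge) for edge, op in last.items()]
```

Only the reduced list would then be transferred to and applied on the GPU, cutting both computation and memory-transfer cost.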


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr