A Software-based Secure Disaggregated Memory System on Commodity Servers

Yewon Yong, Taehoon Kim, Sungho Lee, Changdae Kim

http://doi.org/10.5626/JOK.2024.51.9.757

A disaggregated memory system is a technology that consolidates memory from multiple servers. While this technique provides large amounts of memory for applications, it also poses serious security threats due to sensitive data transmission between servers. Several studies have addressed this issue by relying on specialized hardware. However, such hardware not only adds cost but is also difficult to adopt on commodity servers because of compatibility issues. In this paper, we propose a software-based mechanism to ensure the security of disaggregated memory systems. Our approach aims to prevent security threats by performing encryption and integrity verification on data transmitted between servers within a disaggregated memory system. To minimize the performance overhead associated with software implementation, our approach overlaps data transmission and decryption, and encrypts only private data. In addition, we optimize the size of encryption metadata to reduce memory overhead. Through empirical evaluations, we demonstrate that our proposed software-based security mechanism incurs negligible additional performance overhead, particularly when the performance overhead from the disaggregated memory system is already minimal.

Deep Learning-Based Abnormal Event Recognition Method for Detecting Pedestrian Abnormal Events in CCTV Video

Jinha Song, Youngjoon Hwang, Jongho Nang

http://doi.org/10.5626/JOK.2024.51.9.771

With the growing number of CCTV installations, the monitoring workload has increased significantly, and simply adding personnel has reached its limits as a solution. Intelligent CCTV technology has been developed to overcome this problem, but its performance degrades in various situations. This paper proposes a robust and versatile method for integrated abnormal behavior recognition in CCTV footage that can be applied in multiple situations. The method extracts frame images from videos and uses both raw images and heatmap representation images as inputs. It extracts feature vectors through merging methods at both the image and feature vector levels. Based on these vectors, we propose an abnormal behavior recognition method utilizing 2D CNN models, 3D CNN models, LSTM, and average pooling. We defined minor classes for performance validation and generated 1,957 abnormal behavior video clips for testing. The proposed method is expected to improve the accuracy of abnormal behavior recognition in CCTV footage, thereby enhancing the efficiency of security and surveillance systems.

A Comparative Study on Server Allocation Optimization Algorithms for Accelerating Parallel Training of Large Language Models

Jinkyu Yim, Yerim Choi, Jinho Lee

http://doi.org/10.5626/JOK.2024.51.9.783

As large language models (LLMs) are increasingly utilized in various fields, demand is growing for models with higher performance. Training such models requires significant computational power and memory capacity. Therefore, researchers have used 3D parallelization to train LLMs on numerous GPU-equipped servers. However, 3D parallelization requires frequent large-scale data transfers between servers, which bottlenecks the overall training time. To address this, prior studies have proposed methods that identify non-uniform cluster network conditions in advance and arrange servers and GPUs in an optimized parallel configuration. Existing methods of this type use the classical optimization algorithm simulated annealing (SA) for mapping. In this paper, we apply genetic algorithms as well as SAT (satisfiability) algorithms to the problem, and compare and analyze the performance of each algorithm under various experimental environments.
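The simulated annealing baseline mentioned above can be illustrated with a minimal sketch. It is not the method from the paper or the prior studies: the cost model (sum of link costs between consecutive pipeline stages), the function names, and the cooling schedule are all assumptions chosen for illustration.

```python
import math
import random

def placement_cost(order, link_cost):
    """Hypothetical cost: sum of link costs between consecutive pipeline stages."""
    return sum(link_cost[a][b] for a, b in zip(order, order[1:]))

def anneal_placement(link_cost, steps=5000, t_start=1.0, t_end=1e-3, seed=0):
    """Search for a server ordering with low total link cost (needs >= 2 servers)."""
    rng = random.Random(seed)
    n = len(link_cost)
    order = list(range(n))          # start from the identity placement
    cost = placement_cost(order, link_cost)
    best, best_cost = order[:], cost
    for step in range(steps):
        # Geometric cooling from t_start down towards t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.sample(range(n), 2)
        order[i], order[j] = order[j], order[i]       # propose a swap
        new_cost = placement_cost(order, link_cost)
        accept = new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp)
        if accept:
            cost = new_cost
            if cost < best_cost:
                best, best_cost = order[:], cost
        else:
            order[i], order[j] = order[j], order[i]   # undo the swap
    return best, best_cost
```

A genetic or SAT-based solver would replace the swap-and-accept loop while keeping the same cost function, which is what makes a side-by-side comparison of the algorithms possible.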

Octave-YOLO: Direct Multi-scale Feature Fusion for Object Detection

Sangjune Shin, Dongkun Shin

http://doi.org/10.5626/JOK.2024.51.9.792

In object detection research, multiscale feature fusion—combining feature maps of different scales to detect objects of varying sizes—has become a critical focus. Network structures like Feature Pyramid Networks (FPNs) and Path Aggregation Networks (PANets) have been developed to address this challenge. PANet, an enhancement of FPN, integrates both top-down and bottom-up pathways, leading to significant improvements in object detection performance. However, during multiscale feature fusion, PANet’s upscaling and downscaling processes can result in the loss of crucial low- or high-level information from the original feature maps. In this paper, we introduce the Octave C2f module, which employs octave convolution to seamlessly fuse feature maps of different sizes without the need for additional processing. This innovative approach enhances accuracy while reducing computational complexity. Experimental results on the PASCAL VOC and MS COCO datasets demonstrate improved accuracy, reduced computational effort, and a decrease in parameter count compared to the default YOLOv8 model.

Understanding Video Semantic Structure with Spatiotemporal Graph Random Walk

Hoyeoung Yun, Minseo Kim, Eun-Sol Kim

http://doi.org/10.5626/JOK.2024.51.9.801

Understanding a long video focuses on finding various semantic units present in the video and interpreting complex relationships among them. Conventional approaches utilize models based on CNNs or transformers to encode contextual information for short clips and then consider temporal relationships among them. However, such approaches struggle to capture complex relationships among smaller semantic units within video clips. In this paper, we represent video inputs as a spatiotemporal graph with objects as vertices and relative space-time information between objects as edges, to explicitly express relationships among these semantic units. Additionally, we propose a novel method to represent major semantic units as compositions of smaller units using high-order relationship information obtained by spatiotemporal random walks on the graph. Through experiments on the CATER dataset, which involves complex actions of multiple objects, we demonstrate that our approach effectively captures semantic units.
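As a rough illustration of a random walk over such a graph, the sketch below walks an adjacency list whose vertices are (object, frame) pairs. The vertex encoding, the uniform neighbor choice, and the function name are assumptions; the abstract does not specify how the walks are sampled.

```python
import random

def random_walk(graph, start, length, seed=None):
    """Return a walk of up to `length` steps over an adjacency-list graph."""
    rng = random.Random(seed)
    walk = [start]
    for _ in range(length):
        neighbors = graph.get(walk[-1], [])
        if not neighbors:           # dead end: stop early
            break
        walk.append(rng.choice(neighbors))  # uniform choice (assumption)
    return walk

# Hypothetical toy graph: two objects tracked over two frames.
toy_graph = {
    ("cone", 0): [("ball", 0), ("cone", 1)],
    ("ball", 0): [("cone", 0), ("ball", 1)],
    ("cone", 1): [("ball", 1)],
    ("ball", 1): [("cone", 1)],
}
```

Each walk visits a chain of objects linked in space or time, so a collection of walks exposes relationships that span more than two objects.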

A Study on Sales Prediction Model Based on BiLSTM-GAT Using Credit Card Transaction Data

Wonseok Jung, Dohyung Kim, Young Ik Eom

http://doi.org/10.5626/JOK.2024.51.9.807

Sales prediction using credit card transaction data is essential for understanding consumer buying patterns and market trends. However, traditional statistical and machine learning models have limitations when it comes to analyzing temporal features and the relationships between different variables, such as geographical data and sales information by service types, population, and transaction times. This paper proposes two models that can simultaneously analyze the relationships based on commercial district features and sales time-series features. To evaluate the performance of these models, we constructed graphs based on the distances and sales similarity of features between commercial districts. We then compared the performance of the proposed models with traditional time-series models, namely LSTM and BiLSTM. The results of the experiment showed that the GAT-BiLSTM model improved prediction accuracy by approximately 15% compared to the BiLSTM model, while the BiLSTM-GAT model improved it by about 29% over the BiLSTM model, as measured by RMSE.
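For readers comparing the reported figures, the sketch below shows RMSE and one common way to express an improvement over a baseline as a percentage. The abstract does not state how its percentages were derived, so the relative-reduction formula and the function names here are assumptions.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

def relative_improvement(baseline_rmse, model_rmse):
    """Percentage reduction in RMSE relative to the baseline (assumed definition)."""
    return (baseline_rmse - model_rmse) / baseline_rmse * 100.0
```

Under this definition, a model whose RMSE is 7.1 against a baseline of 10.0 would count as an improvement of about 29%.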

KULLM: Learning to Construct Korean Instruction-Following Large Language Models

Seungjun Lee, Yoonna Jang, Jeongwook Kim, Taemin Lee, Heuiseok Lim

http://doi.org/10.5626/JOK.2024.51.9.817

The emergence of Large Language Models (LLMs) has revolutionized the research paradigm in natural language processing. While instruction-tuning techniques have been pivotal in enhancing LLM performance, the majority of current research has focused predominantly on English. This study addresses the need for multilingual approaches by presenting a method for developing and evaluating Korean instruction-following models. We fine-tuned LLMs using Korean instruction datasets and conducted a comprehensive performance analysis using various dataset combinations. The resulting Korean instruction-following model is made available as an open-source resource, contributing to the advancement of Korean LLM research. Our work aims to bridge the language gap in LLM development and promote more inclusive AI technologies.

Information Retrieval-based Bug Localization for Korean Bug Reports using Translation

Misoo Kim

http://doi.org/10.5626/JOK.2024.51.9.827

Information retrieval-based bug localization uses bug reports as queries to automatically identify faulty source files, significantly reducing the time developers spend locating bugs. The core of this technique lies in calculating text similarity between bug reports and source files. However, for bug reports written in Korean, text similarity might not be effective because it is difficult to match their words with source code, which is primarily written in English. This study proposes an information retrieval-based bug localization technique for Korean bug reports using translation, enabling Korean developers to use this technique effectively. We also apply a soft voting method to leverage the outputs of multiple translators. To validate the performance of the proposed technique, we collected 269 Korean bug reports and conducted experiments using three translators and two ranking models. Experimental results showed that the proposed method improved bug localization performance by 44% compared to baselines.
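Soft voting across translators can be sketched as averaging per-file scores. This is only an illustration under assumptions: the min-max normalization, the equal weighting, and the function name are not taken from the paper, and each translator's score table is assumed to be non-empty.

```python
from collections import defaultdict

def soft_vote(score_lists):
    """Average min-max normalized scores per file and rank files best first."""
    totals = defaultdict(float)
    for scores in score_lists:               # one dict per translator
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        for path, score in scores.items():
            normalized = (score - lo) / span if span else 0.0
            totals[path] += normalized / len(score_lists)
    # Sort by descending averaged score; ties broken by file name.
    return sorted(totals, key=lambda p: (-totals[p], p))
```

Averaging scores rather than counting top-ranked votes lets a file that every translator ranks moderately high beat a file that only one translator ranks first.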

Height and Texture Modeling of Road Surfaces for Camera-Based HDMap Construction

Changhee Won, Jongwoo Lim

http://doi.org/10.5626/JOK.2024.51.9.835

With increasing demand for the construction and updating of HD maps for autonomous vehicles, multi-camera systems are being used as cost-effective sensors for Mobile Mapping Systems (MMS). Stereo matching among multi-view images, image feature point matching, and visual localization are utilized for such camera-based 3D map reconstruction. In this paper, we propose a methodology for estimating the height of road surfaces and registering their texture, utilizing a hexgrid model, keyframe poses, and 3D point clouds based on multi-view images. The proposed methodology can reconstruct a high-density, high-accuracy 3D point cloud of road surfaces based on a multi-camera system mounted on the upper part of a vehicle. Our experimental results showed that the proposed method created a precise road model with a minimum point spacing of 0.025 m on an MMS equipped with ultra-wide-angle fisheye cameras and GPS.

Model Contrastive Federated Learning on Re-Identification

Seongyoon Kim, Woojin Chung, Sungwoo Cho, Yongjin Yang, Shinhyeok Hwang, Se-Young Yun

http://doi.org/10.5626/JOK.2024.51.9.841

Advances in data collection and computing power have dramatically increased the integration of AI technology into various services. Traditional centralized cloud data processing raises concerns over the exposure of sensitive user data. To address these issues, federated learning (FL) has emerged as a decentralized training method where clients train models locally on their data and send locally updated models to a central server. The central server aggregates these locally updated models to improve a global model without directly accessing local data, thereby enhancing data privacy. This paper presents FedCON, a novel FL framework specifically designed for re-identification (Re-ID) tasks across various domains. FedCON integrates contrastive learning with FL to enhance feature representation, which is crucial for Re-ID tasks that emphasize similarity between feature vectors to match identities across different images. By focusing on feature similarity, FedCON can effectively address data heterogeneity challenges and improve the global model's performance in Re-ID applications. Empirical studies on person and vehicle Re-ID datasets demonstrated that FedCON outperformed existing FL methods for Re-ID. Our experiments with FedCON on various CCTV datasets for person Re-ID showed superior performance to several baselines. Additionally, FedCON significantly enhanced vehicle Re-ID performance on real-world datasets such as VeRi-776 and VRIC, demonstrating its practical applicability.
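The server-side aggregation step described above can be sketched as a weighted average of client parameters. This is a generic FedAvg-style illustration, not FedCON's actual aggregation rule, which the abstract does not detail. The flat list representation of model parameters and the function name are assumptions.

```python
def aggregate(client_params, client_sizes):
    """Average client parameter vectors, weighted by local dataset size."""
    if len(client_params) != len(client_sizes) or not client_params:
        raise ValueError("need one dataset size per client")
    total = sum(client_sizes)
    n_params = len(client_params[0])
    return [
        sum(params[k] * size for params, size in zip(client_params, client_sizes)) / total
        for k in range(n_params)
    ]
```

The server only ever sees the parameter vectors and dataset sizes, never the local images, which is the privacy property the abstract refers to.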


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr