Search : [ author: 김유성 ] (7)

Pseudo-label Correction using Large Vision-Language Models for Enhanced Domain-adaptive Semantic Segmentation

Jeongkee Lim, Yusung Kim

http://doi.org/10.5626/JOK.2024.51.5.464

Creating semantic segmentation labels for real-world images is very expensive. To address this problem, unsupervised domain adaptation trains the model on labeled data that is either generated in a virtual environment where labels are easy to collect or already collected, together with real-world images without labels. One of the common problems in unsupervised domain adaptation is that thing classes with similar appearance are easily confused. In this paper, we propose a method of correcting the pseudo-labels of the target data using large vision-language models. Making the pseudo-labels generated for the target images more accurate can reduce confusion among thing classes. The proposed method improves the performance of DAFormer by +1.1 mIoU in adaptation from game to reality and by +1.1 mIoU in adaptation from day to night. For thing classes, the proposed method improves the performance of MIC by +0.6 mIoU in adaptation from game to reality and by +0.7 mIoU in adaptation from day to night.
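For readers unfamiliar with the mIoU metric quoted above, the following is a minimal sketch of how mean intersection-over-union is commonly computed. It is not the authors' evaluation code; the function name `mean_iou`, the flat-list input format, and the `ignore_index=255` convention are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that occur in the predictions or the labels.

    pred, target: flat sequences of integer class ids (hypothetical format).
    Predictions are assumed to lie in [0, num_classes).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not count toward the score
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            # A mismatch enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

In this toy case class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. Gains such as +1.1 mIoU are usually differences in this score expressed in percentage points.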

Model Architecture Analysis and Extension for Improving RF-based Multi-Person Pose Estimation Performance

SeungHwan Shin, Yusung Kim

http://doi.org/10.5626/JOK.2024.51.3.262

An RF-based multi-person pose estimation system can estimate each person's posture even when clear visibility is hard to obtain because of obstacles or lighting conditions. Traditionally, a cross-modal teacher-student learning approach has been employed. The approach obtains pseudo-label data by feeding images captured concurrently with RF signal collection into a pretrained image-based pose estimation model. In a previous study, the research team applied cross-modal knowledge distillation to mimic the feature maps of image-based learning models and referred to the result as "visual cues." This enhanced the performance of RF signal-based pose estimation. In this paper, performance is compared based on the ratio at which the learned visual cues are concatenated, and an analysis of the impact of segmentation mask learning and the use of multiframe inputs on multi-person pose estimation performance is presented. It is demonstrated that the best performance is achieved when visual cues and multiframe inputs are used in combination.

Open-source-based 5G Access Network Security Vulnerability Automated Verification Framework

Jewon Jung, Jaemin Shin, Sugi Lee, Yusung Kim

http://doi.org/10.5626/JOK.2023.50.6.531

Recently, various open-source projects based on 5G standards have emerged and are widely used in research to find 5G control plane security vulnerabilities. However, leveraging these projects requires extensive knowledge of complex source code, wireless communication devices, and voluminous 5G security standards. Therefore, in this paper, we propose a framework for the automatic verification of security vulnerabilities in the 5G control plane. This framework builds a 5G network using commercial Software Defined Radio (SDR) equipment and open-source software and implements a Man-in-the-Middle (MitM) attacker to deploy a control plane attack test bed. It also implements control plane message decoding and correction modules to execute message spoofing attacks and automatically classifies security vulnerabilities in 5G networks. In addition, a GUI-based web user interface is implemented so that users can create MitM attack scenarios and check the verification results themselves.

1×1 UWB-based Human Pose Estimation Using Transformer

Seunghyun Kim, Keunhong Chae, Seunghwan Shin, Yusung Kim

http://doi.org/10.5626/JOK.2022.49.4.298

Estimating a human’s pose in a specific space from an image is one of the main areas of computer vision and is an important technology that can be used in various fields such as games, medical care, disaster response, firefighting, and the military. Combined with machine learning, the accuracy of pose estimation has improved greatly. However, the image-based approach is limited in that it is difficult to estimate a pose when part or all of the body is occluded by obstacles or when the lighting is dark. Recently, studies have emerged that estimate a human pose using wireless signals, which have the advantage of penetrating obstacles without being affected by brightness. The prevailing assumption has been that two or more pairs of transceivers are required to estimate a specific location based on wireless signals. This paper shows that it is possible to estimate the human pose and to perform body segmentation by applying deep learning to ultra-wideband signals collected by only a 1×1 transceiver. We also propose a method that replaces convolutional neural networks with transformer models and achieves better performance.

Person Re-Identification Using an Attention Pyramid for Local Multiscale Feature Embedding Extracted from a Person’s Image

Kwangho Song, Yoo-Sung Kim

http://doi.org/10.5626/JOK.2021.48.12.1305

In this paper, a person re-identification scheme is proposed that uses a dual pyramid with attention mechanisms to extract a more elaborate local feature embedding by excluding the noise caused by unnecessary background in a person’s image. In the dual pyramid, which consists of a local pyramid and a scale pyramid, spatial attention is used to suppress the noise effects caused by unnecessary background, and channel attention is used to emphasize the relatively important multiscale features when the local feature embedding is constructed. In the experiments, the proposed scheme was compared with configurations in which the attention module is not used for each pyramid to confirm the optimal configuration, and it was compared with state-of-the-art person re-identification studies in terms of rank-1 accuracy. According to the experimental results, the proposed method showed a maximum rank-1 accuracy of 99.4%, which is higher than previous works by at least about 0.2% and at most about 13.8%.
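The rank-1 accuracy reported above can be illustrated with a minimal sketch: for each query embedding, find the nearest gallery embedding and check whether the identities match. This is a simplified illustration, not the authors' evaluation protocol; the function name `rank1_accuracy`, the use of Euclidean distance, and the toy data are assumptions, and standard re-identification benchmarks typically apply extra filtering (for example, excluding gallery images from the same camera) that is omitted here.

```python
import math


def rank1_accuracy(query_feats, query_ids, gallery_feats, gallery_ids):
    """Fraction of queries whose nearest gallery embedding has the same identity."""
    hits = 0
    for q_feat, q_id in zip(query_feats, query_ids):
        # Index of the gallery embedding closest to the query (Euclidean distance).
        nearest = min(
            range(len(gallery_feats)),
            key=lambda g: math.dist(q_feat, gallery_feats[g]),
        )
        hits += gallery_ids[nearest] == q_id
    return hits / len(query_feats)


# Toy embeddings: the first query matches correctly, the second does not.
acc = rank1_accuracy(
    query_feats=[[0.0, 0.0], [1.0, 1.0]],
    query_ids=[1, 2],
    gallery_feats=[[0.0, 0.1], [1.0, 0.9], [5.0, 5.0]],
    gallery_ids=[1, 3, 2],
)
```

Here `acc` is 0.5, since only the first query's nearest gallery entry carries the matching identity.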

A Recognition of Violence Using Mobile Sensor Fusion in Intelligent Video Surveillance Systems

HyunIn Cha, KwangHo Song, Yoo-Sung Kim

http://doi.org/10.5626/JOK.2018.45.6.533

In this paper, we propose a violence recognition model for intelligent CCTV that reflects features extracted from concurrent and continuous actions by detecting a group ROI (Region of Interest) in the image. The proposed model then fuses the motion information obtained by applying the Dense Optical Flow algorithm within the ROI with the acceleration and angular velocity information obtained from the inertial measurement unit of the mobile device carried by the actor. Experiments were performed to evaluate how much the proposed model reduces computation time and how much it mitigates the performance degradation caused by occlusion. In the experiments, execution was about 51 times faster and the accuracy of violence recognition improved by 11% compared to previous research methods. Therefore, the proposed model can overcome the failure to operate in real time caused by excessive computation and can address the invisibility caused by occlusion of actors in the image when recognizing violence.

Automatic Keyword Extraction using Hierarchical Graph Model Based on Word Co-occurrences

KwangHo Song, Yoo-Sung Kim

http://doi.org/

Keyword extraction can be utilized in text mining of massive document collections to efficiently extract subject words or related words from a document. In this study, we propose a hierarchical graph model based on the co-occurrence relationship, the intrinsic dependency relationship between words, and common sub-words in a single document. In addition, an enhanced TextRank algorithm that can reflect the influence of outgoing edges as well as that of incoming edges is proposed. Subsequently, a novel keyword extraction scheme using the proposed hierarchical graph model and the enhanced TextRank algorithm is proposed to extract representative keywords from a single document. In the experiments, various evaluation methods were applied to documents on various subjects in order to verify the accuracy and adaptability of the proposed scheme. As a result, the proposed scheme showed better performance than the previous schemes.
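As background for the abstract above, the following is a minimal sketch of the baseline TextRank idea on a word co-occurrence graph. It is plain TextRank on an undirected graph, not the hierarchical graph model or the enhanced algorithm proposed in the paper; the function name `textrank_keywords` and the parameters `window`, `damping`, `iterations`, and `top_k` are assumptions chosen for illustration.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, damping=0.85, iterations=50, top_k=3):
    """Rank words by a PageRank-style score on an undirected co-occurrence graph."""
    # Link each word to the distinct words among the next `window` tokens.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window + 1]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        # Each word receives an equal share of every neighbor's current score.
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# "hub" co-occurs with every other token, so it should rank first.
top = textrank_keywords(["a", "hub", "b", "hub", "c", "hub", "d"], window=1, top_k=1)
```

Because the graph here is undirected, incoming and outgoing edges are not distinguished; a variant that weights them separately, as the abstract describes, would need a directed graph.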






Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr