Search : [ author: Hyun Lee ] (13)

Attention Map-Based Automatic Masking for Object Swapping in Diffusion Models

Soohyun Lee, Jongyoul Park

http://doi.org/10.5626/JOK.2025.52.4.284

Keywords: latent diffusion model, stable diffusion, text-to-image model, object swapping, automatic masking

Diffusion models have gained significant traction in text-to-image generation. The advent of Null-Text Inversion has opened up new avenues for image editing by inverting real images into noise and applying modifications. However, most image editing methods, particularly those involving object manipulation, require user-defined masks, which means an additional masking model must be incorporated into the pipeline. This complicates the inference process, which ideally should be streamlined within a single model. This paper proposes AutoMask, an attention-based automatic object masking method that uses the attention maps inherent in diffusion models to generate masks during inference. Unlike conventional approaches, AutoMask leverages information obtained from the inversion step, eliminating the need for user intervention in masking. Experiments demonstrated the effectiveness of AutoMask in generating novel objects.
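As a rough illustration of turning an attention map into a mask, the sketch below min-max normalizes a small 2D map and thresholds it into a binary mask. It is not the AutoMask procedure: the function name `attention_to_mask`, the 0.5 threshold, and the toy 3x3 map are assumptions chosen for illustration, and a real pipeline would read cross-attention maps from the diffusion model.

```python
def attention_to_mask(attn, threshold=0.5):
    """Min-max normalize a 2D attention map and threshold it into a 0/1 mask.

    `attn` is a list of rows of floats. The threshold is a hypothetical
    default, not a value taken from the paper.
    """
    flat = [v for row in attn for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A constant map carries no localization signal: return an empty mask.
        return [[0 for _ in row] for row in attn]
    return [[1 if (v - lo) / span >= threshold else 0 for v in row]
            for row in attn]


# Toy attention map for one prompt token; high values mark the object region.
attn = [
    [0.02, 0.05, 0.04],
    [0.03, 0.60, 0.55],
    [0.01, 0.58, 0.62],
]
mask = attention_to_mask(attn)
for row in mask:
    print(row)
```

In this toy case the lower-right 2x2 block, where attention is concentrated, becomes the masked region.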

IDFusion: Joint Angle Measurement Method through Fusion of Inertial Measurement Sensor and Depth Camera

Juyeon Park, Mingyu Park, Gyumin Park, Hyun Lee

http://doi.org/10.5626/JOK.2025.52.3.208

Recent advancements in human and object recognition technologies are increasingly applied across various fields, particularly in motion detection research that uses inertial measurement sensors and depth cameras in areas such as gaming, healthcare, and security. However, challenges persist, such as cumulative errors and measurement accuracy that varies with the environment. This study proposes IDFusion, a method that integrates inertial measurement sensors and depth cameras for joint angle measurement and is distinguished by data transformation and joint angle conversion stages applied before fusion. Comparative analysis against using inertial measurement sensors and depth cameras individually demonstrated the superior performance of IDFusion. This technique holds promise for applications in healthcare, sports science, and human-computer interaction.

An Automated Error Detection Method for Speech Transcription Corpora Based on Speech Recognition and Language Models

Jeongpil Lee, Jeehyun Lee, Yerin Choi, Jaehoo Jang, Myoung-Wan Koo

http://doi.org/10.5626/JOK.2024.51.4.362

This research proposes a "machine-in-the-loop" approach for automatic error detection in Korean speech corpora that integrates the knowledge of CTC-based speech recognition models and language models. We experimentally validated its error detection performance through a three-step procedure that uses the Character Error Rate (CER) from the speech recognition model and the Perplexity (PPL) from the language model to identify potential transcription error candidates and verify their text labels. Applied to the Korean speech corpus KsponSpeech, the approach reduced the character error rate on the test set from 9.44% to 8.9%. Notably, this improvement was achieved by inspecting only approximately 11% of the test data, which shows that the proposed method is more efficient than a comprehensive manual inspection. Our study affirms the potential of this "machine-in-the-loop" approach as a cost-effective error detection mechanism for speech data that preserves accuracy.
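To make the two signals concrete, the sketch below computes CER as character-level edit distance divided by the label length, then flags samples whose CER or perplexity exceeds a threshold. The combination rule (logical OR), both thresholds, and the sample records are assumptions for illustration; the abstract does not state how the paper combines CER and PPL, and the perplexity values here are made-up inputs rather than outputs of a real language model.

```python
def edit_distance(ref, hyp):
    """Character-level Levenshtein distance between two strings."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate of `hyp` measured against `ref`."""
    if not ref:
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


def flag_candidates(samples, cer_threshold=0.1, ppl_threshold=100.0):
    """Return ids of samples worth manual inspection (hypothetical rule)."""
    return [s["id"] for s in samples
            if cer(s["label"], s["asr"]) > cer_threshold
            or s["ppl"] > ppl_threshold]


# Illustrative records: transcript label, ASR output, and LM perplexity.
samples = [
    {"id": "a", "label": "hello world", "asr": "hello world", "ppl": 20.0},
    {"id": "b", "label": "hxllo wprld", "asr": "hello world", "ppl": 30.0},
    {"id": "c", "label": "hello world", "asr": "hello world", "ppl": 500.0},
]
flagged = flag_candidates(samples)
print(flagged)
```

Only the flagged subset would then go to a human reviewer, which is the sense in which inspecting a small fraction of the data can still reduce the corpus error rate.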

Proposal of An Intent Classification Method Using Text Augmentation Techniques and Transfer Learning

Huiwon Lee, Sungho Park, Chaewon Lee, Seunghyun Lee, Kangbae Lee

http://doi.org/10.5626/JOK.2024.51.2.141

Intent classification is the first step in task-oriented chatbots and an important phase for improving their performance. However, task-oriented chatbots are limited by a lack of data for specific domains. The purpose of this study is to address this data limitation by using text augmentation techniques and transfer learning. Previous studies have used transfer learning and text augmentation, but it was difficult to find approaches applicable to various domains. This study proposes a text augmentation technique and transfer learning method applicable to various domains. For the experiment, datasets of fewer than 10,000, 20,000, and 30,000 samples were constructed according to the ratio of actual utterance intents in 8 domains. Although the results differed by domain, the proposed method performed well in all 8 domains. On average across the 8 domains, accuracy improved by 10%, 3.4%, and 1.9%, respectively, and the F1-score improved by 30%, 12%, and 7.5%, respectively, with decreasing training data size.

Deep Reinforcement Learning based MCS Decision Model

A-Hyun Lee, Hyeongho Bae, Young-Ky Kim, Chong-kwon Kim

http://doi.org/10.5626/JOK.2022.49.8.663

In wireless mobile communication systems, link adaptation techniques are used to increase channel throughput and spectral efficiency by adaptively adjusting transmission parameters as the channel state changes. Adaptive modulation and coding is a link adaptation technique that selects a predefined modulation and coding scheme (MCS) according to the channel condition, based on the channel quality indicator (CQI) reported by the user equipment (UE) and the HARQ feedback on packet transmissions. In this paper, we propose an MCS decision model that applies deep reinforcement learning to adaptive modulation and coding. The proposed model adaptively determines the MCS level in a dynamically changing network, thereby increasing the transmission efficiency of UEs. We evaluated the proposed model through UE log-based simulations and demonstrated that it performs much better than the existing outer loop rate control method.
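For context on the baseline, the sketch below shows one common form of outer-loop link adaptation: a correction offset is added to the reported channel quality, raised slightly on each ACK and lowered more sharply on each NACK so that the block error rate converges toward a target. Whether this matches the paper's baseline is an assumption. The class name `OuterLoop`, the SINR threshold table, the 10% target, and the 1 dB step are all hypothetical values chosen for illustration.

```python
from bisect import bisect_right

# Hypothetical SINR thresholds (dB) for MCS levels 0..4.
# Illustrative only; not taken from any standard or from the paper.
MCS_THRESHOLDS_DB = [-5.0, 0.0, 5.0, 10.0, 15.0]


class OuterLoop:
    """Outer-loop offset driven by HARQ ACK/NACK feedback."""

    def __init__(self, target_bler=0.1, step_down_db=1.0):
        self.offset_db = 0.0
        self.step_down_db = step_down_db
        # Steps are balanced so the offset is stationary at the target BLER.
        self.step_up_db = step_down_db * target_bler / (1.0 - target_bler)

    def select_mcs(self, reported_sinr_db):
        """Pick the highest MCS whose threshold the corrected SINR meets."""
        effective = reported_sinr_db + self.offset_db
        idx = bisect_right(MCS_THRESHOLDS_DB, effective) - 1
        return max(idx, 0)  # fall back to the most robust MCS

    def update(self, ack):
        """Raise the offset on ACK, lower it on NACK."""
        if ack:
            self.offset_db += self.step_up_db
        else:
            self.offset_db -= self.step_down_db


loop = OuterLoop()
print(loop.select_mcs(7.0))   # MCS chosen before any feedback
loop.update(ack=False)        # one failed transmission
print(loop.select_mcs(5.5))   # a more conservative choice after the NACK
```

A rule of this kind reacts only to the last feedback bit with fixed steps, which is the sort of rigidity a learned MCS policy aims to improve on.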

Partially Collective Spatial Keyword Query Processing Based on Spatial Keyword Similarity

Ah Hyun Lee, Sehwa Park, Seog Park

http://doi.org/10.5626/JOK.2021.48.10.1142

Collective spatial keyword queries return Points of Interest (POIs) that are close to the query location and together contain all keywords in the given set. However, existing studies only consider a fixed number of query keywords, which is not adequate to satisfy the user. They do not account for preferences over partial keyword sets, so a flexible keyword set needs to be selected according to the preference for each POI. We thus propose a new query, called the Partially Collective Spatial Keyword Query, which flexibly considers the keywords that fit the preference for each POI. Since this query is a combinatorial optimization problem, the query processing time increases rapidly as the number of POIs increases. To address this problem, we propose a keyword-based search technique that reduces the overall search space. Furthermore, we propose heuristic techniques, which include a linear search-based terminal node pruning technique, an approximation algorithm, and a threshold-based pruning technique.
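The combinatorial nature of the problem can be seen in a brute-force sketch of a plain collective spatial keyword query, which enumerates every group of POIs, keeps those that jointly cover the query keywords, and returns the cheapest one. This is not the paper's partially collective query or any of its pruning techniques. The cost function (sum of Euclidean distances to the query location), the function name `collective_query`, and the sample POIs are assumptions for illustration.

```python
from itertools import combinations
from math import dist


def collective_query(pois, query_loc, keywords):
    """Exhaustively find the cheapest POI group covering all keywords.

    `pois` is a list of (name, (x, y), keyword_set) tuples. The cost is the
    sum of distances from the query location, one of several possible costs.
    """
    needed = set(keywords)
    best, best_cost = None, float("inf")
    for k in range(1, len(pois) + 1):
        for group in combinations(pois, k):  # 2^n - 1 groups in total
            covered = set().union(*(p[2] for p in group))
            if not needed <= covered:
                continue  # this group misses at least one keyword
            cost = sum(dist(query_loc, p[1]) for p in group)
            if cost < best_cost:
                best, best_cost = group, cost
    if best is None:
        return None, float("inf")
    return [p[0] for p in best], best_cost


# Hypothetical POIs around a query at the origin.
pois = [
    ("cafe", (1, 0), {"coffee"}),
    ("mall", (5, 0), {"coffee", "books"}),
    ("shop", (0, 2), {"books"}),
]
names, cost = collective_query(pois, (0, 0), {"coffee", "books"})
print(names, cost)
```

Because the number of candidate groups doubles with each added POI, exhaustive search like this quickly becomes impractical, which motivates search-space reduction and pruning.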

CNN-based Speech Emotion Recognition Model Applying Transfer Learning and Attention Mechanism

Jung Hyun Lee, Ui Nyoung Yoon, Geun-Sik Jo

http://doi.org/10.5626/JOK.2020.47.7.665

Existing speech-based emotion recognition studies can be divided into those that use a single voice feature value and those that use a variety of voice feature values. With a single voice feature value, it is difficult to reflect the complex factors of the voice, such as loudness, overtone structure, and vocal range. Among studies that use various voice feature values, most are based on machine learning, and their emotion recognition accuracy is relatively lower than that of deep learning-based studies. To resolve this problem, we propose a speech emotion recognition model based on a convolutional neural network (CNN) that uses the Mel-Spectrogram and Mel Frequency Cepstral Coefficients (MFCC) as voice feature values. The proposed model applies transfer learning and attention to improve learning speed and accuracy, and it achieved 77.65% emotion recognition accuracy, outperforming the compared works.

Automatic Test Case Generation through Concolic Testing to Improve SW Quality of Defense Weapon System

Kunwoo Park, Joohyun Lee, Hyunggon Song, Kyu Tae Cho, Yunho Kim, Moonzoo Kim

http://doi.org/10.5626/JOK.2019.46.9.926

To improve the SW quality of defense weapon systems, automatic and systematic generation of test cases is necessary; however, traditional SW testing practice is manual and labor-intensive. This paper applies concolic testing to defense weapon system SW, effectively generates test cases that achieve high coverage, and discovers defects, thereby contributing to improved SW quality. To increase the efficiency of concolic testing for highly complex programs, two methods are also proposed: using 4 search strategies in concolic testing and using LIA logic. In addition, a symbolic modeling method is proposed as an example of how practitioners can extend concolic testing.

Automated Code Generation Framework for Industrial Automation Applications based on Timed Automata Model

Kyunghyun Lee, Ikhwan Kim, Taehyoun Kim

http://doi.org/10.5626/JOK.2017.44.12.1307

Due to their convergence with state-of-the-art ICT, the complexity and reliability demands of industrial automation systems have been rapidly increasing. In recent years, to cope with these demands, several research works have applied formal methods to the application development cycle at the early design stage. In this paper, we propose an automated code generation framework for industrial automation applications, based on a timed automata model. As a case study, we developed a formal model for a traffic light control system and verified the timing properties described in the specification. Finally, we demonstrated that the operation of a test-bed running the auto-generated native code was identical to that of the model specification.

A Pedestrian Detection Method using Deep Neural Network

Su Ho Song, Hun Beom Hyeon, Hyun Lee

http://doi.org/

Pedestrian detection, an important component of autonomous driving and driver assistance systems, has been extensively studied for many years. In particular, image-based pedestrian detection methods such as hierarchical classifiers or HOG, and deep models such as ConvNet, have been well studied, and evaluation scores have improved through these methods. However, pedestrian detection requires high sensitivity to errors, since even a small error can have life-or-death consequences. Consequently, a further reduction in the pedestrian detection error rate of autonomous systems is required. We propose a new method that detects pedestrians and reduces the error rate by using Faster R-CNN with newly developed pedestrian training data sets. Finally, we compare the proposed method with previous models to show the improvement achieved by our method.


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr