Transformer-Based Head Motion Prediction Algorithm Using Image Generation Model
Hyogeun Byun, Moonsoo Jeong, Sungkil Lee
http://doi.org/10.5626/JOK.2024.51.7.601
Motion-to-photon latency in virtual reality based on head-mounted displays can cause discomfort such as cybersickness because of the lag between a user's physical movement and the image output, potentially disrupting the user's immersion. Traditional methods to reduce this latency involve manually analyzing head motion trends or predicting head motion with recurrent neural networks. However, these models suffer from long-term dependency issues when retaining information over sequences and are limited in parallel processing. In this paper, images are also used in the decoding process through an image generation model. Deep learning models from natural language processing are highly scalable when used as prediction models. Accordingly, the model proposed in this study can use additional data to predict the user's head motion and thereby outperforms existing models.
Improvement of Background Inpainting using Binary Masking of a Generated Image
Jihoon Lee, Chan Ho Bae, Seunghun Lee, Myung-Seok Choi, Ryong Lee, Sangtae Ahn
http://doi.org/10.5626/JOK.2024.51.6.537
Recently, image generation technology has been rapidly advancing in the field of deep learning. One of the most effective ways to represent images is by using text prompts to generate them. The performance of models that generate images using this technique is outstanding. However, it is not easy to naturally change specific parts of an image using only text prompts. This is considered a typical problem with conventional image generation models. Thus, in this study, we developed a background inpainting technique that extracts text for each area of an image and uses it as a basis to seamlessly change the background while preserving the objects in the image. In particular, the background transformation inpainting technique developed in this study has the advantage of not only transforming a single image but also rapidly transforming multiple images. Therefore, the proposed text prompt-based image style transfer can be used in fields with limited data for training, and the technique could enhance the performance of models through image augmentation.
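The idea of preserving objects while replacing the background can be illustrated with a binary-mask composite. This is a minimal sketch, not the authors' pipeline: the function name `composite_with_mask`, the convention that mask value 1 marks object pixels, and the nested-list image representation are all assumptions made for illustration.

```python
def composite_with_mask(original, generated, mask):
    """Keep original pixels where mask == 1, take generated pixels elsewhere.

    Images are nested lists of pixel values; the mask is a nested list of 0/1.
    The convention "1 = object to preserve" is an assumption of this sketch.
    """
    if not (len(original) == len(generated) == len(mask)):
        raise ValueError("all inputs must have the same number of rows")
    result = []
    for orig_row, gen_row, mask_row in zip(original, generated, mask):
        if not (len(orig_row) == len(gen_row) == len(mask_row)):
            raise ValueError("all rows must have the same width")
        # Per pixel: object pixels come from the original image,
        # background pixels come from the generated image.
        result.append([o if m == 1 else g
                       for o, g, m in zip(orig_row, gen_row, mask_row)])
    return result


original = [[1, 2], [3, 4]]   # toy 2x2 grayscale image
generated = [[9, 9], [9, 9]]  # toy generated background
mask = [[1, 0], [0, 1]]       # 1 = object pixel to preserve
blended = composite_with_mask(original, generated, mask)
```

Because the same mask logic applies to every image independently, a loop over a batch of (original, generated, mask) triples would extend this to multiple images.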
Application of OOD Detection Using MSP in EEG-Based Emotion Classification
HyoSeon Choi, Dahoon Choi, Byung Hyung Kim
http://doi.org/10.5626/JOK.2024.51.5.438
Several deep learning approaches have recently improved the performance of emotion classification tasks. However, these successful approaches cannot be directly applied to learning from EEG signals because of their nonlinear and complex data structure. This limitation leads to inter- and intra-subject variability problems in understanding complex emotion dynamics. To address this limitation, we focus on studying the variability rather than extracting features from high-dimensional neural activities. In the context of deep learning, we propose a framework that uses the Maximum Softmax Probability (MSP) approach to detect and remove abnormal pairs of EEG data and labels, thereby enhancing model performance. Experimental results on public datasets demonstrated the superiority of our method, with a maximum improvement of 4% in accuracy.
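The Maximum Softmax Probability score itself is simple to compute: it is the largest class probability after applying softmax to a model's logits. The sketch below shows one plausible way to filter samples with it. The threshold of 0.6, the function names, and the rule of discarding low-confidence pairs are assumptions for illustration; the abstract does not state the exact criterion used.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def msp_score(logits):
    """Maximum Softmax Probability: the confidence of the top class."""
    return max(softmax(logits))


def filter_by_msp(samples, threshold=0.6):
    """Keep (logits, label) pairs whose MSP is at least `threshold`.

    The threshold is a hypothetical value chosen for this sketch.
    """
    return [(logits, label) for logits, label in samples
            if msp_score(logits) >= threshold]


samples = [
    ([4.0, 0.5, 0.1], 0),  # confident prediction, kept
    ([1.0, 1.1, 0.9], 1),  # near-uniform prediction, treated as abnormal
]
kept = filter_by_msp(samples)
```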
Pseudo-label Correction using Large Vision-Language Models for Enhanced Domain-adaptive Semantic Segmentation
http://doi.org/10.5626/JOK.2024.51.5.464
Creating semantic segmentation labels for real-world images is very expensive. To address this problem, unsupervised domain adaptation trains the model on labeled data, either generated in a virtual environment where labels are easy to collect or already collected, together with real-world images without labels. One of the common problems in unsupervised domain adaptation is that thing classes with similar appearance are easily confused. In this paper, we propose a method that corrects the labels generated for target data using large vision-language models. Making the labels generated for the target images more accurate can reduce confusion among thing classes. The proposed method improves the performance of DAFormer by +1.1 mIoU in adaptation from game to reality and by +1.1 mIoU in adaptation from day to night. For thing classes, the proposed method improves the performance of MIC by +0.6 mIoU in adaptation from game to reality and by +0.7 mIoU in adaptation from day to night.
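The mIoU figures above refer to mean Intersection over Union, the usual segmentation metric. A minimal sketch of the standard definition follows, assuming flattened per-pixel class IDs; the function name `mean_iou` and the choice to skip classes absent from both label maps are assumptions of this example, not details taken from the paper.

```python
def mean_iou(ground_truth, prediction, num_classes):
    """Mean Intersection over Union across classes.

    Inputs are flat sequences of per-pixel class IDs. Classes that appear
    in neither sequence are skipped (an assumption of this sketch).
    """
    if len(ground_truth) != len(prediction):
        raise ValueError("inputs must have the same length")
    ious = []
    for c in range(num_classes):
        pairs = list(zip(ground_truth, prediction))
        intersection = sum(1 for g, p in pairs if g == c and p == c)
        union = sum(1 for g, p in pairs if g == c or p == c)
        if union == 0:
            continue  # class absent from both maps
        ious.append(intersection / union)
    if not ious:
        raise ValueError("no class is present in the inputs")
    return sum(ious) / len(ious)


# Class 0: IoU = 1/2, class 1: IoU = 2/3.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```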
Number-based High Fidelity Logical Qubit on the Surface Code FTQC
http://doi.org/10.5626/JOK.2024.51.4.301
To achieve quantum advantage, the size of quantum computations must be increased. However, quantum computational power cannot be easily increased because of the high error rates of qubits and gates. To overcome this problem, the fault-tolerant quantum computation (FTQC) model based on the surface code has been extensively investigated, since in theory it works with relatively high error rates. However, in practice, the surface code still needs many improvements, for example in its requirement for a large number of physical qubits. Therefore, in this work, we propose a logical qubit design method that exploits multiple lower-level qubits, unlike the conventional design method based on bigger-sized qubits. This method uses the concept of the block-code scheme. The analysis shows that the proposed method achieves a lower error rate than bigger-sized logical qubits with the same number of physical qubits. In conclusion, we believe this approach can improve the resource efficiency of surface code FTQC.
An Automated Error Detection Method for Speech Transcription Corpora Based on Speech Recognition and Language Models
Jeongpil Lee, Jeehyun Lee, Yerin Choi, Jaehoo Jang, Myoung-Wan Koo
http://doi.org/10.5626/JOK.2024.51.4.362
This research proposes a "machine-in-the-loop" approach for automatic error detection in Korean speech corpora by integrating the knowledge of CTC-based speech recognition models and language models. We experimentally validated its error detection performance through a three-step procedure that leveraged Character Error Rate (CER) from the speech recognition model and Perplexity (PPL) from the language model to identify potential transcription error candidates and verify their text labels. Applied to the Korean speech corpus KsponSpeech, the approach reduced the character error rate on the test set from 9.44% to 8.9%. Notably, this improvement was achieved by inspecting only approximately 11% of the test data, which shows that the proposed method is more efficient than a comprehensive manual inspection process. Our study affirms the potential of this efficient "machine-in-the-loop" approach as a cost-effective error detection mechanism for speech data that still ensures accuracy.
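The two signals named above can be computed as follows. This is a minimal sketch under stated assumptions: CER is the character-level edit distance divided by the reference length, PPL is the exponential of the negative mean log-probability, and the rule in `flag_candidates` (flag when either score exceeds a threshold) together with its threshold values and dictionary keys is hypothetical. The abstract does not specify how the three-step procedure combines the two scores.

```python
import math


def edit_distance(ref, hyp):
    """Character-level Levenshtein distance between two strings."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character Error Rate of `hyp` measured against `ref`."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def perplexity(token_log_probs):
    """Perplexity from natural-log token probabilities."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def flag_candidates(items, cer_threshold=0.2, ppl_threshold=50.0):
    """Return IDs of utterances whose label looks suspicious.

    Hypothetical rule: flag when the label disagrees with the ASR output
    (high CER) or the label text is unlikely under the LM (high PPL).
    """
    flagged = []
    for item in items:
        if (cer(item["label"], item["asr"]) > cer_threshold
                or perplexity(item["log_probs"]) > ppl_threshold):
            flagged.append(item["id"])
    return flagged


items = [
    {"id": "utt1", "label": "hello world", "asr": "hello world",
     "log_probs": [-1.0, -1.0]},
    {"id": "utt2", "label": "hello world", "asr": "yellow word",
     "log_probs": [-1.0, -1.0]},
]
flagged = flag_candidates(items)
```

Only the flagged utterances would then go to a human for verification, which is what keeps the inspected share of the corpus small.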
Design and Implementation of a Tactical-datalink Unit for Interfacing and Forwarding(TUF) to Interlock and Forward MIDS JTRS
Sangtae Lee, Jongseo Kim, Taegwon Kim, Youngseung Kim, Seungbae Jee, Jaeyoung Cheon, MinGyu Jung
http://doi.org/10.5626/JOK.2024.51.3.252
MIDS LVT terminals are being replaced by MIDS JTRS terminals, which provide crypto modernization, enhanced throughput, and frequency remapping functions, because of encryption key and bandwidth limitations. The Korean military is carrying out various performance improvement projects for the Link-16 network equipped with MIDS JTRS terminals. In this paper, a tactical datalink unit for interfacing and forwarding (TUF) was designed and implemented to interlock with the existing C2 host systems (MCRC, KTMO, etc.) and forward tactical information to them using a MIDS JTRS terminal.
The TUF improved scalability and maintainability through a server-client design that can remotely control and manage MIDS JTRS terminal interlocking and forwarding in consideration of the network operating environment. The TUF consisted of an interlocking and forwarding function, a basic C2 (command and control) host, and a monitoring tool. The TUF was verified through linkage with previously verified overseas tools. By achieving interlocking and forwarding through domestically developed technology, rather than through overseas tools as before, the work secured technical know-how for MIDS JTRS terminal integration. Existing weapon systems without a MIDS JTRS terminal linkage function can join the Link-16 network, enhancing military operability and survivability.
Explainable Artificial Intelligence in Molecular Graph Classification
Yeongyeong Son, Yewon Shin, Sunyoung Kwon
http://doi.org/10.5626/JOK.2024.51.2.157
With the advancement of artificial intelligence (AI), there is a growing need for explainable artificial intelligence (XAI). Recently, graph neural network-based XAI research has been actively conducted, but it mainly focuses on generic graphs. Because molecular graphs have distinctive characteristics that depend on chemical properties, we emphasize the need to investigate whether existing XAI techniques can provide interpretability for molecular graphs. In this paper, we apply existing XAI techniques to molecular graphs and assess their interpretability quantitatively and qualitatively. Furthermore, we examine the outcomes after standardizing the significance ratio of essential features, highlighting the significance of sparsity as one of the XAI evaluation metrics.
SBERT-PRO: Predicate Oriented Sentence Embedding Model for Intent and Event Detection
Dongryul Ko, Jeayun Lee, Dahee Lee, Yuri Son, Sangmin Kim, Jaeeun Jang, Munhyeong Kim, Sanghyun Park, Jaieun Kim
http://doi.org/10.5626/JOK.2024.51.2.165
Intent detection is a crucial task in conversational systems for understanding user intentions. Additionally, event detection is vital for identifying important events within various texts, including news articles, social media posts, and reports. Among diverse approaches, the sentence embedding similarity-based method has been widely adopted to solve open-domain classification tasks. However, conventional embedding models tend to focus on specific keywords within a sentence and are not suitable for tasks that require a high-level semantic understanding of a sentence rather than a narrow focus on specific details. This limitation becomes particularly evident in tasks such as intent detection, which requires a broader understanding of the intention of a sentence, and event detection, which requires an emphasis on actual events within a sentence. In this paper, we construct a training dataset suitable for intent and event detection using entity attribute information and entity relation information. Our approach is inspired by the significance of emphasizing the embedding of predicates, which unfold the content of a sentence, as opposed to focusing on entity attributes within a sentence. Furthermore, we suggest an adaptive learning strategy for the existing sentence embedding model and demonstrate that our proposed model, SBERT-PRO (PRedicate Oriented), outperforms conventional models.
A Study of Metric and Framework Improving Fairness-utility Trade-off in Link Prediction
Heeyoon Yang, YongHoon Kang, Gahyung Kim, Jiyoung Lim, SuHyun Yoon, Ho Seung Kim, Jee-Hyong Lee
http://doi.org/10.5626/JOK.2023.50.2.179
Artificial intelligence (AI) technology has improved remarkably over the last decade. However, AI sometimes makes biased predictions because it is trained on real-world big data that intrinsically contain discriminatory social factors. This problem often arises in friend recommendations in Social Network Services (SNS). For social network datasets, Graph Neural Networks (GNNs) are used to learn from the data, but they have a strong tendency to connect similar nodes (the homophily effect). Furthermore, they are more likely to make biased predictions based on socially sensitive attributes, such as gender or religion, which makes them ethically more problematic. To overcome these problems, various fairness-aware AI models and fairness metrics have been proposed. However, most studies used different metrics to evaluate fairness and did not consider the trade-off between accuracy and fairness. Thus, we propose a novel fairness metric called Fairβ-metric, which takes both accuracy and prediction into consideration, and a framework called FairU that shows outstanding performance on the proposed metric.

Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal
Editorial Office
- Tel. +82-2-588-9240
- Fax. +82-2-521-1352
- E-mail. chwoo@kiise.or.kr