Vol. 49, No. 8,
Aug. 2022
Algorithms for the k-Scaled Order-Preserving Pattern Matching Problem
Kyung Bin Park, Youngho Kim, Joong Cha Na, Jeong Seop Sim
http://doi.org/10.5626/JOK.2022.49.8.585
Two strings of the same length are order-isomorphic if the relative orders of their characters are the same. Given a text T of length n and a pattern P of length m, the order-preserving pattern matching problem is to find all substrings of T that are order-isomorphic to P. Order-preserving pattern matching can be used to analyze time-series data such as stock indices and melodies. In this paper, we define the k-scaled order-preserving pattern matching problem and propose an O(n + m log m)-time algorithm for the problem. We also propose a parallel algorithm for the problem, which runs in O(m + k) time using O(n + m) threads.
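The definition of order-isomorphism above can be illustrated with a short sketch. This is a naive baseline for plain order-preserving pattern matching, written only to make the definition concrete; it is not the O(n + m log m) algorithm of the paper and does not cover the k-scaled variant, whose definition is not given in the abstract. The function names are hypothetical.

```python
def rank_signature(seq):
    """Map each element to its rank among the distinct values of seq."""
    ranks = {value: rank for rank, value in enumerate(sorted(set(seq)))}
    return [ranks[value] for value in seq]


def order_isomorphic(a, b):
    """Two sequences are order-isomorphic if they have equal rank signatures."""
    return len(a) == len(b) and rank_signature(a) == rank_signature(b)


def naive_oppm(text, pattern):
    """Return every start index i where text[i:i+m] is order-isomorphic to pattern.

    Recomputes the signature of each window, so it is far slower than the
    algorithm described in the abstract.
    """
    m = len(pattern)
    target = rank_signature(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if rank_signature(text[i:i + m]) == target
    ]


if __name__ == "__main__":
    # Windows (10, 22, 15) and (15, 30, 20) share the low-high-middle shape.
    print(naive_oppm([10, 22, 15, 30, 20, 18, 27], [1, 3, 2]))
```

Comparing rank signatures rather than raw values is what lets a pattern match price or pitch sequences that share a shape but differ in absolute level.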
The Multivariate Sensor Data Classification using Time Series Imaging
http://doi.org/10.5626/JOK.2022.49.8.593
Various methods have been proposed to predict the future, from statistical time series analysis methods to deep learning-based prediction models such as LSTM. However, real industrial data are highly complex due to various unpredictable factors, so it is difficult for prediction models alone to extract valuable information from the data. Time series imaging is a method for converting time series into two-dimensional images, enabling the extraction of information that is difficult to interpret from raw data. In this paper, we transform multivariate sensor data into two-dimensional multichannel images and propose a time series classification method based on them. Furthermore, we compare the proposed method with previous time series prediction methods to verify its usefulness.
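As a rough illustration of time series imaging, the sketch below converts each variable of a multivariate series into one image channel. The abstract does not say which imaging method the authors use; the Gramian Angular Summation Field chosen here is an assumption, as are all function names.

```python
import math


def rescale(series):
    """Linearly rescale a series to [-1, 1]; a constant series maps to zeros."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0] * len(series)
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in series]


def gasf(series):
    """Gramian Angular Summation Field: entry (i, j) is cos(phi_i + phi_j),
    where phi_i = arccos of the rescaled value at time i."""
    x = rescale(series)
    # Clamp before acos to guard against floating-point overshoot.
    phi = [math.acos(max(-1.0, min(1.0, v))) for v in x]
    return [[math.cos(a + b) for b in phi] for a in phi]


def multichannel_image(channels):
    """Stack one GASF image per sensor variable (assumed channel layout)."""
    return [gasf(channel) for channel in channels]


if __name__ == "__main__":
    image = multichannel_image([[0.0, 1.0, 2.0], [3.0, 1.0, 2.0]])
    print(len(image), len(image[0]), len(image[0][0]))
```

Each channel is a square image whose side equals the series length, so a standard image classifier can consume the stacked result.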
Response-Considered Query Token Importance Weight Calculator with Potential Response for Generating Query-Relevant Responses
So-Eon Kim, Choong Seon Hong, Seong-Bae Park
http://doi.org/10.5626/JOK.2022.49.8.601
The conversational response generator (CRG) has made great progress through the sequence-to-sequence model, but it often generates either an over-general response, which could answer any query, or an inappropriate response. Some efforts have modified the traditional loss function to solve the over-general response problem, and others have reduced the generation of responses irrelevant to the query by addressing the CRG's lack of background knowledge, but neither solved both problems. This paper proposes the use of a query token importance calculator, because unrelated and overly general responses arise when the CRG does not capture the core of the query. Also, based on the theory that a questioner designs an utterance to induce a specific response from the listener, this paper proposes using the golden response to understand the core meaning of the query. The qualitative evaluation confirmed that the response generator using the proposed model generated responses more relevant to the query than the model that did not use it.
KcBert-based Movie Review Corpus Emotion Analysis Using Emotion Vocabulary Dictionary
Yeonji Jang, Jiseon Choi, Hansaem Kim
http://doi.org/10.5626/JOK.2022.49.8.608
Emotion analysis is the classification of human emotions expressed in text data into various emotional types such as joy, sadness, anger, surprise, and fear. In this study, an emotion corpus was constructed by using an emotion vocabulary dictionary to classify the emotions expressed in a movie review corpus into nine categories: joy, sadness, fear, anger, disgust, surprise, interest, boredom, and pain. The performance of the model was then evaluated by training KcBert on the emotion corpus. To build the emotion analysis corpus, an emotion vocabulary dictionary based on a psychological model was used. Each review was checked for vocabulary matching the entries of the emotion vocabulary dictionary, and the review was tagged with the emotion type of the matching vocabulary appearing at the end of the movie review. When KcBert pre-trained with NSMC was trained on the emotion analysis corpus constructed in this way, it showed excellent performance on the nine-category classification.
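The dictionary-based tagging step described above might look roughly like the following sketch. The lexicon entries, the English tokens, and the whitespace tokenizer are all hypothetical stand-ins; the actual dictionary is Korean and based on a psychological model, and the abstract does not describe the matching procedure in detail.

```python
# Hypothetical toy lexicon mapping surface words to emotion categories.
EMOTION_LEXICON = {
    "happy": "joy",
    "sad": "sadness",
    "scary": "fear",
    "boring": "boredom",
}


def tag_review(review, lexicon=EMOTION_LEXICON):
    """Return the emotion of the last lexicon word found in the review,
    or None when no lexicon word appears."""
    label = None
    for token in review.lower().split():
        word = token.strip(".,!?")
        if word in lexicon:
            # Later matches overwrite earlier ones, so the final match wins.
            label = lexicon[word]
    return label


if __name__ == "__main__":
    print(tag_review("Scary at first, but the ending was boring."))
```

Letting the last match win mirrors the statement that the emotion type of the vocabulary appearing at the end is the one that gets tagged.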
Korean Dependency Parsing using Subtree Linking based on Machine Reading Comprehension
Jinwoo Min, Seung-Hoon Na, Jong-Hoon Shin, Young-Kil Kim, Kangil Kim
http://doi.org/10.5626/JOK.2022.49.8.617
In Korean dependency parsing, biaffine attention models have shown state-of-the-art performance; they first obtain head-level and modifier-level representations by applying two multi-layer perceptrons (MLPs) to the encoded contextualized word representations, perform attention by treating the modifier-level representation as a query and the head-level one as a key, and take the resulting attention score as the probability of forming a dependency arc between the two corresponding words. However, given two target words (i.e., a candidate head and modifier), biaffine attention methods are basically limited to word-level representations and are not aware of the explicit boundaries of the words' phrases or subtrees. Thus, without semantically and syntactically enriched phrase-level and subtree-level representations, biaffine attention methods might not be effective when determining a dependency arc is complicated, such as when identifying a dependency between “far-distant” words, since these cases often require subtree-level or phrase-level information surrounding the target words. To address this drawback, this paper presents a dependency parsing framework based on machine reading comprehension (MRC) that explicitly utilizes subtree-level information by mapping a given child subtree and its parent subtree to a question and an answer, respectively. The experimental results on standard datasets of Korean dependency parsing show that the MRC-based dependency parsing outperforms the biaffine attention model. In particular, the results further show that the improvements in performance tend to be stronger for long sentences than for short ones.
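The biaffine attention baseline described above can be sketched as follows. This is a minimal illustration of a commonly used biaffine form (a bilinear term plus a head bias term), assuming the head-level and modifier-level representations have already been produced by the two MLPs; it is not the MRC-based parser proposed in the paper, and all names and toy parameters are hypothetical.

```python
import math


def biaffine_scores(mod_reps, head_reps, U, head_bias):
    """score[i][j] = mod_i^T U head_j + head_bias . head_j"""
    scores = []
    for q in mod_reps:
        # Project the modifier (query) through the bilinear weight matrix U.
        qU = [sum(q[a] * U[a][b] for a in range(len(q))) for b in range(len(U[0]))]
        row = []
        for h in head_reps:
            bilinear = sum(x * y for x, y in zip(qU, h))
            bias = sum(w * y for w, y in zip(head_bias, h))
            row.append(bilinear + bias)
        scores.append(row)
    return scores


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def arc_probabilities(mod_reps, head_reps, U, head_bias):
    """For each modifier, a probability distribution over candidate heads."""
    return [softmax(row) for row in biaffine_scores(mod_reps, head_reps, U, head_bias)]


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    reps = [[1.0, 0.0], [0.0, 1.0]]
    print(arc_probabilities(reps, reps, identity, [0.0, 0.0]))
```

Because every score depends only on the two word vectors, nothing in this computation sees phrase or subtree boundaries, which is the limitation the abstract points out.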
Graph Convolution Network Based Feature Map Fusion Method for Multi Scale Object Detection
Jaegi Hwang, Seongju Kang, Kwangsue Chung
http://doi.org/10.5626/JOK.2022.49.8.627
Feature Pyramid Network (FPN) is a feature map fusion technique used to solve the multi-scale problem of object detection. However, since FPN performs feature map fusion by focusing on adjacent resolutions, the semantic information included in non-adjacent layers is diluted. This paper proposes a graph convolution network (GCN)-based feature map fusion technique for multi-scale object detection. The proposed GCN-based method dynamically fuses the feature map information of all layers according to learnable adjacency matrix weights. The adjacency matrix weights are generated by a multi-scale attention mechanism to adaptively reflect the scale information of the object. The feature map fusion is performed through a matrix multiplication between the adjacency matrix and a feature node matrix. The proposed method was verified by showing that it improves multi-scale object detection performance on the PASCAL-VOC benchmark dataset compared to the existing FPN method.
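The fusion step, a matrix multiplication between the adjacency matrix and a feature node matrix, can be sketched as below. Each node stands for one pyramid level summarized as a feature vector. How the authors generate and normalize the adjacency weights is not specified in the abstract; the row-wise softmax and all names here are assumptions.

```python
import math


def row_softmax(logits):
    """Normalize each row of raw adjacency weights to sum to 1 (assumed)."""
    result = []
    for row in logits:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def matmul(A, X):
    """Plain matrix product of A (n x n) and X (n x d)."""
    return [
        [sum(A[i][k] * X[k][j] for k in range(len(X))) for j in range(len(X[0]))]
        for i in range(len(A))
    ]


def fuse_feature_nodes(adjacency_logits, node_features):
    """Every level receives a weighted mix of ALL levels, not only neighbors."""
    return matmul(row_softmax(adjacency_logits), node_features)


if __name__ == "__main__":
    uniform = [[0.0] * 3 for _ in range(3)]
    features = [[3.0, 0.0], [0.0, 3.0], [3.0, 3.0]]
    print(fuse_feature_nodes(uniform, features))
```

With a dense adjacency matrix, the coarsest and finest levels exchange information in a single step, whereas a top-down pyramid passes it through the intermediate levels.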
An Empirical Study on Defects in Open Source Artificial Intelligence Applications
Yoon Ho Choi, Changgong Lee, Jaechang Nam
http://doi.org/10.5626/JOK.2022.49.8.633
The differences between the programming paradigm of applications using artificial intelligence (AI) and traditional applications may show different results in detecting, understanding, analyzing, and fixing defects. In this study, we collect defects that have been reported in open source AI applications and identify common causes of the defects to understand and analyze them in AI-based systems. To this end, we analyze the defects of ten open-source AI applications archived on GitHub by inspecting 1,205 issues and defect-fixing code changes that had been reported, found, and fixed. We classified the defects into 20 categories based on their causes, which are found in at least five out of ten projects. We expect that the result of this study will provide useful information in software quality assurance approaches such as fault localization and patch suggestion.
A Network Topology Scaling Method for Improving Network Comparison Using Colon Cancer Transcriptome Data
http://doi.org/10.5626/JOK.2022.49.8.646
Various research methods based on gene expression information have been proposed for disease analysis models. In cancer transcriptome data analysis, methods that discover hidden characteristics at the pathway level are useful for interpreting results. In this study, pathway-level gene correlation networks were compared and analyzed based on gene co-expression data. If the two networks to be compared differ in size, the difference in the amount of information biases the comparison toward the larger network. To resolve this bias, the networks of patients from different backgrounds were adjusted so that they were constructed with the same amount of information. Using the normalized networks, we performed a comparative analysis of important gene groups based on the characteristics of biological networks. We normalized 202 pathway networks using data from four subtypes of colon cancer and identified five pathways with subtype-specific results.
An Efficient RocksDB Leveling Technique using F2FS Multi-Head Logging
Jeongho Lee, Jonggyu Park, Young Ik Eom
http://doi.org/10.5626/JOK.2022.49.8.655
RocksDB has been considered one of the most representative LSM-tree based key-value stores, and it is actively used in high-performance database systems. However, because of the nature of such database systems, which run for an extended period of time and frequently write to the underlying storage devices, the systems may incur file system-level fragmentation. Additionally, various optimizations in RocksDB may accelerate the file system-level fragmentation under aged systems, which hinders the maintenance of long-term superior performance of flash-based storage devices such as SSDs. In this paper, we first analyze the fragmentation problem of RocksDB on F2FS and propose a new RocksDB leveling technique that exploits F2FS multi-head logging. The experimental results using an SSD confirm that the proposed method improves the throughput by 7% and reduces tail latency by 18%, compared with the conventional F2FS file system, and improves the throughput by 56% and reduces tail latency by 19%, compared with the EXT4 file system.
Deep Reinforcement Learning based MCS Decision Model
A-Hyun Lee, Hyeongho Bae, Young-Ky Kim, Chong-kwon Kim
http://doi.org/10.5626/JOK.2022.49.8.663
In wireless mobile communication systems, link adaptation techniques are used to increase channel throughput and frequency efficiency by adaptively adjusting transmission parameters according to changes in the channel state. Adaptive modulation and coding is a link adaptation technique that selects a predefined modulation and coding scheme (MCS) depending on the channel condition; it is performed based on the CQI reported by the UE and the HARQ feedback on packet transmissions. In this paper, we propose an MCS decision model that applies deep reinforcement learning to adaptive modulation and coding. The proposed model adaptively determines the MCS level in a dynamically changing network, thereby increasing the transmission efficiency of UEs. We evaluated the proposed model through UE log-based simulations and demonstrated that it performs much better than the existing outer loop rate control method.
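For context, the outer loop rate control baseline mentioned above is commonly implemented as an offset that HARQ feedback nudges up or down. The sketch below shows that generic scheme, not the proposed reinforcement learning model and not necessarily the exact baseline used in the paper. The class name, the SINR thresholds, and the assumption that the reported CQI has already been converted to an SINR estimate in dB are all hypothetical.

```python
class OuterLoopLinkAdaptation:
    """Generic outer loop: ACKs raise the offset slightly, NACKs lower it more."""

    def __init__(self, target_bler=0.1, step_down_db=1.0):
        self.offset_db = 0.0
        self.step_down_db = step_down_db
        # Chosen so that the offset is stationary when the block error rate
        # equals target_bler: bler * down == (1 - bler) * up.
        self.step_up_db = step_down_db * target_bler / (1.0 - target_bler)

    def update(self, ack):
        """Apply one HARQ feedback (True for ACK, False for NACK)."""
        if ack:
            self.offset_db += self.step_up_db
        else:
            self.offset_db -= self.step_down_db

    def select_mcs(self, reported_sinr_db, thresholds_db):
        """Pick the highest MCS index whose threshold the corrected SINR meets.

        thresholds_db must be sorted in ascending order; index 0 is the
        fallback when no threshold is met.
        """
        effective = reported_sinr_db + self.offset_db
        level = 0
        for index, threshold in enumerate(thresholds_db):
            if effective >= threshold:
                level = index
        return level


if __name__ == "__main__":
    olla = OuterLoopLinkAdaptation()
    thresholds = [0.0, 5.0, 10.0, 15.0]  # hypothetical SINR thresholds in dB
    print(olla.select_mcs(10.5, thresholds))
    olla.update(False)  # a NACK makes the next choice more conservative
    print(olla.select_mcs(10.5, thresholds))
```

The fixed step sizes make this loop slow to follow a rapidly changing channel, which is the kind of rigidity a learned decision model aims to avoid.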
Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal
Editorial Office
- Tel. +82-2-588-9240
- Fax. +82-2-521-1352
- E-mail. chwoo@kiise.or.kr