Application Monitoring System Design and Implementation using System Call Pattern

Haegeon Jeong, Kyungtae Kang

http://doi.org/10.5626/JOK.2022.49.10.795

A user application consists of a set of functions that together provide what the user needs. Applications that provide services, such as web servers, are very large and complex, which makes them a target for attackers. Attacks by malicious hackers distort application variables and program flow, leading to the hijacking of system administrator privileges or to abnormal operation. In this paper, we designed and implemented a system that collects an application's system calls and detects anomalies in the application from the collected patterns. Measuring the overhead of the implemented system showed that monitoring about 1 million system calls added about 0.8 seconds. This is about 1/28 of the overhead of existing tools such as strace.
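
The abstract does not say how the collected patterns are compared, so the following is only a minimal sketch of one common approach, assumed here for illustration: build a profile of short system call windows seen in normal runs, then score a new trace by the share of windows that the profile has never seen. The function names, the window length of 3, and the sample traces are all hypothetical and are not taken from the paper.

```python
from typing import Iterable, List, Set, Tuple

Window = Tuple[str, ...]


def ngrams(calls: List[str], n: int) -> Iterable[Window]:
    """Yield every length-n sliding window over a system call trace."""
    for i in range(len(calls) - n + 1):
        yield tuple(calls[i:i + n])


def build_profile(normal_traces: List[List[str]], n: int = 3) -> Set[Window]:
    """Collect all windows observed in traces of normal behaviour."""
    profile: Set[Window] = set()
    for trace in normal_traces:
        profile.update(ngrams(trace, n))
    return profile


def anomaly_ratio(trace: List[str], profile: Set[Window], n: int = 3) -> float:
    """Return the fraction of windows in `trace` that the profile has not seen."""
    windows = list(ngrams(trace, n))
    if not windows:
        return 0.0  # trace shorter than one window: nothing to judge
    unseen = sum(1 for w in windows if w not in profile)
    return unseen / len(windows)


if __name__ == "__main__":
    # Hypothetical traces; a real monitor would record these at run time.
    profile = build_profile([["open", "read", "write", "close"]])
    print(anomaly_ratio(["open", "read", "write", "close"], profile))
    print(anomaly_ratio(["open", "read", "execve", "close"], profile))
```

A trace made only of known windows scores 0.0, and a trace whose windows are all new scores 1.0; where to set the alarm threshold between those extremes is a tuning decision this sketch leaves open.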

VACS: Virtual Try-on Artifact Correction System using the Fashion Object Segmentation Method

Wonjung Park, Youjin Chung, Soonchan Park, Jinah Park

http://doi.org/10.5626/JOK.2022.49.10.802

Virtual try-on (VITON) technology is receiving a lot of attention with the development of Generative Adversarial Networks (GANs) [1]. Previous approaches to VITON synthesized 2D model images and in-shop clothing images using a generative model. However, when synthesizing a top, VITON erroneously changes pixels in unintended areas, such as the background and the pants. In this study, we propose the VITON Artifact Correction System (VACS), which uses fashion object segmentation to isolate and protect the target clothing synthesized by VITON and replaces the pixels in the remaining areas with those of the original model image, increasing the realism of the final composition.
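
The correction step described above amounts to a per-pixel choice between two images. The sketch below illustrates that idea on tiny images stored as nested lists; it assumes the clothing mask is already available as booleans, whereas in the paper it comes from a fashion object segmentation model. The function name and data layout are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]
Mask = List[List[bool]]


def correct_artifacts(original: Image, synthesized: Image, clothing_mask: Mask) -> Image:
    """Keep synthesized pixels inside the clothing mask and original pixels elsewhere."""
    if not (len(original) == len(synthesized) == len(clothing_mask)):
        raise ValueError("images and mask must have the same height")
    result: Image = []
    for orig_row, synth_row, mask_row in zip(original, synthesized, clothing_mask):
        if not (len(orig_row) == len(synth_row) == len(mask_row)):
            raise ValueError("images and mask must have the same width")
        # True in the mask means "this pixel belongs to the target clothing".
        result.append([s if m else o for o, s, m in zip(orig_row, synth_row, mask_row)])
    return result


if __name__ == "__main__":
    original = [[(0, 0, 0), (1, 1, 1)]]
    synthesized = [[(9, 9, 9), (8, 8, 8)]]
    mask = [[True, False]]
    print(correct_artifacts(original, synthesized, mask))
```

Only the masked pixel takes the synthesized value, so any change the generator made to the background or pants is discarded.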

A Graph2Tree Model for Solving Korean Math Word Problems

Donggeun Kim, Nayeon Lee, Hyunwoo Sim, Myoung-Wan Koo

http://doi.org/10.5626/JOK.2022.49.10.807

This paper builds a new dataset of eight types of Korean math word problems and presents Ko-Graph2Tree, a new automatic solver for Korean math word problems based on the Graph2Tree model. The recently released Graph2Tree model is a graph-to-tree learning model that outperforms existing natural language processing models at automatically solving English math word problems. By using two types of graphs that reflect the relationships and order between the numbers in the problem text (that is, mathematical relations) when generating solutions, the model improved on existing tree-based models. After training on the self-built Korean math word problem dataset, a transformer model with a sequence-to-sequence structure achieved an accuracy of 42.3%, whereas the Ko-Graph2Tree model achieved 68.3%, an improvement of 26.0%p.

Performance-Aware Multi-Cloud Infrastructure Provisioning Based on Cloud-Barista Open Source Project

Seokho Son, Jihoon Seo, Byoungseob Kim, Dongjae Kang

http://doi.org/10.5626/JOK.2022.49.10.816

Cloud infrastructures have been expanding all over the world, and the types of services have diversified. Cloud users are adopting multi-cloud, in which two or more clouds are used to overcome the restrictions that might arise from using a single cloud. However, using multi-cloud increases the complexity of provisioning and managing cloud resources. In this paper, to alleviate these problems, we studied a way to provision multi-cloud infrastructures efficiently. In particular, we analyzed the performance of each cloud service through performance benchmarking and proposed a performance-based optimal provisioning technique. The main contributions of this paper are as follows: 1) it introduces a dynamic provisioning structure for multi-cloud infrastructures implemented through CB-Tumblebug of the Cloud-Barista open source project, 2) it presents a mechanism for evaluating the performance of heterogeneous cloud infrastructures, and 3) it analyzes the results of performance experiments on major cloud infrastructures. Lastly, we demonstrated the process of configuring and provisioning multi-cloud infrastructures based on their performance through CB-MapUI, a web client for CB-Tumblebug. The experimental results and demonstrations show the effectiveness of an appropriate multi-cloud infrastructure configuration.

Performance Evaluation Technique of Learning Model Based on Feature Cluster in Sensing Data of Collaborative Robots

Jinse Kim, Subin Bea, Ye-Seul Park, Jung-Won Lee

http://doi.org/10.5626/JOK.2022.49.10.824

Recently, attempts have been made to apply artificial intelligence models to the PHM (Prognostics and Health Management) of collaborative robots, a representative type of smart factory equipment. However, typical models are developed heuristically, without preprocessing or analysis of the sensing data collected by running test programs. Therefore, in this paper, we proposed a model performance evaluation method based on the feature cluster concept, which can analyze the features of cyclic time-series sensing data collected from collaborative robots. To demonstrate the effectiveness of the proposed method, we applied it to a program classification model, an internal component of the motion fault detection network, and identified characteristics of the data that contributed to performance degradation, which existing methods had not revealed. These results enabled a qualitative evaluation of the model's performance and provided directions for improving it.

A Time-Course Multi-Clustering Method for Single-Cell Trajectory Inference

Jaeyeon Jang, Inuk Jung

http://doi.org/10.5626/JOK.2022.49.10.838

From time-series single-cell transcriptome data, gene expression information can be generated to observe the timing of significant changes in cell differentiation while accounting for important biological phenomena in relation to experimental conditions. Due to the recent surge of time-series single-cell transcriptome data, studies on various dynamic variations in cells, such as the cell cycle and cell differentiation, have been actively conducted. In particular, time-series analysis of cell differentiation at the single-cell level is advantageous for biological interpretation compared to analysis at a single time point, because changes along the time axis can be observed. In this paper, we proposed a multi-clustering method that infers cell trajectories by considering time information at the gene level of time-series single-cell transcriptome data. Analyses of gene expression data on human neuron cell differentiation using this method showed results similar to the biological findings of a previous study.

Contract Eligibility Verification Enhanced by Keyword and Contextual Embeddings

Sangah Lee, Seokgi Kim, Eunjin Kim, Minji Kang, Hyopil Shin

http://doi.org/10.5626/JOK.2022.49.10.848

Contracts need to be reviewed to verify that they include all the clauses essential for them to be valid. Such clauses are highly formal and repetitive regardless of the kind of contract, and automated legal technologies are required for legal text comprehension. In this paper, we constructed a simple item-by-item classification model for contract clauses that estimates contract eligibility by exploiting the formal and repetitive properties of those clauses. We used keyword embeddings based on the conventional requirements of contracts and concatenated them with sentence embeddings of clauses extracted from a BERT model fine-tuned on legal documents. Contract eligibility can then be verified from the predicted labels. Using the additional keyword embeddings together with BERT embeddings, we report reasonable performance, with accuracies of 90.57 and 90.64 and F1-scores of 93.27 and 93.26.

Fast Personalized PageRank Computation on Very Large Graphs

Sungchan Park, Youna Kim, Sang-goo Lee

http://doi.org/10.5626/JOK.2022.49.10.859

Computation of Personalized PageRank (PPR) in graphs is an important function that is widely used in application domains such as search, recommendation, and knowledge discovery. As computing PPR is expensive, a good number of innovative and efficient algorithms for it have been developed. However, efficient computation of PPR on very large graphs with millions of nodes or more is still an open problem. Moreover, previously proposed algorithms cannot handle updates efficiently, which severely limits their ability to handle dynamic graphs. In this paper, we present a fast converging algorithm that guarantees high and controlled precision. We attempted to improve the convergence rate of the traditional Power Iteration approximation methods and fully exact methods. The results revealed that the proposed algorithm is at least 20 times faster than the Power Iteration and outperforms other state-of-the-art algorithms in terms of computation time.
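
For context, the Power Iteration baseline mentioned above can be sketched in a few lines. This is the textbook method, not the faster algorithm proposed in the paper. The sketch assumes a graph given as an adjacency dictionary in which every neighbour also appears as a key, a restart probability `alpha` of 0.15, and dangling nodes that return their mass to the source; all of these are illustrative choices, not details from the abstract.

```python
from typing import Dict, List


def personalized_pagerank(
    graph: Dict[str, List[str]],
    source: str,
    alpha: float = 0.15,
    tol: float = 1e-10,
    max_iter: int = 1000,
) -> Dict[str, float]:
    """Approximate PPR scores for `source` by plain power iteration."""
    nodes = list(graph)
    rank = {v: 0.0 for v in nodes}
    rank[source] = 1.0  # all probability mass starts at the source
    for _ in range(max_iter):
        nxt = {v: 0.0 for v in nodes}
        nxt[source] += alpha  # restart: jump back to the source with probability alpha
        for u in nodes:
            out = graph[u]
            if out:
                share = (1.0 - alpha) * rank[u] / len(out)
                for v in out:
                    nxt[v] += share
            else:
                # Dangling node (assumption): send its walking mass back to the source.
                nxt[source] += (1.0 - alpha) * rank[u]
        delta = sum(abs(nxt[v] - rank[v]) for v in nodes)
        rank = nxt
        if delta < tol:  # stop once the L1 change is negligible
            break
    return rank


if __name__ == "__main__":
    toy_graph = {"a": ["b"], "b": ["a"]}  # hypothetical two-node graph
    print(personalized_pagerank(toy_graph, "a"))
```

Each iteration shrinks the error by a factor of about 1 - alpha, which is why convergence is slow when high precision is required and why faster-converging methods are of interest.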

Analysis of QoQ GDP Prediction Performance Using Deep Learning Time Series Model

Yeonhee Lee, Youngmin Kim, Taewan You

http://doi.org/10.5626/JOK.2022.49.10.873

In this paper, we proposed an algorithm for predicting the GDP growth rate using a deep learning time series model of the kind that has recently attracted attention. The proposed algorithm adopts an ensemble deep learning method to ensure stable prediction performance from a large number of low-frequency economic time series. It also uses a gradual learning method to ensure adaptive performance even under business fluctuations. By demonstrating that performance could be improved by using economic sector information in learning, we confirmed the necessity of convergence with domain knowledge and emphasized the importance of AI operation technology for providing adaptive predictive power. Through a performance comparison with traditional machine learning models for the COVID-19 period, we showed that deep learning could be a relatively reasonable predictive tool under rapid economic fluctuations. The deep learning-based adaptive AI algorithm presented in this paper is expected to develop into a deep learning-based autonomous adaptive economic prediction system through combination with AI operation technology.

Structuralized External Knowledge and Multi-task Learning for Knowledge Selection

Junhee Cho, Youngjoong Ko

http://doi.org/10.5626/JOK.2022.49.10.884

Typically, task-oriented dialog systems use well-structured knowledge, such as databases, to generate the most appropriate responses to users' questions. However, to generate more appropriate and fluent responses, external knowledge in the form of unstructured text, such as web data or FAQs, is necessary. In this paper, we propose a novel multi-task learning method with a pre-trained language model and a graph neural network. The proposed method lets the system select external knowledge effectively, not only by understanding linguistic information but also by grasping the structural information latent in the external knowledge, which is converted into structured data (graphs) using a dependency parser. Experimental results show that our proposed method obtains higher performance than the traditional bi-encoder or cross-encoder methods that use pre-trained language models.

Entity Graph Based Dialogue State Tracking Model with Data Collection and Augmentation for Spoken Conversation

Haeun Yu, Youngjoong Ko

http://doi.org/10.5626/JOK.2022.49.10.891

As part of a task-oriented dialogue system, dialogue state tracking is the task of understanding the dialogue and extracting the user’s needs in slot-value form. Recently, Dialog System Technology Challenge (DSTC) 10 Track 2 posed a challenge to measure the robustness of dialogue state tracking models in a spoken conversation setting. The released evaluation dataset has three characteristics: a new multiple-value scenario, three times more entities, and utterances produced by an automatic speech recognition module. In this paper, to ensure the model’s robust performance, we introduce an extraction-based dialogue state tracking model with an entity graph. We also propose data collection and a template-based data augmentation method. Evaluation results show that our proposed method improves the extraction-based dialogue state tracking model by 1.7% in JGA and 0.57% in slot accuracy compared to the baseline model.

Spatial LSM Tree for Indexing Blockchain-based Geospatial Point Data

Minjun Seo, Taehyeon Kwon, Sungwon Jung

http://doi.org/10.5626/JOK.2022.49.10.898

Blockchain technology is attracting attention for its high usability in various fields such as IoT and healthcare, and it is being used as an alternative to distributed databases. Despite this high usability, techniques for efficiently indexing blockchain-based geospatial data have received little study so far. Therefore, in this paper, we propose a spatial LSM tree indexing method that reflects the write-intensive nature of the blockchain and reduces the I/O cost when a block of geospatial point data is inserted into it. The proposed method linearizes geospatial data through Geohash on a blockchain where large volumes of real-time updates occur. It also minimizes the I/O cost of processing range queries and inserting data into the blockchain by taking the spatial proximity of the point data into account. In addition, we propose a spatial filter that reduces unnecessary traversal of the spatial LSM tree when processing range queries over geospatial point data.
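
The linearization step mentioned above relies on Geohash, which maps a latitude/longitude pair to a short string so that nearby points tend to share a prefix and can be kept close together in a sorted structure. The sketch below is a plain implementation of the standard Geohash encoding only; it does not reproduce the spatial LSM tree or the spatial filter from the paper, and the function name and sample coordinates are illustrative.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash alphabet (no a, i, l, o)


def geohash_encode(lat: float, lon: float, precision: int = 8) -> str:
    """Encode a point as a Geohash string with `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate between longitude and latitude, longitude first
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


if __name__ == "__main__":
    # Two nearby sample points; their hashes share a long common prefix.
    print(geohash_encode(57.64911, 10.40744, 5))
    print(geohash_encode(57.64900, 10.40700, 5))
```

Because the result is an ordinary string, sorting by Geohash gives a one-dimensional key order that a write-optimized structure such as an LSM tree can store, and a range query can be narrowed to the keys that share the relevant prefixes.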

Identifying C# Obfuscation Tools Using API Sequence Analysis

Taekwang Hur, Yeoneo Kim, Junseok Cheon, Woojae Jo, Dongsu Song, Gyun Woo

http://doi.org/10.5626/JOK.2022.49.10.906

With the development of IT, the production of software is increasing, and obfuscation technology is actively being used to protect it. However, obfuscation technology is also used to hide malicious code. The use of .NET has increased recently, and malicious code using obfuscation technology is also growing. Although programs to which such .NET obfuscation technology has been applied can be analyzed with de4dot, it does not correctly detect the obfuscation tools and is vulnerable to obfuscation avoidance techniques. This study proposes a method to solve this problem. Specifically, it suggests automatically classifying obfuscation tools based on the similarity between programs, measured through analysis of API sequences, which can be considered a feature of the program. To measure the performance of the proposed method, we conducted experiments by applying seven obfuscation tools to five programs. The experimental results showed that de4dot had 42.8% accuracy, whereas the proposed system had 78.5% accuracy. In addition, all five programs to which obfuscation avoidance techniques were applied were classified accurately.
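
The abstract does not name the similarity measure, so the sketch below only illustrates the general idea of classifying by API sequence similarity: compare an unknown program's API call sequence with one reference sequence per obfuscation tool and pick the closest. It uses `difflib.SequenceMatcher` from the Python standard library as a stand-in measure. The tool labels and the reference sequences are invented for illustration and do not come from the paper.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def similarity(a: List[str], b: List[str]) -> float:
    """Return a similarity ratio in [0, 1] between two API call sequences."""
    return SequenceMatcher(None, a, b).ratio()


def classify(sample: List[str], references: Dict[str, List[str]]) -> str:
    """Return the label whose reference sequence is most similar to `sample`."""
    return max(references, key=lambda tool: similarity(sample, references[tool]))


if __name__ == "__main__":
    # Hypothetical reference sequences, one per (invented) obfuscation tool label.
    references = {
        "ToolA": ["Assembly.Load", "Convert.FromBase64String", "MethodInfo.Invoke"],
        "ToolB": ["String.Concat", "Encoding.GetString", "Activator.CreateInstance"],
    }
    unknown = ["Assembly.Load", "Convert.FromBase64String", "String.Concat", "MethodInfo.Invoke"]
    print(classify(unknown, references))
```

A nearest-reference rule like this is the simplest possible classifier; a practical system would likely keep several reference programs per tool and set a minimum similarity below which the sample is reported as unknown.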


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr