Vol. 50, No. 8, Aug. 2023
Learning Functional Characteristics of Malware Attacks with Graph Transformer based on Control Flow
http://doi.org/10.5626/JOK.2023.50.8.633
To minimize false negatives in malware classification, it is important to capture local characteristics of a program, such as the control flow between operation blocks and memory-register addresses. However, existing methods that optimize the loss function of a classifier without considering the functional characteristics of malware have limited recall in the face of new attack paths and complex control flow graphs. In this paper, we propose a method that explicitly samples and embeds control flow graphs to learn functional characteristics, such as API calls, rootkit DLL installation, and specific virtual memory access, and thereby improve recall. To model the functional patterns of malware from the control flow graphs, we sample attack paths from the control flow of the malware and classify the types of malware using a transformer-based graph embedding function. We evaluate the proposed method using a real-world malware benchmark dataset, Microsoft Challenge. By explicitly learning the control flow of the malware, we achieved a recall of 97.89% and significantly improved the accuracy (99.45%) compared to the latest and most advanced method's classification accuracy (97.89%).
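The abstract does not say how attack paths are sampled from the control flow graph. As a rough illustration only, the sketch below assumes the graph is an adjacency dictionary of basic-block names and draws paths by uniform random walks from an entry block. The function name `sample_paths`, its parameters, and the random-walk strategy are all hypothetical and are not taken from the paper.

```python
import random


def sample_paths(cfg, entry, num_paths, max_len, seed=0):
    """Draw random-walk paths from a control flow graph.

    cfg maps each basic block to a list of successor blocks.
    This is an assumed sampling strategy, not the paper's method.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    paths = []
    for _ in range(num_paths):
        node = entry
        path = [node]
        # Walk until a block has no successors or the length cap is hit.
        while len(path) < max_len:
            successors = cfg.get(node, [])
            if not successors:
                break
            node = rng.choice(successors)
            path.append(node)
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Hypothetical toy graph with four basic blocks.
    toy_cfg = {
        "entry": ["a", "b"],
        "a": ["exit"],
        "b": ["a", "exit"],
        "exit": [],
    }
    for p in sample_paths(toy_cfg, "entry", num_paths=3, max_len=10):
        print(" -> ".join(p))
```

Each sampled path is a sequence of block identifiers, which is the kind of input a sequence or graph embedding model could consume. The length cap keeps walks finite on graphs that contain loops.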
Code Generation and Data Layout Transformation Techniques for Processing-in-Memory
Hayun Lee, Gyungmo Kim, Dongkun Shin
http://doi.org/10.5626/JOK.2023.50.8.639
Processing-in-Memory (PIM) capitalizes on internal parallelism and bandwidth within memory systems, thereby achieving superior performance to CPUs or GPUs in memory-intensive operations. Although many PIM architectures have been proposed, compiler issues for PIM have not yet been well studied. To generate efficient program code for PIM devices, the PIM compiler must optimize operation schedules and data layouts. Additionally, the register reuse of PIM processing units must be maximized to reduce data movement traffic between the host and PIM devices. We propose a PIM compiler that can support various PIM architectures. It achieves up to a 2.49-fold performance improvement in GEMV operations through register reuse optimization.
Review-based Personalized Recommendation System using Effective Personalized Fusion and BERT
http://doi.org/10.5626/JOK.2023.50.8.646
Generally, review texts contain personal information from users, and reviews written by different users can have different meanings even if they use exactly the same wording. These review features can be used to compensate for the shortcomings of collaborative filtering, which is vulnerable to data sparsity. They can also be used as information for personalized recommendation systems. Despite the success of pre-trained language models in natural language processing, there has been little research on personalized recommendation systems that leverage BERT to enrich individual user features from reviews. In this work, we propose a rating prediction model that uses BERT for detailed learning of user-specific and item-specific features from reviews and tightly combines them with user and product IDs to represent personalized users and items. Experimental results show that the proposed model achieves improved performance over the baseline on the Amazon benchmark dataset.
Non-autoregressive Korean Morphological Analysis with Word Segment Information
http://doi.org/10.5626/JOK.2023.50.8.653
This paper introduces a non-autoregressive Korean morphological analyzer. The proposed morphological analyzer utilizes a transformer encoder to encode a given sentence and employs two non-autoregressive decoders for morphological analysis. One decoder generates a morpheme sequence and the other generates the corresponding POS tag sequence, and the two sequences are then combined to produce the final morphological analysis. Additionally, this paper leverages word segment information within the sentence to predict the target sequence length, mitigating performance degradation resulting from incorrect target sequence length predictions. Experimental results show that the proposed non-autoregressive Korean morphological analyzer outperforms all non-autoregressive baselines. It achieves accuracy comparable to that of an autoregressive Korean morphological analyzer while running about 14.76 times faster.
Deep Neural Network-Based Automated Essay Trait Scoring Model Incorporating Argument Structure Information
http://doi.org/10.5626/JOK.2023.50.8.662
Automated essay scoring is the task of having a model read a given essay and evaluate it automatically. This paper presents a method for automated essay scoring that creates essay representations reflecting the argument structure of the essay using Argument Mining and learns a separate essay representation for each trait score. The results of our experiments indicate that the proposed essay representation outperforms representations obtained from pre-trained language models. Furthermore, learning a different representation for each evaluation criterion was found to be more effective for essay evaluation. The performance of the proposed model, as measured by the Quadratic Weighted Kappa (QWK) metric, improved from 0.543 to 0.627, showing a high level of agreement with human evaluations. Qualitative evaluations also showed that the proposed model had evaluation tendencies similar to those of human evaluators.
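For readers unfamiliar with the Quadratic Weighted Kappa metric mentioned above, the following is a minimal sketch of its standard definition for integer scores. The function name and argument names are illustrative, and the score range is assumed to be known in advance; this is not code from the paper.

```python
from collections import Counter


def quadratic_weighted_kappa(human, model, min_score, max_score):
    """Agreement between two lists of integer scores.

    1.0 means perfect agreement and 0.0 means chance-level agreement.
    """
    if len(human) != len(model) or not human:
        raise ValueError("score lists must be non-empty and equal in length")
    num_levels = max_score - min_score + 1
    if num_levels < 2:
        raise ValueError("at least two score levels are required")

    total = len(human)
    observed = Counter(zip(human, model))  # joint counts of (human, model)
    hist_human = Counter(human)            # marginal counts per rater
    hist_model = Counter(model)

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            # Quadratic penalty: larger disagreements cost more.
            weight = (i - j) ** 2 / (num_levels - 1) ** 2
            weighted_observed += weight * observed[(i, j)]
            # Expected count if the two raters were independent.
            weighted_expected += weight * hist_human[i] * hist_model[j] / total

    if weighted_expected == 0:
        # Both raters gave one identical constant score.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    print(quadratic_weighted_kappa([1, 2, 3, 4], [1, 2, 3, 3], 1, 4))
```

Because the penalty grows with the square of the score difference, a model that is off by one point on many essays is penalized far less than one that is off by several points on a few.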
Time-Series Data Augmentation Based on Adversarial Training
http://doi.org/10.5626/JOK.2023.50.8.671
Recently, time-series data have been generated in various industries with the advancement of the Internet of Things (IoT). Accordingly, demand for time-series forecasting is increasing across industries. With the acquisition of large amounts of time-series data, studies on forecasting with both traditional statistical methods and deep learning have become active, and the need for data augmentation techniques has emerged. In this paper, we propose a novel data augmentation method for time-series forecasting based on adversarial training. Unlike conventional adversarial training, the proposed method fixes the hyperparameter for the number of adversarial training iterations and utilizes blockwise clipping of perturbations. We carried out various experiments to verify the performance of the proposed method and confirmed that it consistently improves performance on various datasets. In addition, the necessity of the blockwise clipping and the fixed hyperparameter value proposed in this paper was verified through comparative experiments.
A Contrastive Learning Method for Automated Fact-Checking
Seonyeong Song, Jejun An, Kunwoo Park
http://doi.org/10.5626/JOK.2023.50.8.680
As online misinformation proliferates, the importance of automated fact-checking, which enables real-time evaluation, has been emphasized. In this study, we propose a contrastive learning method for automated fact-checking in Korean. The proposed method treats a sentence similar to the evidence as a positive sample for determining the authenticity of a given claim. In evaluation experiments, we found that the proposed method was more effective in the sentence selection step, which finds evidence sentences for a given claim, than previous methods such as a fine-tuned pre-trained language model and SimCSE. This study shows the potential of contrastive learning for automated fact-checking.
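The abstract does not give the training objective. As a general illustration of contrastive learning with a positive sample, the sketch below computes an InfoNCE-style loss over cosine similarities, the family of objectives that SimCSE also belongs to. The function names, the plain-list embeddings, and the temperature value are assumptions made for this example, not details from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("zero vector has no direction")
    return dot / (norm_u * norm_v)


def info_nce_loss(anchor, positive, negatives, temperature=0.05):
    """Cross-entropy of picking the positive among all candidates.

    anchor:    embedding of the claim (assumed role)
    positive:  embedding of a sentence similar to the evidence
    negatives: embeddings of unrelated sentences
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[0]


if __name__ == "__main__":
    claim = [1.0, 0.0]
    evidence_like = [0.9, 0.1]
    unrelated = [[0.0, 1.0], [-1.0, 0.2]]
    print(info_nce_loss(claim, evidence_like, unrelated))
```

Minimizing this loss pulls the claim embedding toward the positive sentence and pushes it away from the negatives, which is what makes the learned similarity usable for ranking candidate evidence sentences.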
Comparative Analysis of Accuracy and Stability of Software Reliability Estimation Models based on Recurrent Neural Networks
Taehyoun Kim, Duksan Ryu, Jongmoon Baik
http://doi.org/10.5626/JOK.2023.50.8.688
Existing studies on software reliability estimation based on recurrent neural networks have used networks to create one model under the same conditions and evaluated the accuracy of that model. However, due to the randomness of artificial neural networks, such recurrent neural networks can produce different training results even under the same conditions, which can lead to inaccurate software reliability estimation. Therefore, this paper compares and analyzes which recurrent neural networks can estimate software reliability more stably and accurately. We estimated software reliability in eight real projects using three representative recurrent neural networks and compared the performance of these models in terms of accuracy and stability. As a result, Long Short-Term Memory showed the most stable and accurate software reliability estimation performance. The results of this study are expected to help in selecting a more accurate and stable software reliability estimation model.
A Model for Topic Classification and Extraction of Sentimental Expression using a Lexical Semantic Network
JiEun Park, JuSang Lee, JoonChoul Shin, ChoelYoung Ock
http://doi.org/10.5626/JOK.2023.50.8.700
The majority of previous sentiment analysis studies classified a single sentence or document into only a single sentiment. However, more than one sentiment can exist in one sentence. In this paper, we propose a method that extracts sentimental expressions in word units. The proposed model is a UBERT model that takes morphologically analyzed sentences as input and adds layers to predict the topic classification and sentimental expressions. The proposed model uses the topic feature of a sentence predicted by a topic dictionary. The topic dictionary is built at the beginning of machine learning: the learning module collects topic words from a training corpus and expands them using the lexical semantic network. The evaluation is performed with the word-unit F1-Score. The proposed model achieves an F1-Score of 58.19%, an improvement of 0.97 percentage points over the baseline.
Deep Learning-based Models for Disaster Situation Awareness and Response Support
Eunjung Kwon, Minjung Lee, Hyuinho Park, Kyu-Chul Lee
http://doi.org/10.5626/JOK.2023.50.8.712
This paper studies decision support models that help the control room receptionist who handles 119 report reception, work that is directly related to the lives and property of the people, recognize and respond to disaster situations. To provide a prompt, accurate, and effective first response to emergency reports, it is essential to respond systematically according to the received situation from the beginning of the report. However, there are limitations in making decisions based on the individual capabilities of the 119 dispatcher in the face of various reports and frequently changing field conditions. Therefore, this paper proposes a deep learning-based disaster situation awareness model and a response support model that apply to the report reception work based on the 119 situation management standard manual. Lastly, we confirm the validity of the proposed method through experiments.
Efficient Distributed Training Method Considering the Energy Level of Edge Devices in Solar-powered Edge AIoT Environments
Yeontae Yoo, IKjune Yoon, Dong Kun Noh
http://doi.org/10.5626/JOK.2023.50.8.720
Solar-powered IoT devices periodically harvest energy and therefore can fundamentally solve the energy limitation of battery-based IoT devices. However, a careful energy consumption policy is required due to the variation in the amount of energy harvested. There is growing interest in AI-distributed training models that can improve the quality and performance of training by performing a small amount of training at each edge node and sharing the results with neighbors. However, the straggler node problem may occur in such distributed models, significantly decreasing the overall training speed and exponentially reducing the lifespan of the IoT network due to insufficient energy at specific nodes. This study proposes a technique to prevent the occurrence of straggler nodes as much as possible for efficient AI-distributed training in an AIoT environment composed of solar-powered devices. The proposed scheme uses an approximate computing technique that adapts energy consumption by adjusting the accuracy according to each node’s harvested energy while retaining the minimum accuracy required by the application. Among various approximate computing schemes, this study uses a data-level approximation scheme that adjusts the accuracy by adjusting the sampling rate of the sensing data. The experimental results confirm that the proposed scheme reduces the generation of straggler nodes through efficient and balanced use of each node’s harvested energy.
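The abstract does not specify how the sampling rate is derived from harvested energy. Purely as an illustration of data-level approximation, the sketch below assumes a linear policy: a node uses a sampling rate proportional to its harvested energy, bounded below by a floor that stands in for the application's minimum accuracy. The function names, the `full_rate_energy` parameter, and the floor value are hypothetical.

```python
def choose_sampling_rate(harvested_energy, full_rate_energy, min_rate=0.25):
    """Map a node's harvested energy to a sampling rate in [min_rate, 1].

    full_rate_energy is the (assumed) energy needed to process every
    sample; min_rate stands in for the application's accuracy floor.
    """
    if full_rate_energy <= 0:
        raise ValueError("full_rate_energy must be positive")
    if not 0 < min_rate <= 1:
        raise ValueError("min_rate must be in (0, 1]")
    ratio = max(0.0, harvested_energy) / full_rate_energy
    return min(1.0, max(min_rate, ratio))


def subsample(readings, rate):
    """Keep an evenly spaced fraction of the sensor readings."""
    if not 0 < rate <= 1:
        raise ValueError("rate must be in (0, 1]")
    if not readings:
        return []
    keep = max(1, round(len(readings) * rate))
    step = len(readings) / keep
    return [readings[int(i * step)] for i in range(keep)]


if __name__ == "__main__":
    readings = list(range(8))  # hypothetical sensor readings
    rate = choose_sampling_rate(harvested_energy=50, full_rate_energy=100)
    print(rate, subsample(readings, rate))
```

A node with little harvested energy trains on fewer samples and so finishes its local round with less work, which is the intuition behind using approximation to avoid stragglers. Any real policy would need to be calibrated against the accuracy the application actually requires.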
Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal
Editorial Office
- Tel. +82-2-588-9240
- Fax. +82-2-521-1352
- E-mail. chwoo@kiise.or.kr