A Parallel Processing Scheme on TensorFlow for Improving Training and Validation Performance

Jinseo Choi, Donghyun Kang

http://doi.org/10.5626/JOK.2022.49.6.407

Most deep learning systems spend a large share of their time on model training and validation. However, they often waste GPU and CPU resources because single-threaded pre-processing and batch processing introduce wait time. In this paper, we propose a new scheme that handles the training and validation processes efficiently with multiple threads. The proposed scheme overlaps the training and validation processes as much as possible by using a model copy operation and running the processes on multiple threads. As a result, it improves the overall utilization of the CPU and GPU. For evaluation, we implemented a convolutional neural network (CNN) using the TensorFlow framework. The results confirm that the proposed scheme reduces the total training and validation time by up to 22.4% compared with traditional schemes.
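
The overlap idea can be illustrated with a minimal Python sketch. It is not the authors' TensorFlow implementation: the dictionary model and the functions `train_one_epoch` and `validate` are hypothetical stand-ins, and `copy.deepcopy` stands in for the model copy operation.

```python
import copy
import threading

def train_one_epoch(model):
    # Hypothetical stand-in for a training epoch: update the model in place.
    model["weights"] = [w + 1.0 for w in model["weights"]]
    model["epoch"] += 1

def validate(snapshot, results):
    # Hypothetical stand-in for validation: score a frozen copy of the model.
    results[snapshot["epoch"]] = sum(snapshot["weights"])

def train_with_overlapped_validation(model, epochs):
    results = {}
    pending = []
    for _ in range(epochs):
        train_one_epoch(model)
        # Copy the model so that validation sees a consistent snapshot
        # while the next epoch keeps updating the original.
        snapshot = copy.deepcopy(model)
        worker = threading.Thread(target=validate, args=(snapshot, results))
        worker.start()
        pending.append(worker)
    # Wait for all outstanding validation runs before reporting.
    for worker in pending:
        worker.join()
    return results

scores = train_with_overlapped_validation({"weights": [0.0, 0.0], "epoch": 0}, epochs=3)
```

In CPython, threads overlap only for work that releases the global interpreter lock, so the benefit of this pattern depends on how much time is spent inside native numerical kernels and I/O.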

Video Object Detection Network by Estimation of Center and Movement of The Object by Stacking Continuous Images

Hayoung Son, Yujin Lee, Kaewon Choi

http://doi.org/10.5626/JOK.2022.49.6.416

In an environment such as a spacious port, which is difficult to monitor all at once, various obstacles such as large containers and logistics machines are present. We studied object detection methods to track very small pedestrians and port vehicles in this setting. Because the model must learn small objects and unclear shapes, we trained a model based on CenterNet, an anchor-free network, and stacked several consecutive images as input to supplement information on very small objects. In addition, the lack of datasets for this special environment was addressed through data augmentation that combines multiple datasets, randomly selects multiple still images, and processes them into a continuous image sequence, thereby preventing overfitting.

Development of an Apartment Price Change Rate Prediction Model with Geographical Adjacency

Sunkyung Park, Minho Lee

http://doi.org/10.5626/JOK.2022.49.6.424

Recently, in the real estate market, decoupling, in which housing prices fluctuate differently by region, has been escalating. This phenomenon implies that each region is composed of districts that are adjacent to one another. This thesis confirms that the prices of a district change in synchronization with those of the adjacent districts and proves that the fluctuations in apartment prices in the districts within Seoul are due to their neighbors. The rate of change in apartment prices, macroeconomic indicators, and private education indicators are used to test the hypothesis with a 3D (time, distance, and attribute) model, which is then analyzed using a CNN. The model considers the situation of neighbors and is subdivided into the following three sub-models: consideration only for the target area (I), consideration for long-distance areas (II), and change in the number of neighbors (III). The metrics used are mean absolute error and mean directional accuracy. The model with neighbors performed better than the persistence model and XGBoost. Furthermore, its sub-models performed best in the order of model III (with 3 neighbors), II, and I. This study clearly shows that the factor "neighbor" affects the rate of change in apartment prices.
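
The two metrics named in the abstract can be sketched as follows. This is an illustration only: it assumes that direction means the sign of the rate of change, which may differ from the paper's definition, and the sample values are made up.

```python
def mean_absolute_error(actual, predicted):
    # Average magnitude of the prediction error.
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def sign(x):
    # Returns -1, 0, or 1.
    return (x > 0) - (x < 0)

def mean_directional_accuracy(actual, predicted):
    # Fraction of periods in which the predicted rate of change has the
    # same sign (up, down, or flat) as the actual rate of change.
    hits = sum(1 for a, p in zip(actual, predicted) if sign(a) == sign(p))
    return hits / len(actual)

# Made-up rates of change, for illustration only.
actual = [0.5, -0.2, 0.1, -0.4]
predicted = [0.3, 0.1, 0.2, -0.5]
mae = mean_absolute_error(actual, predicted)
mda = mean_directional_accuracy(actual, predicted)
```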

Efficient Compilation Error Localization with DNN

Minji Bae, Jongmoon Baik

http://doi.org/10.5626/JOK.2022.49.6.434

Few programs are free of compilation errors. The compiler provides programmers with error messages as clues for solving the problem, but analyzing those messages correctly is also time-consuming. Although many methods for error localization and repair have been proposed, most of them use data from novice programmers or can be applied only to one specific programming language, which makes them difficult to apply in practice to large-scale industrial projects. In this study, to increase the efficiency of compilation error handling in practical projects, we propose DeepErrorFinder, which identifies the location of compilation errors using a DNN. The model, which is based on LSTM, predicts the error location after being trained on compilation error logs and repair changes from mobile phone software development projects. In experiments, it achieved an accuracy of 52% and reduced the elapsed time compared with a manual search. It can help developers quickly find the location of the code that causes a compilation error in practical projects.

RESEDA: Software REliability Model SElection using DAta-driven Software Reliability Prediction

Nakwon Lee, Duksan Ryu, Ilhoon Cho, Jeakun Song, Jongmoon Baik

http://doi.org/10.5626/JOK.2022.49.6.443

No single best model fits all types of software failure data. To address this model generalization problem, model selection techniques and data-driven reliability prediction techniques have been proposed. However, model selection techniques still select the wrong model for some failure data, and the reliability metrics that data-driven techniques can provide are limited. In this paper, we propose a software reliability model selection technique that uses data-driven reliability prediction to improve prediction accuracy while still obtaining reliability metrics. The proposed approach decides between model selection and data-driven prediction for the target failure data using a classifier generated from historical failure data sets. If data-driven prediction is chosen, the approach builds augmented failure data from the prediction results of the data-driven technique and selects a model for the augmented data. The proposed approach shows a 21% lower median value of the mean error of prediction than the best of the compared techniques. With the improved reliability prediction accuracy of the proposed approach, higher software reliability can be achieved.

Fair Hungarian Algorithm for Swarming Drone Flight Formation Transformation

SungTae Moon

http://doi.org/10.5626/JOK.2022.49.6.459

The drone show at the 2018 Pyeongchang Winter Olympics impressed people through the convergence of technology and art in the sky. For stable swarm flight, the system should provide efficient communication, accurate position estimation, and fast, efficient, collision-free scenarios. In particular, the scenario transformation algorithm is a core technology of the drone show and can be formulated as an assignment problem. The Hungarian algorithm is commonly used for the assignment problem. However, it is not suitable for formation transformation in swarm flight because it does not take the battery usage of individual drones into account. As a result, some drones move much farther than others, which increases their battery consumption and reduces the operating time. In this paper, a fair Hungarian algorithm is proposed to increase operating time by making battery consumption fair. The proposed algorithm was verified using the swarm flight system at a drone show performed with 100 drones.
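
The difference between minimizing total movement and balancing movement across drones can be seen in a small Python sketch. It is not the paper's algorithm: it brute-forces every assignment instead of running the Hungarian algorithm, and it uses a min-max (bottleneck) objective as an assumed stand-in for fairness. The cost matrix is made up.

```python
from itertools import permutations

def assign(cost, objective):
    """Brute-force the drone-to-target assignment that minimizes `objective`.

    cost[d][t] is the distance drone d must fly to reach target t.
    Feasible only for tiny examples; the Hungarian algorithm solves the
    minimum-total case in polynomial time.
    """
    n = len(cost)

    def moves(perm):
        return [cost[drone][perm[drone]] for drone in range(n)]

    return min(permutations(range(n)), key=lambda perm: objective(moves(perm)))

# Made-up distances for two drones and two targets.
cost = [[0, 5],
        [5, 9]]

# Classic objective: minimize the total distance flown.
min_total = assign(cost, sum)

# Assumed fairness objective: minimize the longest single flight,
# breaking ties by total distance.
fair = assign(cost, lambda m: (max(m), sum(m)))
```

Here the minimum-total assignment sends one drone 9 units while the other stays put, whereas the min-max assignment has both drones fly 5 units, so no single battery is drained disproportionately.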

TabNet-based Early Prediction and Interpretation of Work-in-process Inventory for Semiconductor Manufacturing

Seongjin Bang, Haeji Go, Sangmin Lee

http://doi.org/10.5626/JOK.2022.49.6.466

In this study, we propose using TabNet, a deep learning model effective for tabular datasets, to predict the average and maximum levels of work-in-process (WIP) in a semiconductor plant. WIP estimation is essential for decisions on expanding factory infrastructure facilities, because under- or over-estimation of WIP causes production inefficiency and unnecessary costs, resulting in production loss. To resolve this problem, we present a framework that accurately predicts the average and maximum levels of WIP and analyzes the main causes of changes in the level of WIP. We conducted experiments to show that TabNet outperforms competing machine-learning methods. The results show that the proposed approach obtained R² values of 0.86 and 0.95 for the average and maximum levels of WIP, respectively. Furthermore, a model-agnostic interpretation method, Shapley additive explanations, was used to identify the significant variables for the predictions.

PrefixLM for Korean Text Summarization

Kun-Hui Lee, Seung-Hoon Na, Joon-Ho Lim, Tae-Hyeong Kim, Du-Seong Chang

http://doi.org/10.5626/JOK.2022.49.6.475

In this paper, we examine the effectiveness of PrefixLM, which has half the parameters of T5's encoder-decoder architecture, for Korean text generation tasks. Unlike T5, where input and output sequences are provided separately, the transformer block of PrefixLM takes a single sequence that concatenates the input and output sequences. By designing the attention mask, PrefixLM performs bi- and uni-directional attention on the input and output sequences, respectively, thereby playing the roles of both encoder and decoder with a single transformer block. Experimental results on a Korean abstractive document summarization task show that PrefixLM improves the Rouge-F1 score by 2.17 and 2.78 over BART and T5, respectively, implying that PrefixLM is promising for Korean text generation tasks.
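
A prefix-LM attention mask over the concatenated sequence can be sketched in plain Python. The sketch assumes the usual prefix-LM convention, in which input (prefix) positions are visible to every position and output positions are visible only to themselves and later positions; the function name and the 0/1 encoding are illustrative, not taken from the paper.

```python
def prefix_lm_mask(input_len, output_len):
    """Build a mask where mask[q][k] == 1 means query q may attend to key k."""
    total = input_len + output_len
    mask = []
    for query in range(total):
        row = []
        for key in range(total):
            # Input (prefix) keys are visible to all queries, which gives
            # bidirectional attention within the input. Output keys are
            # visible only causally, i.e. to the same or later positions.
            visible = key < input_len or key <= query
            row.append(1 if visible else 0)
        mask.append(row)
    return mask

# Two input tokens followed by two output tokens.
mask = prefix_lm_mask(input_len=2, output_len=2)
```

With this mask, one transformer stack behaves like an encoder over the first `input_len` positions and like an autoregressive decoder over the remaining ones.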

A Privacy-preserving Histogram Construction Method Guaranteeing the Differential Privacy

In Cheol Baek, Jongseon Kim, Yon Dohn Chung

http://doi.org/10.5626/JOK.2022.49.6.488

With the widespread use of data collection and analysis, the need to preserve the privacy of individuals is growing. Various privacy models have been proposed to guarantee privacy while collecting and analyzing data. Among them, differential privacy stands as the de facto standard. In this paper, we propose a privacy-preserving histogram construction method guaranteeing differential privacy. The proposed method consists of histogram bin setting and frequency calculation stages. In the first stage, we apply the Laplace mechanism to heuristic bin setting algorithms to select the number of bins in a differentially private manner. In the second stage, we apply the Laplace mechanism to the frequency of each bin to output differentially private frequencies. We prove that the proposed method guarantees differential privacy and compare its accuracy across privacy budget values and distribution rates through experiments.
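
The frequency calculation stage can be illustrated with a minimal Python sketch of the Laplace mechanism on fixed bins. It is not the authors' method: the bin-setting stage is omitted, `epsilon` is assumed to be the budget for this stage alone, and a sensitivity of 1 is assumed (neighboring datasets differ by adding or removing one record). The function names and data are hypothetical.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_histogram(values, edges, epsilon, rng=None):
    """Return noisy counts for the half-open bins [edges[i], edges[i + 1])."""
    rng = rng or random.Random()
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    # Assumed sensitivity 1: adding or removing one record changes
    # exactly one bin count by one.
    scale = 1.0 / epsilon
    return [c + laplace_noise(scale, rng) for c in counts]

# Made-up data; the fixed seed only makes the illustration repeatable.
noisy = private_histogram([1, 2, 2, 7, 8], edges=[0, 5, 10],
                          epsilon=1.0, rng=random.Random(0))
```

A smaller `epsilon` gives stronger privacy but noisier counts, which is the accuracy trade-off the experiments in the abstract examine across privacy budget values.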

A Study on the Architecture of Cyber Public Information Forensic Tools for Investigation to Obtain the Court Evidence Ability

Jeongho Lee, Minchang Kang, HyunSeok Kang, Jaehoon Jang, Homook Cho

http://doi.org/10.5626/JOK.2022.49.6.494

Although recent developments in Internet technology have brought many benefits to our lives, adverse effects such as Internet-based cybercrime have also increased. To investigate such cybercrime effectively, it is essential to collect, store, and process cyber public information from a digital forensics perspective. However, related laws, such as the current Criminal Procedure Act, do not yet explicitly stipulate cyber public information forensics, and deletion of the original data may also contribute to this problem. In this paper, we propose a novel architecture for a cyber public information forensic tool for investigation, which secures the admissibility as legal evidence of the cyber public information collected for the effective investigation of cybercrime. We also present a technical approach from a digital forensics perspective to demonstrate the integrity, identity, reproducibility, and authenticity of digital evidence, which must be maintained while collecting and storing cyber public information using the proposed tool.


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr