Predicting the Cache Performance Benefits for In-memory Data Analytics Frameworks

Minseop Jeong, Hwansoo Han

http://doi.org/10.5626/JOK.2021.48.5.479

In-memory data analytics frameworks keep intermediate results in caching facilities to improve performance. For effective caching, the actual performance benefit of cached data should be taken into consideration. Because existing frameworks measure execution times only at the distributed task level, they are limited in how accurately they can predict the performance benefits of caching. In this paper, we propose an operator-level time measurement method, which combines the existing task-level execution time measurement with our cost prediction model based on input data sizes. Based on the proposed model and the execution flow of the application, we propose a method for predicting the performance benefits of data caching. The proposed model provides opportunities for cache optimization using the predicted performance benefits. Our cost model for operators showed an average prediction error rate of 7.3% when measured with 10x input data. The difference between predicted and actual performance was within 24%.
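As a rough illustration of predicting operator cost from input size, the sketch below fits a straight line to measured (size, time) pairs and extrapolates to a larger input. The linear form, the function names, and the "saved time = recompute time x number of reuses" estimate are assumptions made for this example, not the cost model described in the paper.

```python
def fit_linear_cost(sizes, times):
    """Least-squares fit of time = a * size + b (assumed linear cost model)."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, times))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def predict_time(model, size):
    """Predicted operator time for a given input size."""
    a, b = model
    return a * size + b


def caching_benefit(recompute_time, reuse_count):
    """Hypothetical benefit: time saved by reusing a cached result
    instead of recomputing it on each later use."""
    return recompute_time * reuse_count


# Operator times (seconds) measured on small inputs (hypothetical numbers).
model = fit_linear_cost([1, 2, 4], [2.0, 4.0, 8.0])
predicted = predict_time(model, 40)  # extrapolate to a 10x larger input
print(predicted, caching_benefit(predicted, reuse_count=3))
```

In practice the functional form would have to be chosen per operator; a straight line is used here only to keep the sketch short.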

Stochastic Response Time Analysis for Autonomous Vehicle Computing Systems

Hyoeun Lee, Kanghee Kim, Kilho Lee

http://doi.org/10.5626/JOK.2021.48.5.486

This paper presents a method of probabilistically analyzing the end-to-end response time from sensing to actuation in autonomous vehicle computing systems. The end-to-end response time is used to evaluate vehicle responsiveness and to derive various indicators of vehicle safety. For example, given the end-to-end response time from sensing an obstacle to stopping the vehicle, an upper limit on the vehicle’s speed can be defined. In addition, the proposed analysis can be used to determine how many computing resources should be invested in improving vehicle responsiveness. This paper proposes a safe analytical method under the assumption that ERF (Earliest Release First) scheduling is used and that each task is pinned to a certain CPU, and presents responsiveness results for an open-source autonomous driving stack called Autoware.
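To make the speed-limit example concrete, the sketch below derives a maximum speed from a response time under a simple kinematic assumption: the vehicle travels at constant speed during the response time and then brakes with constant deceleration. This braking model and the names `max_safe_speed`, `distance_m`, `response_time_s`, and `decel_mps2` are assumptions for illustration and are not taken from the paper.

```python
import math


def max_safe_speed(distance_m, response_time_s, decel_mps2):
    """Largest speed v (m/s) such that the vehicle stops within distance_m.

    Assumed stopping distance: d = v * t + v**2 / (2 * a),
    where t is the end-to-end response time and a the braking deceleration.
    Solving the quadratic for v gives v = a * (-t + sqrt(t**2 + 2 * d / a)).
    """
    t, a, d = response_time_s, decel_mps2, distance_m
    return a * (-t + math.sqrt(t * t + 2.0 * d / a))


# Obstacle sensed 50 m ahead, 0.5 s response time, 5 m/s^2 deceleration.
print(max_safe_speed(50.0, 0.5, 5.0))  # 20.0 m/s
```

With a probabilistic response-time distribution, the same formula could be evaluated at a chosen high percentile of the response time rather than at a single value.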

Performance Improvement of Neural Network-based Detection of ROP Attacks using Abstraction of Instruction Features

Hyungyu Lee, Changwoo Pyo

http://doi.org/10.5626/JOK.2021.48.5.493

Return-oriented programming (ROP) is a program attack technique that executes code snippets in memory following an attacker-intended order using return instructions. This paper proposes a method of detecting ROP attacks using neural networks. The method reduces the size of the data by using abstraction of instruction features relevant to ROP attacks rather than entire bits of instructions and activates the neural networks only for 12 instructions after a return instruction. Our experiments on a web server, browser, and the necessary libraries show speedups of 9.6 and 1,403.1 over DeepCheck and HeNet with an F1 score of 100.

2-Phase Passage Re-ranking Model based on Neural-Symbolic Ranking Models

Yongjin Bae, Hyun Kim, Joon-Ho Lim, Hyun-ki Kim, Kong Joo Lee

http://doi.org/10.5626/JOK.2021.48.5.501

Previous research on QA systems has focused on extracting exact answers for given questions and passages. However, when the problem is expanded from machine reading comprehension to open-domain question answering, finding the passage that contains the correct answer is as important as machine reading comprehension itself. DrQA reported that Exact Match@Top1 performance decreased from 69.5 to 27.1 when the QA system included an initial search step. In the present work, we propose a 2-phase passage re-ranking model to improve the performance of the question answering system. The proposed model integrates the results of symbolic and neural ranking models and re-ranks the passages. The symbolic ranking model was trained with the CatBoost algorithm on hand-crafted features of the question and passage. The neural model was trained by fine-tuning the KorBERT model. The second-stage model was trained as a neural regression model. We maximized performance by combining ranking models with different characteristics. On an evaluation of 1,000 questions, the proposed model achieved 85.8% MRR and 82.2% BinaryRecall@Top1 (BR@Top1), improvements of 17.3% (MRR) and 22.3% (BR@Top1) over the baseline model.
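For readers unfamiliar with the reported metrics, the sketch below computes mean reciprocal rank (MRR) from ranked relevance labels. The `binary_recall_at_1` function reflects an assumed reading of BinaryRecall@Top1 (the fraction of questions whose top-ranked passage contains the answer); the abstract does not define the measure, so treat that part as hypothetical.

```python
def mean_reciprocal_rank(ranked_labels):
    """ranked_labels: one list per question, ordered by rank;
    each entry is True if that passage contains the answer."""
    total = 0.0
    for labels in ranked_labels:
        for rank, relevant in enumerate(labels, start=1):
            if relevant:
                total += 1.0 / rank  # only the first relevant passage counts
                break
    return total / len(ranked_labels)


def binary_recall_at_1(ranked_labels):
    """Assumed definition: share of questions whose top-1 passage is relevant."""
    hits = sum(1 for labels in ranked_labels if labels and labels[0])
    return hits / len(ranked_labels)


rankings = [[True, False], [False, True]]
print(mean_reciprocal_rank(rankings))  # 0.75
print(binary_recall_at_1(rankings))    # 0.5
```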

Number Normalization in Korean Using the Transformer Model

Jaeyoon Chun, Chansong Jo, Jeongpil Lee, Myoung-Wan Koo

http://doi.org/10.5626/JOK.2021.48.5.510

Text normalization is a significant component of text-to-speech (TTS) systems. Since numbers in Korean are read in various ways according to their context, number normalization in Korean is crucial to improving the quality of TTS systems. However, the existing model is based on ad hoc rules that are inappropriate for normalizing non-standard numbers. This study proposes a model of number normalization in Korean based on the sequence-to-sequence Transformer model. Moreover, number positional encoding was added to the model to handle long numbers. Overall, the proposed model achieved an F1 score of 98.80% on the normal test dataset and 90.1% on the non-standard test dataset, which were 2.52% and 19% higher, respectively, than the baseline model. In addition, the proposed model demonstrated a 13% improvement on the longer-number test dataset compared to the other deep learning models.

An Open Source Based Assistive Clinical Application Development for Movement Disorder

Jaekyung Ha, Hyojeong Hwangbo, Hyojeong Gwon, Jihyun Jang, Suhan Bae, Daeun Yun, Yeonsu Kim, Hyunjin Yun, Yein Lee, Young Goo Kim, Minkyu Ahn

http://doi.org/10.5626/JOK.2021.48.5.518

Movement disorders, such as Parkinson's disease and essential tremor, are usually assessed using behavioral tests or clinical questionnaires. The assessment typically involves filling in questionnaires or observation by clinicians. Because of the nature of the process, data and records are not well organized and managed; consequently, accurate assessment may be difficult. In this study, we developed an open-source-based application running on a tablet PC. Two clinical questionnaires (the Unified Parkinson’s Disease Rating Scale and the Clinical Rating Scale for Tremor) were implemented, and the application provides efficient management of assessment records. Two frequently used behavioral tests (line and spiral drawing) are also implemented for quick diagnosis. The source code for the application and the user and developer manuals are available through a web repository (GitHub), so anyone can easily download and use it for clinical or research purposes.

Improved Stratified Sampling Using Dart Throwing

Jieun Ko, Sungkil Lee

http://doi.org/10.5626/JOK.2021.48.5.527

This paper presents a stratified sampling technique in which sampling areas are chosen through the dart-throwing technique. Stochastic sampling techniques, which are used in various fields such as image processing and computer graphics, produce high-quality images when samples are distributed uniformly and randomly. The blue-noise pattern removes the low-frequency components that are critical in the aliasing problem, but computing such a pattern is costly. To address this issue, stratified sampling methods have been proposed; however, these methods exhibit low randomness because of their regular structure. This paper proposes a technique to increase randomness by determining the sampling areas in the stratified structure through the dart-throwing technique. Our method randomly selects the areas in which samples are jittered.
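The sketch below shows one possible reading of combining the two ideas: grid cells (strata) are picked by dart throwing, where a randomly thrown cell is rejected if it lies too close to an already accepted cell, and one sample is then jittered inside each accepted cell. The grid layout, the Chebyshev distance test, and all function names are assumptions for illustration; this is not the authors' algorithm.

```python
import random


def dart_throw_cells(grid, count, min_dist, rng, max_tries=10000):
    """Pick up to `count` cells of a grid x grid lattice by dart throwing.

    A thrown cell is accepted only if its Chebyshev distance to every
    accepted cell is at least `min_dist` (in cells). Fewer than `count`
    cells may be returned if the tries run out.
    """
    chosen = []
    tries = 0
    while len(chosen) < count and tries < max_tries:
        tries += 1
        cell = (rng.randrange(grid), rng.randrange(grid))
        if all(max(abs(cell[0] - o[0]), abs(cell[1] - o[1])) >= min_dist
               for o in chosen):
            chosen.append(cell)
    return chosen


def jitter_in_cells(cells, grid, rng):
    """Place one uniformly random sample inside each chosen cell of [0, 1)^2."""
    size = 1.0 / grid
    return [((cx + rng.random()) * size, (cy + rng.random()) * size)
            for cx, cy in cells]


rng = random.Random(0)
cells = dart_throw_cells(grid=8, count=16, min_dist=2, rng=rng)
samples = jitter_in_cells(cells, grid=8, rng=rng)
print(len(samples), samples[:3])
```

Setting `min_dist=1` only forbids picking the same cell twice; larger values spread the chosen cells further apart at the cost of more rejected throws.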

Relation Extraction based on Neural-Symbolic Structure

Jinyoung Oh, Jeong-Won Cha

http://doi.org/10.5626/JOK.2021.48.5.533

Deep learning has continually demonstrated excellent performance in the field of natural language processing. However, enormous amounts of training data and long training times are required to achieve good performance. Herein, we propose a method that exceeds deep learning performance when little training data is available by using a neural-symbolic method for the relation extraction problem. We have designed a structure that uses the inconsistency between the rule results and the deep learning results. In addition, logical rule filtering is proposed to improve the convergence speed, and context is added to improve the performance of the rules. The proposed method showed excellent performance with a small amount of training data, and we confirmed that performance converged quickly.

A Span Matrix-based Answer Candidates Detection Model used 2-Step Learning

Boeun Kim, Youngjin Jang, Harksoo Kim

http://doi.org/10.5626/JOK.2021.48.5.539

Automatic data construction refers to technology that constructs data automatically through algorithms or deep neural networks. Automated construction of question-answer data, the goal of this paper, has mainly been studied through question generation models, which generate questions related to a given paragraph. In previous work, a paragraph and answer candidates were fed into the question generation model to generate related questions. The answer candidates input to the question generation model were detected through a rule-based method or a deep neural network. We judged that answer detection, a subtask of question generation, has a great influence on question generation. Consequently, we propose an answer candidate detection model and a 2-step learning method using a Span Matrix. An experiment was conducted to examine how questions generated from answer candidates extracted by various methods affect the question-answering system. The proposed model extracted more correct answers than the existing model, and noise in the learning process was compensated for by using the entity name dataset. The experiment confirmed that the question-answer data generated from answer candidates extracted by the proposed model contributed the most to the performance of the question-answering system.

Automatic Pancreas Segmentation Based on Cascaded Network Considering Pancreatic Uncertainty in Abdominal CT Images

Hyeon Dham Yoon, Hyeonjin Kim, Helen Hong

http://doi.org/10.5626/JOK.2021.48.5.548

Pancreas segmentation from abdominal CT images is a prerequisite step for understanding the shape of the pancreas in pancreatic cancer detection. In this paper, we propose an automatic pancreas segmentation method based on a deep convolutional neural network (DCNN) that considers information about the uncertain regions generated by the positional and morphological diversity of the pancreas in abdominal CT images. First, intensity and spacing normalizations are performed on the whole abdominal CT images. Second, the pancreas is localized using 2.5D segmentation networks based on U-Net on the axial, coronal, and sagittal planes, whose outputs are combined by majority voting. Third, pancreas segmentation is performed in the localized volume using a 3D U-Net-based segmentation network that takes into account the information about the uncertain areas of the pancreas. The average DSC of pancreas segmentation was 83.50%, which was 10.30%p, 10.44%p, 6.52%p, 1.14%p, and 3.95%p higher than the segmentation methods using 2D U-Net on the axial view, coronal view, and sagittal view, majority voting of the three planes, and 3D U-Net on the localized volume, respectively.
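Two of the building blocks named above, majority voting across three planes and the Dice similarity coefficient (DSC), can be sketched on flattened binary masks as follows. The masks are toy data and the function names are hypothetical; the networks that would produce such masks are not shown.

```python
def majority_vote(mask_a, mask_b, mask_c):
    """Voxel-wise majority of three binary masks (1 if at least two agree on 1)."""
    return [1 if (a + b + c) >= 2 else 0
            for a, b, c in zip(mask_a, mask_b, mask_c)]


def dice_coefficient(pred, truth):
    """DSC = 2 * |pred AND truth| / (|pred| + |truth|) for binary masks."""
    intersection = sum(p & t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 1.0 if total == 0 else 2.0 * intersection / total


axial = [1, 1, 0, 0]
coronal = [1, 0, 1, 0]
sagittal = [1, 1, 1, 0]
voted = majority_vote(axial, coronal, sagittal)
print(voted)                                  # [1, 1, 1, 0]
print(dice_coefficient(voted, [1, 1, 0, 0]))  # 0.8
```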

CRF based Named Entity Recognition Using a Korean Lexical Semantic Network

Seoyeon Park, Cheolyoung Ock

http://doi.org/10.5626/JOK.2021.48.5.556

Named Entity Recognition (NER) is the process of classifying words with unique meanings, which often appear as out-of-vocabulary (OOV) words within a sentence, into predefined entity categories. Recently, many studies have used deep learning to synthesize word embeddings via Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM) networks, or trained language models. However, models using these deep learning networks or language models require high-performance computing power and have low practicality due to their slow speed. For practicality, this paper proposes a Conditional Random Field (CRF) based NER model using a Korean lexical semantic network (UWordMap). By using hypernym, dependency, and case particle information as training features, our model achieved an accuracy of 90.54% and a processing speed of 1,461 sentences/sec.

A Leader’s Final Decision Classification Model Tested on Meeting Records with BERT

JinYeong Bak, Alice Oh

http://doi.org/10.5626/JOK.2021.48.5.568

The ways in which leaders make decisions affect the performance of the group. To understand these decision-making processes, we first formalize the problem as predicting leaders' decisions from discussion with group members. For this purpose, we introduce conversational meeting records from the annals of the Joseon dynasty. Using this dataset, we develop a Conversational Decision Making Model with BERT (CDMM-B). CDMM-B is a hierarchical structure of RNN and BERT which simultaneously uses both words and speakers. CDMM-B outperforms other baselines in predicting leaders' final decisions. We also investigate the importance of speakers and the order of utterances for the task through an ablation study.

Effect of Denoising Autoencoder in the view of Item Popularity Bias

Jinhong Kim, Jae-woong Lee, Jongwuk Lee

http://doi.org/10.5626/JOK.2021.48.5.575

The denoising autoencoder (DAE) is commonly used in recent recommendation systems. It is a type of autoencoder that is trained by adding noise to the input, and it has shown improved performance compared to the plain autoencoder. In this paper, we analyze the effect of noise in terms of item popularity to interpret the training of the DAE. For the analysis, we design experiments in the following two ways. First, we observe the changes in the L2-norm of the learned item vectors when noise is added to the autoencoder. Second, by adding noise only to items presampled by popularity, we analyze whether the improved performance of the DAE is related to item popularity. The results of the experiments showed that the variance of the item vector norm caused by popularity was reduced by noise, and that the accuracy increased when noise was added to the popular items.
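The second experiment, adding noise only to items selected by popularity, can be illustrated with the input-corruption step alone. The sketch below assumes masking (dropout-style) noise on a binary user-item interaction vector and a top-k definition of "popular"; both choices and all names are assumptions for illustration, and the autoencoder itself is omitted.

```python
import random


def item_popularity(interactions, num_items):
    """Count how many users interacted with each item (binary matrix rows)."""
    counts = [0] * num_items
    for row in interactions:
        for item, value in enumerate(row):
            counts[item] += value
    return counts


def top_k_popular(counts, k):
    """Indices of the k most popular items (assumed selection rule)."""
    order = sorted(range(len(counts)), key=lambda i: -counts[i])
    return set(order[:k])


def corrupt_popular(row, popular_items, drop_prob, rng):
    """Masking noise: zero out observed entries of popular items with drop_prob."""
    return [0 if (i in popular_items and v and rng.random() < drop_prob) else v
            for i, v in enumerate(row)]


data = [[1, 1, 0], [1, 0, 0], [1, 1, 1]]
popular = top_k_popular(item_popularity(data, 3), k=1)
rng = random.Random(1)
noisy_inputs = [corrupt_popular(row, popular, 0.5, rng) for row in data]
print(popular, noisy_inputs)
```

The corrupted rows would serve as encoder inputs while the original rows remain the reconstruction targets, as in a standard DAE.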

A User-Centric Conversational Service Mashup Model and Engine

Sanghoon Kim, In-Young Ko

http://doi.org/10.5626/JOK.2021.48.5.584

In Internet of Things (IoT) environments, users not only consume services that are provided by IoT devices, but also create their own service mashup applications. Several visual-based approaches have been proposed to support users in creating IoT service mashups. However, as it is not easy for users to understand the visually represented execution flow of a service mashup, they often find it difficult to create one. This study proposes a conversational service mashup model and an engine, which end-users without programming experience can use to create IoT service mashups through natural language. The conversational service mashup model comprises four types of keywords to identify user commands. The service mashup engine comprises an interaction manager, a semantic matching module, and a service mashup module. To evaluate the proposed model, we conduct a case study based on a smart home IoT environment scenario. The study results confirm that end-users can easily use the conversational service mashup model and the engine to create the required IoT service mashups.


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr