Performance Analysis of DRAM Cache by Comparing Intel Optane DC Persistent Memory Operating Modes

Yaebin Moon, Deok-Jae Oh, Jung Ho Ahn

http://doi.org/10.5626/JOK.2020.47.10.893

Non-Volatile Memory (NVM) technology is a promising alternative to DRAM technology, especially given the challenge of scaling DRAM. Recently, Intel released Optane DC Persistent Memory (DCPMM), an NVM product. The latest Intel servers support two operating modes to exploit DCPMM: 1) Memory mode uses DCPMM as main memory and DRAM as its cache, and 2) App Direct mode uses DCPMM and DRAM as independent main memory regions, which requires software modification for efficient utilization. In this paper, we compare the performance of these two operating modes. In Memory mode, if the working set of an application is smaller than the DRAM cache or the application exhibits good data locality, the performance reduction caused by accessing the relatively slow DCPMM can be mostly amortized. However, as the working set grows larger than the DRAM size, performance decreases because more accesses are served by DCPMM and incur an additional DRAM cache miss penalty (~70 ns). The DRAM cache therefore has a performance limitation due to the DRAM cache miss penalty, and App Direct mode may well perform better in an environment where the working set is large and data locality is limited.
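The trade-off described in this abstract can be illustrated with a simple expected-latency model. The sketch below is not the authors' methodology: it assumes a two-level model in which a Memory mode miss costs the DCPMM latency plus the ~70 ns penalty quoted above. The function names are hypothetical, and the DRAM and DCPMM latencies used in the usage note are placeholder values, not measurements from the paper.

```python
def memory_mode_latency_ns(hit_rate, dram_ns, dcpmm_ns, miss_penalty_ns=70.0):
    """Expected latency of one access in Memory mode (DRAM acts as a cache).

    Assumption: a hit costs dram_ns; a miss costs dcpmm_ns plus the
    additional DRAM cache miss penalty (~70 ns in the abstract).
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    miss_cost = dcpmm_ns + miss_penalty_ns
    return hit_rate * dram_ns + (1.0 - hit_rate) * miss_cost


def app_direct_latency_ns(dram_fraction, dram_ns, dcpmm_ns):
    """Expected latency in App Direct mode, where software places data.

    dram_fraction is the share of accesses that the application manages
    to direct to DRAM; the rest go to DCPMM with no cache miss penalty.
    """
    if not 0.0 <= dram_fraction <= 1.0:
        raise ValueError("dram_fraction must be between 0 and 1")
    return dram_fraction * dram_ns + (1.0 - dram_fraction) * dcpmm_ns
```

With placeholder latencies such as `dram_ns=80` and `dcpmm_ns=300`, comparing `memory_mode_latency_ns(h, 80, 300)` with `app_direct_latency_ns(f, 80, 300)` for a low hit rate `h` shows how the per-miss penalty can make Memory mode slower under this model when locality is poor.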

Leveraging the Physical Properties of Real Objects to Manage Digital Photography in Augmented Reality

Han Joo Chae, Youli Chang, Minji Kim, Gwanmo Park, Jinwook Seo

http://doi.org/10.5626/JOK.2020.47.10.900

We introduce the concept of physical-object-oriented interaction, which provides a natural user experience by leveraging the physical properties of real objects, and present ARphy, a tangible interface that enables people to manage and interact with digital photographs using real physical objects in augmented reality (AR). Unlike traditional mobile photo applications, ARphy utilizes the physical attributes and affordances of real objects for more intuitive use. For example, people can hang travel photos on a souvenir, keep meaningful photos inside a box, or delete photos by putting them into a trash can. We designed the architecture of ARphy for use on various types of AR devices (e.g., mobile devices and headsets). Our qualitative user evaluation demonstrated that ARphy was intuitive, immersive, fun to use, and well-suited for managing digital photos in an AR environment.

A Study on the Prediction Accuracy of Machine Learning using De-Identified Personal Information

Hongju Jung, Nayoung Lee, Soo-jin Seol, Kyeong-Seok Han

http://doi.org/10.5626/JOK.2020.47.10.906

The de-identification of personal information is gaining attention due to the revision of the Personal Information Protection Act. In addition, the use of artificial intelligence and machine learning is becoming a driving force in the Fourth Industrial Revolution. In this paper, we experimentally verify the predictive accuracy of a machine learning decision tree algorithm using personal information de-identified by applying k-anonymity (k=2). The prediction results for the input data are compared to determine the limitations of using de-identified personal information in machine learning. In light of the amendment of the Personal Information Protection Act, we propose that when using de-identified personal information in machine learning, the level of personal information de-identification and the analysis algorithm should both be considered.
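As an illustration of the k-anonymity condition mentioned above (k=2), the following minimal sketch checks whether every combination of quasi-identifier values occurs at least k times, and shows one common generalization step (bucketing ages). The record layout, the field names, and the `generalize_age` helper are hypothetical; the abstract does not describe the dataset or the generalization rules the authors used.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k=2):
    """Return True if each quasi-identifier combination appears >= k times."""
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return all(count >= k for count in groups.values())


def generalize_age(record, width=10):
    """Replace an exact age with a range such as '20-29' (hypothetical rule)."""
    low = (record["age"] // width) * width
    generalized = dict(record)  # copy so the input record is left unchanged
    generalized["age"] = f"{low}-{low + width - 1}"
    return generalized
```

Generalizing values this way makes records indistinguishable within a group, which is also why a classifier trained on the de-identified data sees coarser features than one trained on the original data.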

Best Practices on Software Development and Management Process for the Republic of Korea Army Information System

Bo Kyung Park, Chae Yun Seo, Ki Du Kim, Jong Hoon Lee, R. Young Chul Kim

http://doi.org/10.5626/JOK.2020.47.10.911

The Korea Army HQ Information System Management Group urgently needs to construct a software development management process in order to develop high-quality software. To this end, the group previously attempted to partially construct the Army information software development system based on commercial software and operating systems. The current issue, however, is the complete construction of an open source-based software development and management system. To solve this problem, we propose a means to enhance the previous Army Information System with an open OS-based software development management process for quality improvement of Army software. This approach offers a way to build an independent, scalable, and flexible software development management process entirely from open source components based on CentOS. The process can be easily adapted with mechanisms such as applying the redefined quality metrics, automatically generating development documentation, and identifying code complexity across the entire software lifecycle. The process may also facilitate the development of high-quality software. In the future, we will need to extend the process to cover software process training, design, development, maintenance, and establishment at each phase of the lifecycle.

A Study on the Method for Automatically Constructing a Domain Specific Sentiment Lexicon Based Lexical Relation and Contextual Information

Sangmin Park, Byung-Won On

http://doi.org/10.5626/JOK.2020.47.10.926

A sentiment lexicon is a set of sentiment words, each of which has a sentiment polarity, and is used as a basic resource for sentiment analysis. However, the meaning of some words can differ across domains, or the original meaning can even disappear. As such, many sentiment words are likely to depend on a specific domain. For example, the verb phrase “slept well” usually has a positive meaning, while it has a negative meaning in the movie domain. Thus, given a particular domain such as hotels, the sentiment lexicon should be constructed so that the domain-dependent words reflect their meaning in that domain. Using a domain-dependent sentiment lexicon will yield more accurate sentiment analysis results than using existing sentiment lexicons that do not consider domain-dependent words. Various studies have presented ways to build domain-dependent sentiment lexicons, but they have many limitations, including the need for human intervention and the use of local information rather than contextual information. In this paper, we propose a novel method of automatically constructing a domain-dependent sentiment lexicon based on global and contextual information and an existing sentiment lexicon (i.e., the KNU sentiment lexicon, GloVe vectors, and conjunction relations).

Korean End-to-end Neural Coreference Resolution with BERT

Kihun Kim, Cheonum Park, Changki Lee, Hyunki Kim

http://doi.org/10.5626/JOK.2020.47.10.942

Coreference resolution is a natural language processing task that identifies the mentions subject to coreference resolution in a given document and clusters the mentions that refer to the same entity. For Korean coreference resolution, two approaches have been used: an end-to-end model that simultaneously performs mention detection and mention clustering, and a pointer network based on the encoder-decoder model. The BERT model released by Google has been applied to many natural language processing tasks and has demonstrated substantial performance improvements. In this paper, we propose a Korean end-to-end neural coreference resolution model with BERT. This model uses KorBERT, pre-trained on Korean data, and applies dependency parsing results and named entity recognition features to reflect the structural and semantic characteristics of the Korean language. Experimental results show that the model achieves CoNLL F1 scores of 71.00% (DEV) and 69.01% (TEST) on the ETRI Q&A domain data set, which is higher than in previous studies.

Reducing the Learning Time of Code Change Recommendation System Using Recurrent Neural Network

Byeong-il Bae, Sungwon Kang, Seonah Lee

http://doi.org/10.5626/JOK.2020.47.10.948

Since code change recommendation systems select and recommend files that need modification, they help developers save time spent on software system evolution. However, these recommendation systems generally spend a significant amount of time learning from accumulated data and relearning whenever new data are accumulated. This study proposes a method to reduce the learning time of a Code change Recommendation System using a Recurrent Neural Network (RNN-CRS) by avoiding learning that is unlikely to contribute new knowledge. For the five products used in the experimental evaluation, our proposed method reduced the time to relearn data and re-generate a learning model by as much as 49.08%-68.15%, and by 10.66% in the least effective case, compared to the existing method.

Incorrect Triple Detection Using Knowledge Graph Embedding and Adaptive Clustering

Won-Chul Shin, Jea-Seung Roh, Young-Tack Park

http://doi.org/10.5626/JOK.2020.47.10.958

Recently, as the amount of information has increased with the development of the Internet, research using large-scale knowledge graphs has been actively conducted. Additionally, as knowledge graphs are used for various research and services, there is a need to secure high-quality knowledge graphs. However, there is a lack of research on detecting errors within knowledge graphs in order to obtain high-quality ones. Previous studies using embedding and clustering for error triple detection showed good performance. However, in the cluster optimization process, applying the same threshold to all clusters failed to account for the characteristics of each cluster. In this paper, to resolve this problem, we propose an adaptive clustering model for error triple detection that uses knowledge graph embedding and conducts clustering by finding and applying the optimum threshold for each cluster. To evaluate the proposed method, we conducted comparative experiments against existing error triple detection studies on three datasets, DBpedia, Freebase, and WiseKB, and confirmed an average improvement of 5.3% in F1-score.

Noise Injection for Natural Language Sentence Generation from Knowledge Base

Sunggoo Kwon, Seyoung Park

http://doi.org/10.5626/JOK.2020.47.10.965

Generating a natural language sentence from a knowledge base is the task of taking a triple from the knowledge base as input and generating a natural language sentence that expresses the triple information, i.e., the relationship between the entities. To solve the task of generating sentences from triples using a deep neural network, training data consisting of many pairs of triples and natural language sentences are required. However, training such a model is difficult because no Korean training data has been released yet. To address this lack of training data, this paper proposes an unsupervised learning method that extracts keywords from Korean Wikipedia sentence data and generates training data using a noise injection technique. To evaluate the proposed method, we used a gold-standard dataset consisting of triple and sentence pairs. The proposed noise injection method showed superior performance over normal unsupervised learning on various evaluation metrics, including automatic and human evaluations.

A Study on Development of Technology to Improve Imbalanced Data Problems in Numerical Dataset Using Tomek Links Method combined with Balancing GAN

Hyunsik Na, Sohee Park, Daeseon Choi

http://doi.org/10.5626/JOK.2020.47.10.974

Machine learning is useful due to its good performance and its application in various fields such as data classification, voice recognition, and predictive models. However, class imbalance in the training dataset degrades the classification performance for the minority class. In this paper, we propose a new data augmentation method that combines the Balancing GAN and the Tomek Links method to solve the imbalanced data problem and find a clear decision boundary. To verify the proposed method, we evaluated the performance of several classification models using five datasets. Moreover, the performance was compared with data sampling and GAN-based data augmentation techniques. The results showed that the classification performance was improved or maintained by 0.05~0.195 in 17 of the total 25 performance evaluations. The proposed method showed potential as a new method for solving the imbalanced data problem.
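For readers unfamiliar with Tomek links, the sketch below shows the standard definition only: two samples of different classes form a Tomek link when each is the other's nearest neighbor. It is a brute-force illustration with hypothetical function names, not the authors' implementation, and it does not cover the Balancing GAN part or how the paper combines the two techniques.

```python
import math


def nearest_neighbor(points, i):
    """Index of the point closest to points[i] (Euclidean), excluding i itself."""
    best_j, best_d = None, math.inf
    for j, p in enumerate(points):
        if j == i:
            continue
        d = math.dist(points[i], p)
        if d < best_d:
            best_j, best_d = j, d
    return best_j


def tomek_links(points, labels):
    """Return index pairs (i, j), i < j, that form Tomek links.

    A pair qualifies when the two points are mutual nearest neighbors
    and carry different class labels.
    """
    nn = [nearest_neighbor(points, i) for i in range(len(points))]
    links = []
    for i, j in enumerate(nn):
        if j is not None and i < j and nn[j] == i and labels[i] != labels[j]:
            links.append((i, j))
    return links
```

Removing one or both members of each link (typically the majority-class member) is the usual way Tomek links are used to clean up the region around a decision boundary.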

Visual Commonsense Reasoning with Vision-Language Co-embedding and Knowledge Graph Embedding

Jaeyun Lee, Incheol Kim

http://doi.org/10.5626/JOK.2020.47.10.985

In this paper, we propose a novel model for Visual Commonsense Reasoning (VCR). The proposed model co-embeds multi-modal input data using a pre-trained vision-language model to effectively cope with the problem of visual grounding, which requires mutual alignment between an image, a natural language question, and the corresponding answer list. In addition, the proposed model extracts the common conceptual knowledge necessary for Visual Commonsense Reasoning from ConceptNet, an open knowledge base, and then embeds it using a Graph Convolutional Neural Network (GCN). We describe the design details of the proposed model, VLKG_VCR, and verify its performance through various experiments using an enhanced VCR benchmark data set.

An Efficient Similarity Measure for Purchase Histories Considering Hierarchical Classification of Products

Yu-Jeong Yang, Ki Yong Lee

http://doi.org/10.5626/JOK.2020.47.10.999

In an online shopping mall or offline store, the products purchased by each customer over time form the purchase history of that customer. Also, in most cases, products have a hierarchical classification that represents their subcategories. In this paper, we propose a new similarity measure for purchase histories that considers not only the purchase order of products but also the hierarchical classification of products. The proposed method extends dynamic time warping similarity, an existing representative similarity measure for sequences, to reflect the hierarchical classification of products. Unlike the existing method, where the similarity between elements of two sequences is only 0 or 1 depending on whether the two elements are the same, the proposed method can assign any real number between 0 and 1 as the similarity between two elements based on their hierarchical classification. We also propose an efficient method for computing the proposed similarity measure. The proposed computation method uses a segment tree to efficiently evaluate the similarity between two products in a hierarchical classification tree. Through various experiments on real data, we show that the proposed method can measure the similarity between purchase histories of products with hierarchical classification both effectively and efficiently.
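The idea of replacing a 0-or-1 element match with a graded, hierarchy-aware similarity inside dynamic time warping can be sketched as follows. This is an illustration under stated assumptions, not the authors' measure: products are represented as hypothetical category paths (tuples from root to leaf), the element similarity is a Wu-Palmer-style ratio based on the shared path prefix, and the efficient segment-tree computation described in the abstract is not reproduced here.

```python
def path_similarity(a, b):
    """Similarity in [0, 1] between two non-empty category paths.

    Assumed formula: 2 * (length of shared prefix) / (len(a) + len(b)),
    so identical products score 1 and products in unrelated top-level
    categories score 0.
    """
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return 2 * common / (len(a) + len(b))


def dtw_distance(seq_a, seq_b, sim=path_similarity):
    """Classic O(n*m) dynamic time warping over two purchase sequences.

    The local cost of aligning two products is 1 - sim(p, q), so more
    closely related products are cheaper to align.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = 1.0 - sim(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in seq_a only
                cost[i][j - 1],      # advance in seq_b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]
```

Under this sketch, a customer who bought an apple and a customer who bought a pear are closer than a customer who bought an apple and one who bought a television, because the first pair shares the same parent categories.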

Improvement in Tor-based Dark Web Crawling Performance by Eliminating Web Browser Rendering and Scripting Tasks

Hyunsu Mun, Soohyun Kim, Youngseok Lee

http://doi.org/10.5626/JOK.2020.47.10.1008

The dark web, represented by Tor, has become a place where various illegal services, content, and transactions, such as exchanges of drugs, child pornography, weapons, and contracts, take place because of the anonymity guaranteed by the protocol. A Tor-based dark web service requires at least three tunneling nodes, which makes Tor-based services 2.2 times slower than the general web. This slow speed makes it difficult to monitor illegal services that open irregularly. Therefore, this paper proposes a method for improving the speed of collecting Tor-based dark web data by removing rendering and scripting tasks using the Tor Socks5 proxy server. The performance of the existing and proposed crawlers was tested on 651 dark web addresses. By removing rendering and scripting, the collection performance was improved by up to 10.04 times.

Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr