Journal of KIISE

Search : [ keyword: MAC ] (115)

English-Korean Neural Machine Translation using MASS with Relative Position Representation

Neural machine translation has mainly been studied with sequence-to-sequence models trained by supervised learning. However, because supervised learning performs poorly when data is insufficient, transfer learning that fine-tunes a model pre-trained on a large amount of monolingual data, such as BERT and MASS, has recently become a major research direction in natural language processing. In this paper, MASS, a pre-training method for language generation, is applied to English-Korean machine translation. In the experiments, the English-Korean machine translation model using MASS outperformed the existing models, and applying the relative position representation method to MASS improved translation performance further.
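As a rough illustration of what a relative position representation works from, the sketch below builds a matrix of clipped relative distances between token positions. It assumes the commonly used clipping scheme, in which distances beyond a maximum are treated alike; the abstract does not specify the exact variant, and the function name and parameters are hypothetical.

```python
def relative_position_ids(length, max_distance):
    """Return a length x length matrix of clipped relative distances.

    Entry [i][j] is clip(j - i, -max_distance, max_distance) shifted by
    max_distance, so every id lies in the range 0..2*max_distance and can
    index a table of learned position embeddings (assumed, not from the paper).
    """
    return [
        [max(-max_distance, min(max_distance, j - i)) + max_distance
         for j in range(length)]
        for i in range(length)
    ]

# Four tokens, distances clipped to the range -2..2.
for row in relative_position_ids(4, 2):
    print(row)
```

Because the ids depend only on the offset between two positions, the same embedding is shared by every pair of tokens that are the same distance apart.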

Reducing the Learning Time of Code Change Recommendation System Using Recurrent Neural Network

Byeong-il Bae, Sungwon Kang, Seonah Lee

http://doi.org/10.5626/JOK.2020.47.10.948

Because code change recommendation systems select and recommend files that need modification, they help developers save time during software system evolution. However, these recommendation systems generally spend a significant amount of time learning from accumulated data and relearning whenever new data accumulate. This study proposes a method that reduces the learning time of a Code change Recommendation System using Recurrent Neural Network (RNN-CRS) by avoiding learning that is unlikely to contribute new knowledge. For the five products used in the experimental evaluation, the proposed method reduced the time to relearn data and regenerate a learning model by as much as 49.08%-68.15%, and by 10.66% in the least effective case, compared to the existing method.

A Study on the Prediction Accuracy of Machine Learning using De-Identified Personal Information

Hongju Jung, Nayoung Lee, Soo-jin Seol, Kyeong-Seok Han

http://doi.org/10.5626/JOK.2020.47.10.906

The de-identification of personal information has become an emerging issue following the revision of the Personal Information Protection Act. In addition, the use of artificial intelligence and machine learning is becoming a driving force of the Fourth Industrial Revolution. In this paper, we experimentally verify the predictive accuracy of a machine learning decision tree algorithm on personal information de-identified by applying k-anonymity (k=2). The prediction results for the input data are compared to determine the limitations of using de-identified personal information in machine learning. In light of the amendment of the Personal Information Protection Act, we propose that the level of de-identification and the analysis algorithm should both be considered when de-identified personal information is used in machine learning.
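To make the k-anonymity condition mentioned above concrete, the following minimal sketch checks whether every combination of quasi-identifier values occurs at least k times in a table. The column names, the sample records, and the helper name `is_k_anonymous` are hypothetical and are not taken from the paper.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k=2):
    """Return True if every quasi-identifier combination appears at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

# Hypothetical records that were already generalized
# (age ranges instead of ages, truncated postal codes).
records = [
    {"age": "20-29", "zip": "061**", "label": 1},
    {"age": "20-29", "zip": "061**", "label": 0},
    {"age": "30-39", "zip": "062**", "label": 1},
    {"age": "30-39", "zip": "062**", "label": 1},
]

print(is_k_anonymous(records, ["age", "zip"], k=2))
print(is_k_anonymous(records, ["age", "zip"], k=3))
```

With these records, each group contains exactly two rows, so the table satisfies the condition for k=2 but not for k=3.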

The Method Using Reduced Classification Models for Distributed Processing of CNN Models in Multiple Edge Devices

Junyoung Kim, Jongho Jeon, Minkwan Kee, Gi-Ho Park

http://doi.org/10.5626/JOK.2020.47.8.787

Recently, demand has been increasing for edge computing, which processes data at the edge of the network where the data is collected, because of problems such as the network load caused by transferring large amounts of data to a cloud server. However, it is difficult for edge devices to run the deep learning applications used on cloud servers because most edge devices have limited performance. To overcome these problems, this paper proposes a distributed processing method that uses reduced classification models to jointly perform inference on multiple edge devices. The reduced classification models have compressed model weights, and each performs inference for only a part of the total classification labels. The experimental results confirmed that the accuracy of the proposed distributed processing method is similar to that of the original model, even though the reduced classification models have far fewer parameters than the original model.
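The idea of splitting the label set across devices can be sketched as follows. This is only an illustration under a strong assumption: that each device returns scores for its own subset of labels and that those scores are directly comparable. The abstract does not describe how the paper combines the partial results, and the function and label names are hypothetical.

```python
def merge_partial_predictions(partial_scores):
    """Combine per-device score dictionaries and return the best label overall.

    partial_scores: list of {label: score} dicts, one per edge device, where
    each device covers a disjoint subset of the labels (assumed).
    """
    merged = {}
    for scores in partial_scores:
        merged.update(scores)
    return max(merged, key=merged.get)

# Hypothetical outputs from two devices, each covering two of four labels.
device_a = {"cat": 0.10, "dog": 0.65}
device_b = {"car": 0.20, "ship": 0.05}
print(merge_partial_predictions([device_a, device_b]))
```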

A Multi-Omics Data Integration Method and Parametric Analysis on Large-Scale Colon Cancer Data

Inuk Jung

http://doi.org/10.5626/JOK.2020.47.8.779

Research is being conducted to reveal the mechanisms of diseases and organisms through the analysis of genomic data, including gene expression information. At the genomic level, the principles of living organisms and diseases are very complex, because many genes are involved and the regulatory relationships among them are sophisticated. Additionally, various omics participate in the regulation of gene expression. The volume of genome data generated each year is increasing rapidly because of the falling cost of next-generation sequencing, and new technologies that measure multi-modal omics from a single sample are in active development. In this study, we conducted a parametric analysis of colon cancer multi-omics data to observe the effect of the number of samples and omics objects on the classification of its four subtypes. Three multi-omics analysis methods were compared: two well-known multi-omics integration methods and our in-house method. As a result, we found that at least 100 samples and fewer than 5,000 omics objects were required to achieve satisfactory subtype classification performance.

Passage Re-ranking Method Based on Sentence Similarity Through Multitask Learning

Youngjin Jang, Hyeon-gu Lee, Jihyun Wang, Chunghee Lee, Harksoo Kim

http://doi.org/10.5626/JOK.2020.47.4.416

A machine reading comprehension (MRC) system is a question answering system in which a computer understands a given passage and responds to questions. Recently, with the development of deep neural networks, research on machine reading systems has been actively conducted, including work on open-domain machine reading systems that identify the correct answer from the results of an information retrieval (IR) model rather than from a given passage. However, if the IR model fails to retrieve a passage containing the correct answer, the MRC system cannot answer the question. That is, the performance of an open-domain MRC system depends on the performance of the IR model, so a high-performance IR model is a prerequisite for a high-performance open-domain MRC system. Previous work on IR models has focused on query expansion and re-ranking. In this paper, we propose a re-ranking method using deep neural networks. The proposed model re-ranks the retrieval results (passages) through multi-task learning-based sentence similarity and, in experiments on 58,980 pairs of MRC data, improves performance by approximately 8% compared to the existing IR model.
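The re-ranking step described above can be sketched in a few lines. The paper learns sentence similarity with a multi-task neural model; the sketch below swaps in a plain bag-of-words cosine similarity as a stand-in, so it only illustrates the control flow of scoring each passage by its most similar sentence. All names and the sample data are hypothetical.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Bag-of-words cosine similarity (a stand-in for a learned similarity model)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

def rerank(question, passages):
    """Sort passages by their best sentence-level similarity to the question."""
    def score(passage):
        sentences = [s for s in passage.split(".") if s.strip()]
        return max((cosine_similarity(question, s) for s in sentences), default=0.0)
    return sorted(passages, key=score, reverse=True)

question = "where is the eiffel tower"
passages = [
    "The Louvre is a museum. It is in Paris.",
    "The Eiffel Tower is in Paris. It was built in 1889.",
]
print(rerank(question, passages)[0])
```

Replacing `cosine_similarity` with a trained similarity model would leave the rest of the ranking logic unchanged.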

A Reference Architecture for Machine Learning-Based Autonomous Systems

MyeongHo Song, SooDong Kim

http://doi.org/10.5626/JOK.2020.47.4.368

Autonomous computing is one of the essential factors for realizing the fourth industrial revolution and a future technology that gives automatic systems the capabilities of autonomous recognition, autonomous judgement, autonomous planning, and autonomous management. With the advent of various sensors and IoT devices, a rich set of context data can be acquired from the environment, and autonomous system technologies with a human-machine interface (HMI) enable the realization of an ecosystem in which a system can itself maintain its best quality by using the acquired context data. However, because the functional and non-functional requirements for realizing autonomous systems are highly complicated, such systems are difficult to develop and development productivity is low. In this paper, we present a reference architecture that can be commonly applied to autonomous systems. The proposed reference architecture includes the architecture design, core components, main algorithm, and other elements. It forms the structural basis of the target system, and reusing its core structure can guarantee overall quality and improve development efficiency. Additionally, we apply the reference architecture to two autonomous systems to verify its applicability and practicability.

Analysis of the Semantic Answer Types to Understand the Limitations of MRQA Models

Doyeon Lim, Haritz Puerto San Roman, Sung-Hyon Myaeng

http://doi.org/10.5626/JOK.2020.47.3.298

Recently, the performance of Machine Reading Question Answering (MRQA) models has surpassed humans on datasets such as SQuAD. For further advances in MRQA techniques, new datasets are being introduced. However, they are rarely based on a deep understanding of the QA capabilities of the existing models tested on the previous datasets. In this study, we analyze the SQuAD dataset quantitatively and qualitatively to demonstrate how the MRQA models answer the questions. It turns out that the current MRQA models rely heavily on the use of wh-words and Lexical Answer Types (LAT) in the questions instead of using the meanings of the entire questions and the evidence documents. Based on this analysis, we present the directions for new datasets so that they can facilitate the advancement of current QA techniques centered around the MRQA models.

Space Efficient Top-k Query Encoding Based on Data Distribution

Wooyoung Park, Srinivasa Rao Satti

http://doi.org/10.5626/JOK.2020.47.3.235

We consider an encoding that supports range top-k queries on a two-dimensional array without accessing the original array. We propose a more space-efficient encoding method for top-k queries with better average-case query time. Our experiments also show that our encoding is more space-efficient than earlier ones. In addition, we propose applying learning-based data structures to succinct data structures.
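For readers unfamiliar with the query itself, the sketch below shows what a range top-k query on a two-dimensional array returns. It is a naive baseline that scans the original array directly, which is exactly what the proposed encoding avoids; it is not the paper's method, and the function name and sample grid are hypothetical.

```python
import heapq

def range_top_k(array, r1, r2, c1, c2, k):
    """Positions of the k largest entries in rows r1..r2 and columns c1..c2 (inclusive).

    Naive baseline: reads every cell in the query rectangle.
    """
    cells = (
        (array[i][j], (i, j))
        for i in range(r1, r2 + 1)
        for j in range(c1, c2 + 1)
    )
    return [pos for _, pos in heapq.nlargest(k, cells)]

grid = [
    [5, 1, 9],
    [7, 3, 8],
    [2, 6, 4],
]
# Positions of the two largest values in the top two rows.
print(range_top_k(grid, 0, 1, 0, 2, 2))
```

An encoding in the sense of the abstract would answer the same query, returning positions rather than values, from a compact structure built once from the array.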

An Autonomous IoT Programming Paradigm Supporting Neuromorphic Models and Machine Learning Models

Sanglok Yoo, Keonmyung Lee, Youngsun Yun, Jiman Hong

http://doi.org/10.5626/JOK.2020.47.3.310

The demands on and expectations of IoT (Internet of Things) application services are increasing with the development of sensor technology and high-speed communication infrastructure. When many networked sensors are operating, transmitting all of the sensor data to a server for processing is inefficient in terms of communication bandwidth and storage space. Meanwhile, with the recent development of artificial intelligence technology, the demand for intelligent processing in the IoT is increasing. This paper proposes a programming paradigm that applies neuromorphic models and machine learning models on IoT clients, and machine learning models and knowledge processing models on IoT servers. The proposed programming paradigm is expected to be valuable for the intelligent IoT as well as for autonomous IoT environments, in that various AI modules can be applied to both IoT client and server programs.

ISSN : 2383-630X(Print)
ISSN : 2383-6296(Electronic)
KCI Accredited Journal

Editorial Office

Tel. +82-2-588-9240
Fax. +82-2-521-1352
E-mail. chwoo@kiise.or.kr
