ESP: Improving Performance and Lifetime of High-Capacity 3D Flash Storage Using an Erase-Free Subpage Programming Technique

Myungsuk Kim

http://doi.org/10.5626/JOK.2023.50.1.1

Recent high-capacity 3D NAND flash devices have large page sizes. Although large pages are useful in increasing flash capacity, they can degrade both the performance and lifetime of flash storage systems when small writes are dominant. We propose a new NAND programming scheme, called erase-free sub-page programming (ESP), which allows the same page to be programmed multiple times for small writes without an intervening erase operation. By avoiding internal fragmentation, the ESP scheme reduces the overhead of garbage collection for large-page NAND storage. Based on the proposed ESP scheme with an adaptive retention management technique, we implemented an ESP-aware FTL (subFTL) and performed comprehensive evaluations using various benchmarks and workloads. The experimental results showed that an ESP-aware FTL could improve the IOPS and lifetime by up to 74% and 177%, respectively.

Named Entity Tagged Corpus Augmentation Using Automatic Editing

Jae-kyun Kim, Jae-Hoon Kim

http://doi.org/10.5626/JOK.2023.50.1.11

A corpus is an essential resource for machine learning and deep learning in the field of natural language processing. For Korean, well-refined named entity corpora are scarce compared with those available in countries with more advanced research, such as the United States, Japan, and China. Most projects for building a named entity corpus proceed manually and/or semi-automatically and are thus costly and labor-intensive. In this paper, we propose a novel method for automatically augmenting a small named entity corpus. The proposed method augments the corpus through automatic editing operations such as substitution, insertion, and deletion. We use probabilistic sampling rather than simple editing to make the augmented corpus natural and diverse. Through experiments, we show that the performance of Korean named entity recognition can be improved using the augmented corpus, which suggests that the proposed method is usable in practice.
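
The editing operations named in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' method: the BIO tagging scheme, the frequency-weighted `entity_lexicon`, the `filler_words` list, and the probabilities `p_sub`, `p_ins`, and `p_del` are all hypothetical.

```python
import random


def augment(tokens, tags, entity_lexicon, filler_words, rng,
            p_sub=0.5, p_ins=0.1, p_del=0.1):
    """Return one augmented (tokens, tags) copy of a BIO-tagged sentence.

    entity_lexicon maps an entity type to {surface form: weight}; it is a
    hypothetical resource, not something described in the abstract.
    """
    out_tokens, out_tags = [], []
    i = 0
    while i < len(tokens):
        tag = tags[i]
        if tag.startswith("B-"):
            # Collect the whole entity span (B- followed by matching I- tags).
            etype = tag[2:]
            j = i + 1
            while j < len(tokens) and tags[j] == "I-" + etype:
                j += 1
            span = tokens[i:j]
            candidates = entity_lexicon.get(etype)
            if candidates and rng.random() < p_sub:
                # Substitution: sample a same-type entity, weighted by frequency.
                names = list(candidates)
                weights = [candidates[name] for name in names]
                span = rng.choices(names, weights=weights, k=1)[0].split()
            out_tokens.extend(span)
            out_tags.extend(["B-" + etype] + ["I-" + etype] * (len(span) - 1))
            i = j
            continue
        if tag == "O":
            # Deletion: drop a non-entity token with probability p_del.
            if rng.random() < p_del:
                i += 1
                continue
            # Insertion: add a filler word before the token with probability p_ins.
            if filler_words and rng.random() < p_ins:
                out_tokens.append(rng.choice(filler_words))
                out_tags.append("O")
        out_tokens.append(tokens[i])
        out_tags.append(tag)
        i += 1
    return out_tokens, out_tags
```

Entity spans are only substituted, never deleted, so every augmented sentence keeps a valid tag sequence; this is a design choice of the sketch, not a claim about the paper.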

Style Transfer for Chat Language using Unsupervised Machine Translation

Youngjun Jung, Changki Lee, Jeongin Hwang, Hyungjong Noh

http://doi.org/10.5626/JOK.2023.50.1.19

Style transfer is the task of generating text in a target style while maintaining the content of a given text written in a source style. In general, it is assumed that the content is invariant and the style is variable when the style of a text is transferred. However, existing style transfer models are not trained well on chat language. In this paper, we propose a method of transferring chat language into written language using a style transfer model with unsupervised machine translation. This study shows that the transferred results can be used to construct a word transfer dictionary between styles, which can in turn be used for style transfer. Additionally, it shows that the transferred results can be improved by filtering the transferred result pairs so that only well-transferred results are kept, and by then training the style transfer model on the filtered results with supervised learning.

Multi-task Learning Approach Based on Pre-trained Language Models Using Temporal Relations

Chae-Gyun Lim, Kyo-Joong Oh, Ho-Jin Choi

http://doi.org/10.5626/JOK.2023.50.1.25

In research on natural language understanding (NLU), various multi-task learning techniques are being studied with the aim of producing a single model that performs multiple tasks with good general performance. In addition, documents written in natural language typically contain time-related information, and accurately recognizing such information is essential to understanding the overall content and context of a document. In this paper, we propose a multi-task learning technique that incorporates a temporal relation extraction task into the learning process of NLU tasks to use the temporal contextual information of Korean input sentences. To reflect the characteristics of multi-task learning, a new task for extracting temporal relations is designed, and the model is configured to learn it in conjunction with existing NLU tasks. In the experiments, we analyzed how performance differs across various task combinations that include temporal relation extraction, compared with the case where only the existing NLU tasks are used. The experimental results show that the overall performance of multi-task combinations is higher than that of individual tasks, and that performance improves greatly in particular when temporal relation extraction is combined with named entity recognition.

Korean End-to-End Coreference Resolution with BERT for Long Document

Kyeongbin Jo, Youngjun Jung, Changki Lee, Jihee Ryu, Joonho Lim

http://doi.org/10.5626/JOK.2023.50.1.32

Coreference resolution is a natural language processing task that detects the mentions that are candidates for resolution, identifies those that refer to the same entity, and groups them together. Recently, research on coreference resolution has focused mainly on end-to-end models that use BERT to derive the contextual representation of each word while simultaneously performing mention detection and coreference resolution. However, BERT suffers from reduced performance on long documents due to its input length limit. Therefore, this paper proposes the following model. First, a long document is split into segments of 512 or fewer tokens, each segment is passed through an existing local BERT to obtain the primary contextual representation of each word, and the segments are then recombined, with a global positional embedding computed for the original document and added. Finally, coreference resolution is performed by computing the representation of the entire context with a Global BERT layer. In the experiments, the proposed model showed performance similar to that of the existing model, while GPU memory usage decreased by a factor of 1.4 and speed improved by a factor of 2.1.
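
The splitting step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `split_into_segments` is hypothetical, the segments do not overlap, and the special tokens that a real BERT input reserves space for are ignored. Each segment carries the positions its tokens had in the original document, which is the information a global positional embedding would need.

```python
def split_into_segments(token_ids, max_len=512):
    """Split a token sequence into consecutive segments of at most max_len tokens.

    Returns a list of (segment_tokens, global_positions) pairs, where
    global_positions are indices into the original, unsplit document.
    """
    segments = []
    for start in range(0, len(token_ids), max_len):
        chunk = token_ids[start:start + max_len]
        positions = list(range(start, start + len(chunk)))
        segments.append((chunk, positions))
    return segments
```

A local encoder would process each `segment_tokens` independently; concatenating the outputs in order and indexing a position table with `global_positions` restores document-level ordering before a global layer is applied.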

Single Image Double Averaging for Smartphone Picture Denoising

Hoonmin Cho, Sungkil Lee

http://doi.org/10.5626/JOK.2023.50.1.40

Image-denoising algorithms have long been actively researched to remove noise generated in pixel signals. Existing approaches include denoising a single image under a Gaussian noise assumption, removing noise using multiple photos taken by a fixed camera, and removing noise by learning the difference between noisy and clean images with deep learning. However, the noise in actual smartphone photographs does not follow the same Gaussian distribution at every pixel, and taking multiple photos costs a lot of time. Deep learning has the disadvantage that a noise-free ground truth image is essential. Therefore, this paper analyzes the characteristics of the noise appearing in images taken with smartphones and uses them for denoising. In addition, a single noisy image is divided into several small areas, which yields results similar to denoising with an average of multiple images. Accordingly, this technique can adequately denoise a single noisy image photographed by a smartphone without learning from ground truth images.

Interactive Visual Analytics System for Criminal Intelligence Analysts with Multiple Coordinated Views

Seokweon Jung, Donghwa Shin, Jinwook Bok, Seokhyeon Park, Hyeon Jeon, Jinwook Seo, Insoo Lee, Sooyoung Park

http://doi.org/10.5626/JOK.2023.50.1.47

The data that criminal intelligence analysts have to analyze have become much larger and more complex in recent decades. However, the environment and methods of investigation have not yet kept up with those changes. In this study, we examined current investigation practices in a Korean government agency. We focused on the sensemaking process of investigation and sought to bring visual analytics approaches for sensemaking into the investigation. We derived tasks and design requirements and designed a multi-view visual analytics system that could satisfy them. We validated our design with a high-fidelity prototype through a case study showing realistic use cases.

A Visual Analytics System for Interpretable Machine Learning

Chanhee Park, Kyungwon Lee

http://doi.org/10.5626/JOK.2023.50.1.57

Interpretable machine learning is a technology that helps people understand the behavior and predictions of machine learning systems. This study proposes a visual analytics system for interpreting how machine learning models relate input data to output results. It helps users interpret machine learning models easily and clearly. The proposed system interprets a machine learning model through an iterative adjustment procedure that filters and groups model decision results according to input variables, target variables, and predicted/classified values. Through use case analysis and in-depth user interviews, we confirmed that our system could provide insights into the complex behavior of machine learning models, help users gain a scientific understanding of input variables, target variables, and model predictions, and help users understand the stability and reliability of models.

Corroboration of Skin Diseases: Measuring the Severity of Vitiligo Using Transfer Learning

YongHo Kwon

http://doi.org/10.5626/JOK.2023.50.1.72

Vitiligo is a common acquired skin disorder that results from the loss of melanin pigment from the epidermis and is clinically indicated by pale or white patches on the body. Early treatment is essential for vitiligo, but because vitiligo does not cause pain or health problems, vitiligo patients tend to be treated only once skin lesions are visible on the outside. Vitiligo is treated based on the subjective judgment of dermatologists, and there is no quantitative and objective image-based analysis method because medical images are difficult to obtain. Several diagnostic methods have been developed through a few medical studies. In this paper, we propose a method for measuring the area of vitiligo through image segmentation using transfer learning to overcome the limitations of vitiligo medical data collection. The transfer learning model was selected by experimentally assessing the applicability of deep learning models such as U-net, FCN, and Deeplab. In addition, the severity of vitiligo was measured using the VASI score used in the medical field, after converting the skin image into an RGB skin image representing skin areas. In the experimental results, when trained with an imbalanced vitiligo image dataset, the performance of Deeplab, measured by F1-score and IoU, was superior to that of U-net and the image processing method. Additionally, the proposed method for calculating the VASI score from a vitiligo image showed potential for use in vitiligo diagnosis.
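
The two segmentation metrics mentioned above, and the lesion-to-skin area ratio that a severity score would build on, can be computed from binary masks as in the sketch below. The function names and the nested-list mask format are assumptions made for illustration; how the paper maps the area ratio to a VASI score is not described in the abstract and is not reproduced here.

```python
def segmentation_scores(pred, truth):
    """Return (IoU, F1) for two binary masks given as nested lists of 0/1."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # lesion pixel correctly predicted
            elif p:
                fp += 1      # predicted lesion, actually background
            elif t:
                fn += 1      # missed lesion pixel
    if tp + fp + fn == 0:
        return 1.0, 1.0      # both masks empty: treat as perfect agreement
    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1


def depigmented_fraction(lesion_mask, skin_mask):
    """Fraction of skin pixels that are also marked as lesion."""
    skin = 0
    lesion = 0
    for lesion_row, skin_row in zip(lesion_mask, skin_mask):
        for l, s in zip(lesion_row, skin_row):
            if s:
                skin += 1
                if l:
                    lesion += 1
    return lesion / skin if skin else 0.0
```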

CoEM: Contrastive Embedding Mapper for Audio-visual Latents

Gihun Lee, Kyungchae Lee, Minchan Jeong, Myungjin Lee, Se-young Yun, Chan-hyun Yun

http://doi.org/10.5626/JOK.2023.50.1.80

Human perception can link audio and visual information to each other, making it possible to recall visual information from audio information and vice versa. Such ability is naturally acquired by experiencing situations where these two kinds of information are combined. However, it is hard to obtain video datasets that are richly combined with both types of information and, at the same time, labeled for the semantics of each scene. This paper proposes a Contrastive Embedding Mapper (CoEM), which maps an embedding from one type of information to the other, corresponding to its categorical modality. Paired data is not required; CoEM learns to contrast the mapped embeddings by their categories. We validated the efficacy of CoEM on embeddings for audio and visual datasets that were trained to classify 20 shared categories. In the experiment, the embeddings mapped by CoEM proved capable of retrieving and generating data in the mapped domain.

Performance Improvement of LSM-tree Using Partial Flushing of MemTable

Hyeongjun Jeon, Hera Koo, Sungho Moon, Beomseok Nam

http://doi.org/10.5626/JOK.2023.50.1.87

Key-value stores, which are one type of NoSQL database, use the Log-Structured Merge Tree (LSM Tree) as their index data structure. The LSM Tree normally has good write performance, but its chronic problems of write amplification and write stalls have impeded that performance. In this paper, we introduce Extended MemTable, an extended version of the current LSM Tree MemTable that takes advantage of the growing main memory capacity of recent datacenters. Extended MemTable uses partitions divided by key range and performs flush operations in a way that lets compaction operate effectively. By significantly mitigating the write amplification and write stall problems, it increases write throughput by up to 2x and read throughput by up to 4x while reducing write amplification by up to 3.7x compared with the original RocksDB.
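
A toy model of a key-range-partitioned MemTable with partial flushing might look like the sketch below. It is an illustration only: the class name, the fixed `boundaries`, the entry-count `capacity`, and the policy of flushing the largest partition are all assumptions, and none of this reflects RocksDB's actual API or the paper's flush policy.

```python
import bisect


class PartitionedMemTable:
    """In-memory table split into key-range partitions that flush independently."""

    def __init__(self, boundaries, capacity):
        # boundaries [b0, b1, ...] define ranges (-inf, b0), [b0, b1), ..., [bn, +inf)
        self.boundaries = sorted(boundaries)
        self.partitions = [dict() for _ in range(len(self.boundaries) + 1)]
        self.capacity = capacity          # max total entries held in memory
        self.flushed_runs = []            # each run: sorted list of (key, value)

    def _index(self, key):
        return bisect.bisect_right(self.boundaries, key)

    def put(self, key, value):
        self.partitions[self._index(key)][key] = value
        if sum(len(p) for p in self.partitions) > self.capacity:
            self._flush_largest()

    def get(self, key):
        partition = self.partitions[self._index(key)]
        if key in partition:
            return partition[key]
        # Fall back to flushed runs, newest first.
        for run in reversed(self.flushed_runs):
            for k, v in run:
                if k == key:
                    return v
        return None

    def _flush_largest(self):
        # Partial flush: only one partition is written out, as a sorted run
        # covering a narrow key range; the other partitions stay in memory.
        idx = max(range(len(self.partitions)),
                  key=lambda i: len(self.partitions[i]))
        self.flushed_runs.append(sorted(self.partitions[idx].items()))
        self.partitions[idx] = {}
```

Because each flushed run covers a single key range, it would overlap fewer on-disk files than a run spanning the whole key space, which is the intuition behind making compaction cheaper.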

R-FLHE: Robust Federated Learning Framework Against Untargeted Model Poisoning Attacks in Hierarchical Edge Computing

Jeehu Kim, Jaewoo Lee

http://doi.org/10.5626/JOK.2023.50.1.94

Federated learning is a server-client based distributed learning strategy that collects only trained models to guarantee data privacy and reduce communication costs. Recently, research has been conducted to prepare for the future IoT ecosystem by combining edge computing and federated learning. However, research that considers vulnerabilities and threats is insufficient. In this paper, we propose Robust Federated Learning in Hierarchical Edge computing (R-FLHE), a federated learning framework that keeps the global model robust against untargeted model poisoning attacks. R-FLHE aggregates the models learned by clients, evaluates the aggregated model on the edge server, and scores it based on its calculated loss. R-FLHE maintains the robustness of the global model by sending only the model of the edge server with the best score to the cloud server. The proposed R-FLHE maintains stable performance across federated learning rounds, with performance drops of only 0.81% and 1.88% on average even when attacks occur.
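
The aggregate, evaluate, and select flow described above can be sketched in a few lines. The details are assumptions: models are plain lists of floats, aggregation is an unweighted average, `loss_fn` is a hypothetical evaluation function held by each edge server, and "best score" is taken to mean the lowest loss.

```python
def average_models(models):
    """Element-wise unweighted average of equally shaped parameter lists."""
    count = len(models)
    return [sum(values) / count for values in zip(*models)]


def select_edge_model(edge_groups, loss_fn):
    """Aggregate each edge server's client models and keep the lowest-loss one.

    edge_groups: one list of client models per edge server.
    loss_fn: maps an aggregated model to a loss value (lower is better).
    Returns (best_model, best_loss); only this model would go to the cloud.
    """
    best_model, best_loss = None, float("inf")
    for client_models in edge_groups:
        edge_model = average_models(client_models)
        loss = loss_fn(edge_model)
        if loss < best_loss:
            best_model, best_loss = edge_model, loss
    return best_model, best_loss
```

In this sketch, an edge server whose aggregate has been skewed by a poisoned client update produces a high loss and is simply not forwarded.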


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr