COVID-19 Virus Whole-genome Embedding Strategy through Density-based Clustering and Deep Learning Model

Minwoo Pak, Sangseon Lee, Inyoung Sung, Yunyol Shin, Inuk Jung, Sun Kim

http://doi.org/10.5626/JOK.2022.49.4.261

The rapid spread of COVID-19 throughout the world has made the causative virus, Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), one of the major targets for research in various fields such as genetics and vaccinology. In particular, studies of its phylogeny and subtype properties are especially important because of the variety of subtypes and their high variability. However, most computational approaches to studying the viral genome are based on the frequencies of single-nucleotide polymorphisms (SNPs), since the large size of the genomic sequence makes it almost impossible to encode the information of the whole genome at once. In this study, we introduce an alternative embedding strategy to extract information from the SARS-CoV-2 whole genome using the density-based clustering algorithm MUTCLUST and deep learning. We first reduced the size of the genome by using MUTCLUST to identify densely mutated clusters as important regions. We then learned subtype-specific embedding vectors from the extracted clusters using a sequence convolutional deep learning model. We found that the learned embeddings contained information that could be used to discriminate known subtypes and reconstruct phylogenetic trees.
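The idea of keeping only densely mutated regions can be illustrated with a small sketch. This is not MUTCLUST itself, whose details the abstract does not give; it is a simplified one-dimensional, gap-based grouping in the spirit of density-based clustering. The function name `dense_regions` and the parameters `eps` and `min_points` are hypothetical and chosen only for illustration.

```python
def dense_regions(positions, eps, min_points):
    """Group mutated genome positions into dense regions.

    Illustrative assumption, not the MUTCLUST algorithm: neighbouring
    positions at most `eps` bases apart are chained into one group, and a
    group is kept only if it contains at least `min_points` mutations.
    Returns a list of (start, end) tuples in genome order.
    """
    regions = []
    current = []
    for pos in sorted(positions):
        # A gap larger than eps closes the current group.
        if current and pos - current[-1] > eps:
            if len(current) >= min_points:
                regions.append((current[0], current[-1]))
            current = []
        current.append(pos)
    # Close the final group.
    if len(current) >= min_points:
        regions.append((current[0], current[-1]))
    return regions
```

Only the sequence inside the returned regions would then be passed on to an embedding model, which is what reduces the input size relative to the whole genome.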

MQSim-E: Design and Implementation of an NVMe SSD Simulator for Enterprise SSDs

Duwon Hong, Dusol Lee, Jihong Kim

http://doi.org/10.5626/JOK.2022.49.4.271

In the study of storage systems such as SSDs, a simulator that accurately mimics the operation of the software and hardware inside the system plays an important role. In this paper, we show that MQSim, which is widely used in research on NVMe SSDs, is inappropriate for the development of enterprise SSDs, and we propose MQSim-E, a simulator that supports the optimized techniques adopted in enterprise SSDs. MQSim-E fully utilizes the parallelism of flash memory and minimizes the performance overhead of garbage collection. Compared to the existing simulator (MQSim), it improves IOPS, an important design goal for enterprise SSDs, by up to 210% and reduces tail latency by up to 16,000%, so that it accurately reflects the characteristics of commercial enterprise SSDs.

Validation of Intelligent Integrated Management Platform Capabilities based on a Large Virtual HPC Testbed

Seungwoo Rho, Jinseung Ryu, Sangwan Kim, Kwang Jin Oh, MyoungHwan Yoo

http://doi.org/10.5626/JOK.2022.49.4.276

This paper introduces an intelligent integrated management platform, developed in-house, for managing high-performance computers equipped with baseboard management controller (BMC) functions, and presents large-scale virtual High Performance Computing (HPC) testbeds and experimental results used to verify the platform. The platform can monitor and control the hardware sensors of existing high-performance computers by using the Intelligent Platform Management Interface (IPMI) to communicate with the BMC. In addition, a separate agent module that runs within the controller was developed and applied to extend the functions and performance of the BMC in a high-performance computer developed in Korea. We built 1,200 virtual HPC testbeds and verified their functions after linking them to the same integrated management platform as the actual physical server.

Confident Multiple Choice Learning-based Ensemble Model for Video Question-Answering

Gyu-Min Park, A-Yeong Kim, Seong-Bae Park

http://doi.org/10.5626/JOK.2022.49.4.284

The task of Video Question Answering (VQA) focuses on finding an answer to a question about a given video. VQA models should be able to process the multi-modal and time-series information in the video in order to answer questions appropriately. However, designing a single model that answers all types of questions robustly is challenging and time-consuming. Because existing models each represent video from a different viewpoint, ensemble models and ensemble learning methods that reflect each model's viewpoint are essential for improving performance when those models are combined. This paper proposes an ensemble model for VQA that uses Confident Multiple Choice Learning (CMCL) to improve accuracy. Our experiments show that the proposed model outperforms other VQA models and ensemble learning methods on the DramaQA dataset. We also analyze the impact of the ensemble learning methods on each model.

A Method to Enhance the Accuracy of Braille Block Recognition for Walking Assistance of the Visually Impaired: Use of YOLOv5 and Analysis of Vertex Coordinates

Junekoo Kang, Valentin Bajeneza, Somyeong Ahn, Minwoo Sung, Youngseok Lee

http://doi.org/10.5626/JOK.2022.49.4.291

This paper proposes a method to enhance the accuracy of camera-based braille block recognition to help visually impaired pedestrians. The method extracts regions of interest (ROI) by detecting braille blocks in real time through a combination of YOLO, binarization, and vertex extraction algorithms. Flexible color boundaries are set for each frame so that the extracted ROI can be binarized robustly, accurately, and efficiently. After the vertices in the binarized image are detected, the type of braille block is determined from the number of vertices and the information in a matching table. In our experiments, the proposed method generated braille block information more accurately than existing studies. The proposed braille block recognition method is expected not only to help visually impaired pedestrians walk stably, but also to contribute to the development of walking assistance devices for the visually impaired.

1×1 UWB-based Human Pose Estimation Using Transformer

Seunghyun Kim, Keunhong Chae, Seunghwan Shin, Yusung Kim

http://doi.org/10.5626/JOK.2022.49.4.298

The problem of estimating a human’s pose in a specific space from an image is one of the main areas of computer vision and is an important technology that can be used in various fields such as games, medical care, disaster response, firefighting, and the military. Combined with machine learning, the accuracy of pose estimation has improved greatly. However, the image-based approach is limited in that it is difficult to estimate a pose when part or all of the body is occluded by obstacles or when the lighting is dark. Recently, studies have emerged that estimate human pose using wireless signals, which have the advantage of penetrating obstacles and are not affected by brightness. It has commonly been assumed that two or more pairs of transceivers are required to estimate a specific location based on wireless signals. This paper shows that it is possible to estimate human pose and to perform body segmentation by applying deep learning to ultra-wideband (UWB) signals collected by only a 1×1 transceiver. We also propose a method that replaces convolutional neural networks with transformer models and show that it achieves better performance.

Time-series Location Data Collection and Analysis Under Local Differential Privacy

Kijung Jung, Hyukki Lee, Yon Dohn Chung

http://doi.org/10.5626/JOK.2022.49.4.305

With the growing prevalence of smart devices that can generate location data, the number of location-based services is exploding. Since a user’s location data are sensitive information, the privacy of individuals could be breached if the data are utilized in their original form. In this study, we propose a method for collecting and analyzing time-series location data that satisfies local differential privacy, a strong privacy model for data collection environments, and that takes the characteristics of time-series location data into account. In the data collection process, the location of an individual is expressed as a bit array, and each bit of the array is then perturbed by randomized response for privacy preservation. In the data analysis process, we analyze the location frequency using a hidden Markov model. Moreover, we perform additional spatiotemporal correlation analysis, which is not possible with existing analysis methods. To demonstrate the performance of the proposed method, we generated trajectory data based on the Seoul subway and analyzed the results of our method.
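The collection step described above (a location encoded as a bit array, each bit perturbed by randomized response) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a one-hot encoding, a per-bit keep probability of e^(ε/2) / (e^(ε/2) + 1) (the standard symmetric unary encoding, which gives ε-local differential privacy for one-hot inputs), and a simple unbiased frequency estimator in place of the hidden Markov model analysis. All function names are hypothetical.

```python
import math
import random


def encode_location(index, num_locations):
    """One-hot bit array for a location index (assumed encoding)."""
    bits = [0] * num_locations
    bits[index] = 1
    return bits


def keep_probability(epsilon):
    """Probability of reporting a bit truthfully.

    Two one-hot arrays differ in two bits, so spending epsilon / 2 per bit
    yields epsilon-LDP overall.
    """
    half = math.exp(epsilon / 2)
    return half / (half + 1)


def perturb(bits, epsilon, rng):
    """Randomized response: keep each bit with probability p, else flip it."""
    p = keep_probability(epsilon)
    return [b if rng.random() < p else 1 - b for b in bits]


def estimate_counts(reports, epsilon):
    """Unbiased estimate of how many users are at each location.

    If c users are truly at a location, the expected number of 1s reported
    for it is c * p + (n - c) * q with q = 1 - p, which is inverted here.
    """
    n = len(reports)
    p = keep_probability(epsilon)
    q = 1 - p
    ones_per_location = [sum(column) for column in zip(*reports)]
    return [(ones - n * q) / (p - q) for ones in ones_per_location]
```

A collector would call `perturb` on each user's device and only ever see the perturbed arrays; `estimate_counts` then recovers approximate location frequencies from the aggregate, with noise that shrinks relative to the counts as the number of users grows.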

Network-level Tracker Detection Using Features of Encrypted Traffic

Dongkeun Lee, Minwoo Joo, Wonjun Lee

http://doi.org/10.5626/JOK.2022.49.4.314

Third-party trackers breach users’ data privacy by compiling large amounts of personal data, such as location or browsing history, through web tracking techniques. Although previous research has proposed several methods to protect users from web tracking by detecting and blocking it, their effectiveness is limited in terms of dependency or performance. To address this, this paper proposes a novel approach to detect trackers at the network level using features of encrypted traffic. The proposed method first builds a classification model based on features extracted from side-channel information of encrypted traffic generated by trackers. It then prevents leakage of user information by accurately detecting tracker traffic within the network, independently of the user’s browsers or devices. We validate the feasibility of utilizing features of encrypted traffic in tracker detection by studying the distinctive characteristics of tracker traffic derived from real-world encrypted traffic analysis.

Analyzing the Effect of the Twitter Corpus Selection on the Accuracy of Smartwatch Text Entry

Ku Bong Min, Jinwook Seo

http://doi.org/10.5626/JOK.2022.49.4.321

When a statistical decoder is used to support text entry on a smartwatch, fast and accurate typing is possible. In this paper, we analyzed how the corpus used to construct the language model needed for the autocorrect function affects the accuracy of character input. The language models are based on the Brown corpus, which consists of text of various genres, and the Twitter corpus, extracted from tweet messages. We constructed a statistical decoder for the autocorrect function of text entry using the two language models, and we simulated user touch input with the dual Gaussian distribution on the smartwatch keyboard to input Enron mobile phrases, which are phrases written on real mobile devices. The test results show that the average character error rates (CER) for the Brown corpus and the Twitter corpus are 8.35% and 6.44%, respectively, confirming a statistically significant difference.
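For readers unfamiliar with the metric, the sketch below shows one common way to compute a character error rate: the Levenshtein (edit) distance between the intended phrase and the decoded text, divided by the length of the intended phrase. The abstract does not state the exact CER definition used in the paper, so this definition and the function names are assumptions for illustration.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance using two rolling rows of the DP table."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance normalized by the reference length (assumed definition)."""
    if not reference:
        raise ValueError("reference phrase must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Averaging `character_error_rate` over all test phrases, once per language model, would yield per-corpus figures comparable in form to the percentages reported above.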


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr