Compiler-directive based Heterogeneous Computing for Scala

Jungjae Woo, Seongsoo Park, Sungin Hong, Hwansoo Han

http://doi.org/10.5626/JOK.2023.50.3.197

With the advent of the big data era, heterogeneous computing is employed to process large amounts of data. Since Apache Spark, a representative big data analysis framework, is built with the Scala programming language, programs written in Scala need to be rewritten in CUDA, OpenCL, or similar languages to enjoy the benefits of GPU computing. TornadoVM automatically converts Java programs into OpenCL programs using compiler annotations defined in the Java specification. Scala shares its executable bytecode format with Java, but current Scala compilers lack the annotation capabilities indispensable for TornadoVM’s OpenCL translation. In this work, the annotation capabilities of Scala compilers are extended to enable OpenCL translation on TornadoVM. Furthermore, we experimentally confirmed that Scala-OpenCL converted code runs as fast as Java-OpenCL converted code. With our extension, we expect Scala programs to easily use GPU acceleration in the Apache Spark framework.

Shortest Paths Between Line Segments in the Presence of Rectangular Obstacles

Chanyang Seo, Taehoon Ahn, Hee-Kap Ahn

http://doi.org/10.5626/JOK.2023.50.3.204

In this paper, we present an algorithm for computing L1 shortest paths between two line segments in the presence of rectangular obstacles. A path between two line segments is a shortest path between two points, one selected from each line segment. The selected points depend on how a path between the line segments is defined. We consider two definitions: the minimum shortest path, defined by the points that minimize the length, and the maximum shortest path, defined by the points that maximize the length. We present an O(n log n)-time algorithm for computing the minimum shortest path and an O(n^2)-time algorithm for computing the maximum shortest path.

A Survey on Methods for Image Description

Subin Ok, Daeho Lee

http://doi.org/10.5626/JOK.2023.50.3.210

Image description, which has been receiving much attention with the development of deep learning, uses computer vision methods that identify the contents of images and natural language processing methods that generate descriptive sentences. Image description techniques are utilized in many applications, including services for visually impaired people. In this paper, we summarize image description methods in three categories: template-based methods, visual/semantic similarity search-based methods, and deep learning-based methods, and compare their performances. Through this comparison, we aim to provide useful information on the basic architectures, advantages, limitations, and performances of the models. We survey the deep learning-based methods in particular detail because their performance is significantly better than that of the other methods. Through this process, we aim to organize the overall landscape of image description techniques. To compare the performance of each study, we use the METEOR and BLEU scores on the commonly used Flickr30K and MS COCO datasets; where these results are not provided, we examine the test images and the sentences generated for them.

Real-time Multimodal Audio-to-Tactile Conversion System for Playing or Watching Mobile Shooting Games

Minjae Mun, Gyeore Yun, Chaeyong Park, Seungmoon Choi

http://doi.org/10.5626/JOK.2023.50.3.228

This study presents a real-time multimodal audio-to-tactile conversion system for improving user experiences when users play or watch first-person shooting games on a mobile device. The system detects in real time whether sounds from the mobile device are suitable for haptic feedback and provides both vibrotactile feedback, the conventional form of haptic feedback, and impact effects of short and strong force. To this end, we confirmed the suitability of impact haptic feedback compared to vibrotactile feedback for shooting games. We implemented two types of impulsive sound detectors using psychoacoustic features and a support vector machine. We found that our detectors outperformed the one from a previous study. Lastly, we conducted a user study to evaluate our system. The results showed that our system could significantly improve user experiences.

Integrating Domain Knowledge with Graph Convolution based on a Semantic Network for Elderly Depression Prediction

Seok-Jun Bu, Kyoung-Won Park, Sung-Bae Cho

http://doi.org/10.5626/JOK.2023.50.3.243

Depression in the elderly is a global problem that affects 300 million patients and leads to 800,000 suicides every year, so early detection of daily activity patterns closely related to mobility is critical. Although graph convolutional neural networks based on sensing logs have been promising, they still need to represent high-level behaviors extracted from complex sequences of sensing information. In this paper, a semantic network that structures the daily activity patterns of the elderly was constructed using additional domain knowledge, and a graph convolution model was proposed for complementary use with low-level sensing log graphs. Cross-validation with 800 hours of data from 69 senior citizens provided by DNX, Inc. revealed improved prediction performance for the proposed method compared to the most recent deep learning model. In particular, inference over the semantic network with the graph convolution model was justified by a performance improvement of 28.86% compared with the conventional model.

Epoch Score: Dataset Verification using Quantitative Data Quality Assessment

Sungryeol Kim, Taewook Hwang, Sangkeun Jung, Yoonhyung Roh

http://doi.org/10.5626/JOK.2023.50.3.250

It is difficult to determine whether a dataset is suitable for a model or a specific field, or whether it contains errors. In this paper, we propose the Epoch Score, which expresses the degree of difficulty of the data as a score, using the incorrectly answered data obtained by training several times under the same conditions but with different seeds. Using this score, we verified KLUE's Topic Classification dataset and obtained a performance improvement of about 0.8% by correcting high-scoring data that we judged to contain errors. The Epoch Score can be used for all supervised learning regardless of the data type, such as natural language or images, and the performance of the model can be inferred from the area of the Epoch Score.
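
The abstract does not give the formula, so the following is only a sketch of one plausible reading: train the same model several times with different seeds, record which samples are answered incorrectly at each epoch, and score each sample by how often it was wrong. The function name, the input layout, the normalization to the range 0 to 1, and the 0.75 cut-off are all assumptions made for illustration, not the authors' definition.

```python
from typing import Dict, List, Set


def epoch_scores(wrong_ids_per_run: List[List[Set[int]]],
                 num_samples: int) -> Dict[int, float]:
    """Fraction of (run, epoch) evaluations in which each sample was wrong.

    wrong_ids_per_run[r][e] holds the ids of the samples answered
    incorrectly in epoch e of run r (each run uses a different seed).
    Dividing by the number of evaluations is an assumption of this sketch.
    """
    total_evals = sum(len(run) for run in wrong_ids_per_run)
    if total_evals == 0:
        raise ValueError("at least one evaluated epoch is required")
    counts = [0] * num_samples
    for run in wrong_ids_per_run:
        for wrong_ids in run:
            for sample_id in wrong_ids:
                counts[sample_id] += 1
    return {i: counts[i] / total_evals for i in range(num_samples)}


# Toy data: 2 seeds x 2 epochs over 4 samples.
runs = [
    [{0, 3}, {3}],  # seed A: epoch 0, epoch 1
    [{1, 3}, {3}],  # seed B: epoch 0, epoch 1
]
scores = epoch_scores(runs, num_samples=4)

# Samples that are wrong in most evaluations are candidates for manual review.
suspects = [i for i, s in scores.items() if s >= 0.75]
```

In this toy run, sample 3 is wrong in every evaluation and is the only one flagged; whether such a sample is mislabeled or merely hard still has to be judged by a person.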

A Generation Method of Segment-level Fingerprint-based Transformer for Video Partial Copy Detection

Sooyeon Kang, Minsoo Jeong, Jongho Nang

http://doi.org/10.5626/JOK.2023.50.3.257

With the recent spread of video-capturing devices and the development of various multimedia platforms, video content usage is increasing every year. However, as a side effect, copyright infringement involving video content is also increasing. To address this problem, we propose a segment fingerprint generation method for video copy detection systems that are robust to various transforms. The method generates frame fingerprints with a hybrid vision transformer, weights the generated frame fingerprints with a transformer encoder, and aggregates them into a segment fingerprint through max pooling. On the VCDB dataset, the method achieved an F1 score of 0.772.
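
The aggregation step can be pictured with a small sketch. It assumes the frame fingerprints are already available as plain vectors and stands in for the transformer encoder with one scalar weight per frame, which is a simplification for illustration only; the function name and inputs are hypothetical.

```python
from typing import List, Sequence


def segment_fingerprint(frame_fps: Sequence[Sequence[float]],
                        weights: Sequence[float]) -> List[float]:
    """Scale each frame fingerprint by its weight, then max-pool per dimension.

    The scalar weights are a stand-in (assumption) for the weighting that
    the abstract attributes to a transformer encoder.
    """
    if not frame_fps or len(frame_fps) != len(weights):
        raise ValueError("need one weight per frame fingerprint")
    dim = len(frame_fps[0])
    weighted = [[w * x for x in fp] for fp, w in zip(frame_fps, weights)]
    # Max pooling: keep the largest value seen in each dimension.
    return [max(fp[d] for fp in weighted) for d in range(dim)]


frames = [[1.0, 0.0], [0.5, 2.0], [0.0, 1.0]]  # three 2-d frame fingerprints
segment = segment_fingerprint(frames, weights=[1.0, 0.5, 2.0])
```

Max pooling yields a fixed-length segment fingerprint regardless of how many frames the segment contains.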

Prediction of Antibiotic Resistance to Ciprofloxacin in Patients with Upper Urinary Tract Infection through Exploratory Data Analysis and Machine Learning

Jongbub Lee, Hyungyu Lee

http://doi.org/10.5626/JOK.2023.50.3.263

Emergency medicine physicians use an empirical treatment strategy to select antibiotics before an antibiotic resistance profile is clinically confirmed for a patient with a urinary tract infection. Empirical treatment is a challenging task given the concern over increasing antibiotic resistance of urinary tract pathogens in the community. This single-institution retrospective study proposed a method for predicting antibiotic resistance using a machine learning algorithm for patients diagnosed with upper urinary tract infection in the emergency department. First, we selected significant predictors using statistical test methods and game-theory-based SHAP (SHapley Additive exPlanations), respectively. Next, we compared the performance of four classifiers and proposed an algorithm to assist decision-making in empirical treatment by adjusting the prediction probability threshold. As a result, the SVM classifier using predictors selected through SHAP (65% of the total) showed the highest AUROC (0.775) among all conditions used in the experiment. By adjusting the prediction probability threshold in the SVM, we achieved a specificity 3.9 times higher than that of empirical treatment while preserving the sensitivity of the doctor's empirical treatment at 98%.
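
The threshold adjustment can be sketched as follows, under stated assumptions: the classifier outputs a probability for the positive class, a sample is predicted positive when its probability is at or above the threshold, and the goal is the highest threshold that still meets a sensitivity floor. The function name, the toy data, and the choice of which class counts as positive are hypothetical and not taken from the paper.

```python
from typing import List, Tuple


def pick_threshold(probs: List[float], labels: List[int],
                   min_sensitivity: float) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, specificity) for the highest threshold
    whose sensitivity is at least min_sensitivity.

    labels use 1 for the positive class and 0 for the negative class.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    best = None
    # Sensitivity can only fall and specificity can only rise as the
    # threshold grows, so the last passing threshold is the best one.
    for t in sorted(set(probs)):
        tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= t)
        tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p < t)
        sensitivity = tp / positives
        specificity = tn / negatives
        if sensitivity >= min_sensitivity:
            best = (t, sensitivity, specificity)
    if best is None:
        raise ValueError("no threshold meets the sensitivity floor")
    return best


probs = [0.9, 0.8, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 0, 1, 0, 0]
threshold, sens, spec = pick_threshold(probs, labels, min_sensitivity=0.75)
```

Raising the sensitivity floor to 1.0 on the same toy data forces the threshold down to 0.3 and costs specificity, which is the trade-off the abstract describes.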

Performance Improvement of a Korean Open Domain Q&A System by Applying the Trainable Re-ranking and Response Filtering Model

Hyeonho Shin, Myunghoon Lee, Hong-Woo Chun, Jae-Min Lee, Sung-Pil Choi

http://doi.org/10.5626/JOK.2023.50.3.273

Research on Open Domain Q&A, which can identify answers to user inquiries without preparing the target paragraph in advance, is actively being undertaken as deep learning technology is applied to natural language processing. However, existing studies that rely on keyword-based information retrieval are limited in semantic matching. To supplement this, deep learning-based information retrieval research is in progress, but few domestic studies have been empirically applied to real systems. In this paper, a two-step method is proposed to improve the performance of a Korean open domain Q&A system. The proposed method sequentially applies a machine learning-based re-ranking model and a response filtering model to a baseline system in which a search engine and an MRC model were combined. The baseline system had an F1 score of 74.43 and an EM score of 60.79, and the performance improved to an F1 score of 82.5 and an EM score of 68.82 when the proposed method was used.
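
For readers unfamiliar with the two reported metrics, the sketch below shows a common way to compute EM and F1 for a single question. It assumes SQuAD-style scoring with whitespace tokens; the abstract does not say which evaluation script was used, and Korean benchmarks often score overlap at the character level instead, so treat the tokenization as an assumption.

```python
from collections import Counter


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the predicted answer equals the gold answer, else 0.0."""
    return float(prediction.strip() == gold.strip())


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall between two answers."""
    pred_tokens = prediction.split()
    gold_tokens = gold.split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


pred = "Seoul National University"
gold = "Seoul National University Hospital"
em = exact_match(pred, gold)
f1 = token_f1(pred, gold)
```

A partially correct answer like the one above earns no EM credit but a high F1, which is why the two scores in the abstract differ. Reported figures are typically averaged over all questions and multiplied by 100.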

DNN Retraining Method Reducing Accuracy Degradation in Packet-Lossy Environments

Dongwhee Kim, Yujin Lim, Syngha Han, Jungrae Kim

http://doi.org/10.5626/JOK.2023.50.3.285

Limited resources on mobile devices have necessitated collaboration with cloud servers, called “Collaborative Intelligence”, to process Deep Neural Network (DNN) models of growing size. Collaborative intelligence takes a long time to send large amounts of feature data from clients to servers. The transfer time can be reduced using the User Datagram Protocol (UDP), but a packet dropped during UDP transfer reduces inference accuracy. This paper proposes a DNN retraining method to develop a robust DNN model. The server-side layers are retrained to tolerate lossy features by modeling the continuous feature losses resulting from a packet drop. Our results showed that the method can mitigate the accuracy degradation caused by packet losses, provide highly reliable accuracy against changes in the communication environment, and reduce the storage overheads of mobile devices.
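
The loss model used during retraining can be pictured with a short sketch. It assumes the intermediate feature data is sent as a flat sequence split into fixed-size packets, and that a dropped packet appears on the server as a run of zeros; the zero fill, the packet size, and the function name are assumptions for illustration, not details from the paper.

```python
import random
from typing import List


def drop_packets(features: List[float], packet_size: int,
                 loss_rate: float, rng: random.Random) -> List[float]:
    """Zero out whole packets of consecutive features.

    Each packet is lost independently with probability loss_rate, so a
    loss always removes a contiguous run of features rather than
    scattered single values.
    """
    if packet_size <= 0:
        raise ValueError("packet_size must be positive")
    out = list(features)
    for start in range(0, len(out), packet_size):
        if rng.random() < loss_rate:
            end = min(start + packet_size, len(out))
            out[start:end] = [0.0] * (end - start)
    return out


rng = random.Random(0)
features = [1.0] * 10
lossy = drop_packets(features, packet_size=4, loss_rate=0.5, rng=rng)
```

Feeding such corrupted features to the server-side layers during retraining is one way to expose the model to the contiguous gaps that a UDP drop would produce.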


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr