Search : [ keyword: similarity ] (39)

Information Retrieval-based Bug Localization for Korean Bug Reports using Translation

Misoo Kim

http://doi.org/10.5626/JOK.2024.51.9.827

Information retrieval-based bug localization uses bug reports as queries to automatically identify faulty source files, significantly reducing the time developers spend locating bugs. The core of this technique lies in calculating the text similarity between bug reports and source files. However, for bug reports written in Korean, text similarity might not be effective because it is difficult to match Korean words against source code written primarily in English. This study proposes an information retrieval-based bug localization technique for Korean bug reports using translation, enabling Korean developers to use this technique effectively. We also apply a soft voting method to leverage the outputs of multiple translators. To validate the performance of the proposed technique, we collected 269 Korean bug reports and conducted experiments using three translators and two ranking models. Experimental results showed that the proposed method improved bug localization performance by 44% compared to baselines.
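
The soft voting idea in this abstract can be pictured as averaging the similarity scores that each translated query assigns to every source file. The following is a minimal sketch under stated assumptions: each translator is assumed to yield a dict mapping a file name to a similarity score, and the function names `normalize` and `soft_vote`, the min-max scaling, and the file names are hypothetical illustrations rather than the paper's implementation.

```python
def normalize(scores):
    """Min-max scale one translator's scores to [0, 1] (assumed preprocessing)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All files scored the same, so this translator carries no ranking signal.
        return {f: 0.0 for f in scores}
    return {f: (s - lo) / (hi - lo) for f, s in scores.items()}


def soft_vote(per_translator_scores):
    """Average normalized scores across translators and rank files, best first."""
    files = set().union(*per_translator_scores)
    normalized = [normalize(s) for s in per_translator_scores]
    avg = {
        f: sum(n.get(f, 0.0) for n in normalized) / len(normalized)
        for f in files
    }
    # Sort by descending average score; break ties by file name for determinism.
    return sorted(avg, key=lambda f: (-avg[f], f))


# Hypothetical scores produced by two translators for the same bug report.
scores_a = {"Parser.java": 0.9, "Lexer.java": 0.2, "Main.java": 0.1}
scores_b = {"Parser.java": 0.4, "Lexer.java": 0.6, "Main.java": 0.1}
ranking = soft_vote([scores_a, scores_b])
```

Averaging scores, rather than counting each translator's top pick, lets a file that is ranked consistently high by several translations outrank one that a single translation favors.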

Hierarchical Latent Representation-based Framework for Automatic Detection of Cybercrime Slang

Yong-Yeon Kim, Byung-Won On

http://doi.org/10.5626/JOK.2023.50.12.1121

Cybercriminals constantly produce and use slang by adding criminal meanings to existing words or replacing them with similar words for communication. Responding to this requires continuous monitoring and manual work, and deep learning approaches require a large amount of labeled training data. However, collecting such data is difficult because manual labeling takes considerable time and money and because cybercrime, by its nature, is conducted in secret. To address these limitations, we develop an autoencoder-based framework and propose a method that detects contextual cybercrime slang and neologisms through hierarchical latent vector similarity comparisons. Experiments using a cybercrime post dataset showed that the framework achieved an accuracy of up to 99.1% at a similarity threshold of 0.5.

Biometrics Performance Improvement of Face Recognition Smart Door Using Binary Classifier

Taeseong Kim, Changsoo Eun, Jongwon Park

http://doi.org/10.5626/JOK.2023.50.7.598

A face recognition-based smart door is a biometric system that collects images using a camera and decides whether a visitor is registered by recognizing the face. Recently, with the increasing number of single-person households, demand for access convenience has increased. Accordingly, research on smart doors using face recognition is active. Face recognition-based smart doors use deep learning to recognize visitors' faces. The difference between the visitor's face and the registrant's face is converted into a distance through encoding. If the distance between the two faces is less than the threshold value, the two are judged to be the same person and the door is opened. Facial similarity thresholds differ according to region, race, and clothing culture. Also, biometric performance varies according to threshold settings. In previous studies, a constant of 0.4 was used as the facial similarity threshold, which was the criterion for determining registration. In this paper, facial similarity thresholds were calculated using five binary classifiers and biometric performance was compared. In experiments using the LFW dataset, the average EER improved by 16.59% compared to that obtained with the constant threshold.
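
The relationship between the distance threshold and the EER mentioned in this abstract can be illustrated with a small sketch. It assumes face distances lie in [0, 1] and that a visitor is accepted when the distance is below the threshold, as the abstract describes. The function names, the threshold sweep, and the sample distances are hypothetical; the paper derives its thresholds with binary classifiers, which this sketch does not reproduce.

```python
def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR) when a pair is accepted if distance < threshold.

    genuine:  distances between images of the same person
    impostor: distances between images of different people
    """
    frr = sum(d >= threshold for d in genuine) / len(genuine)   # rejected genuine pairs
    far = sum(d < threshold for d in impostor) / len(impostor)  # accepted impostor pairs
    return far, frr


def approximate_eer(genuine, impostor, steps=100):
    """Sweep thresholds over [0, 1] and return (threshold, EER estimate).

    The EER is estimated at the first threshold where |FAR - FRR| is smallest.
    """
    best = None
    for i in range(steps + 1):
        t = i / steps
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, t, (far + frr) / 2)
    return best[1], best[2]


# Hypothetical distances, for illustration only.
genuine = [0.2, 0.3, 0.35, 0.5]
impostor = [0.45, 0.6, 0.7, 0.8]
far, frr = far_frr(genuine, impostor, 0.4)  # the constant threshold from earlier studies
threshold, eer = approximate_eer(genuine, impostor)
```

A lower threshold reduces false acceptances at the cost of more false rejections, which is why a single constant such as 0.4 may not suit every population.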

Zero-Shot Solar Power Efficiency Prediction Method Considering PCC-Based Climate Similarity

Dongjun Kim, Sungwoo Park, Jaeuk Moon, Eenjun Hwang

http://doi.org/10.5626/JOK.2023.50.7.581

Thermal power generation accounts for a large proportion of power generation in Korea and abroad due to its low unit price. However, because it emits large amounts of harmful substances that can cause health and environmental problems, renewable energy is in the spotlight as an alternative power source. Among various renewable energy generation methods, solar power generation is receiving the most attention because of advantages such as ease of maintenance. Various solar power generation forecasting studies are being conducted to reduce the uncertainty of volatile solar power generation and ensure stability in power supply. However, existing studies are applicable only when a sufficient amount of historical power generation data is available. Therefore, this paper proposes a solar power generation efficiency prediction method based on zero-shot learning that utilizes historical data of similar regions, selected by considering weather similarity, to solve the cold-start problem, which arises in prediction when historical data in the target region are lacking. Comparison results revealed that the proposed method had better performance overall in the target area, with the one-hour-based method showing the best prediction performance among the criteria.
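
PCC in the title refers to the Pearson correlation coefficient. A minimal sketch of how a climate-similar source region might be chosen follows, assuming each region is described by an equally long weather series. The function names, the region names, and the sample values are hypothetical, and the paper's actual features and selection criteria are not reproduced here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series.

    Assumes neither series is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def most_similar_region(target_series, candidates):
    """Return the candidate region whose series correlates best with the target."""
    return max(candidates, key=lambda name: pearson(target_series, candidates[name]))


# Hypothetical weather readings for a target region without generation history.
target = [1.0, 2.0, 3.0, 4.0]
candidates = {
    "region_a": [2.0, 4.0, 6.0, 8.0],  # moves with the target
    "region_b": [4.0, 3.0, 2.0, 1.0],  # moves against the target
}
source_region = most_similar_region(target, candidates)
```

The historical generation data of the selected region could then stand in for the missing history of the target region.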

Context Based Real-time Korean Writing Correction for Foreigners

Young-Keun Park, Jae-Min Kim, Seong-Dong Lee, Hyun Ah Lee

http://doi.org/10.5626/JOK.2017.44.10.1087

Teaching Korean to foreigners is attracting increasing attention with the growing number of foreigners who want to learn Korean or reside in Korea. Existing spell checkers mostly focus on native Korean speakers, so they are inappropriate for foreigners. In this paper, we propose a correction method for the Korean language that reflects the contextual characteristics of Korean and the writing characteristics of foreigners. Our method extracts expressions frequently used by Koreans as correction candidates by constructing a syllable reverse index over eojeol bi-grams extracted from a corpus, and generates ranked Korean corrections for foreigners using an upgraded edit distance calculation. Our system provides a user interface based on keyboard hooking, so a user can easily use the correction system along with other applications. Our system improves the detection rate for foreign language users by about 45% compared to other systems in foreign language writing environments. This will help foreign users to judge and correct their own writing errors.
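
Ranking correction candidates by edit distance, as this abstract describes, can be sketched with the standard Levenshtein distance. This is only a baseline illustration: the paper uses an upgraded edit distance whose details are not given here, and the function names and sample strings below are hypothetical.

```python
def edit_distance(a, b):
    """Standard Levenshtein distance computed with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, 1):
        curr = [i]
        for j, ch_b in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,                   # delete ch_a
                curr[j - 1] + 1,               # insert ch_b
                prev[j - 1] + (ch_a != ch_b),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def rank_corrections(typed, candidates):
    """Order candidates by distance to the typed text; ties break alphabetically."""
    return sorted(candidates, key=lambda c: (edit_distance(typed, c), c))


# Hypothetical input and candidate list.
ranked = rank_corrections("helo", ["help", "hello", "world"])
```

For Korean text, the distance could be computed over syllables or decomposed jamo instead of raw characters, which is one way such a measure might be adapted.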

Image Quality Assessment Considering both Computing Speed and Robustness to Distortions

Suk-Won Kim, Seongwoo Hong, Jeong-Chan Jin, Young-Jin Kim

http://doi.org/10.5626/JOK.2017.44.9.992

To assess image quality accurately, an image quality assessment (IQA) metric is required to reflect the human visual system (HVS) properly. In other words, the structure, color, and contrast ratio of the image should be evaluated in consideration of various factors. In addition, as mobile embedded devices such as smartphones become popular, fast computing speed is important. In this paper, the proposed IQA metric combines color similarity, gradient similarity, and phase similarity synergistically to satisfy the HVS and is designed using optimized pooling and quantization for fast computation. The proposed IQA metric is compared against 13 existing methods using 4 kinds of evaluation methods. The experimental results show that the proposed IQA metric ranks first on 3 evaluation methods and, on the remaining method, ranks next to VSI, which is the most remarkable IQA metric. Its computing speed is on average about 20% faster than VSI’s. In addition, we find that the proposed IQA metric correlates more strongly with the HVS than existing IQA metrics.

Similarity-based Service Recommendation for Service-Mashup Developers

HyunSeung Kim, InYoung Ko

http://doi.org/10.5626/JOK.2017.44.9.908

As web service technologies are widely used, there have been many efforts to develop approaches for recommending appropriate web services to users in complex and dynamic service environments. In addition, for the effective development of service mashups, service recommender systems that are specialized for service composition have been developed. However, existing service recommender systems for service mashups are not effective at recommending services in a personalized manner that reflects developers’ preferences. To deal with this issue, we propose an approach that recommends services based on the similarities between mashup developers who have developed similar service mashups. The proposed approach is then evaluated using mashup data retrieved from ProgrammableWeb. The evaluation results clearly show that the proposed approach is an effective way of improving service recommendations compared to the traditional user-based collaborative filtering algorithm.
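
Recommending services from similar developers can be sketched as follows. The sketch assumes each developer is represented by the set of services used in their mashups and uses Jaccard similarity as a stand-in measure. The similarity measure, the function names, and the sample data are all assumptions for illustration and are not the paper's definition of developer similarity.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, usage, top_k=3):
    """Suggest services the target has not used, weighted by developer similarity.

    usage: dict mapping a developer name to the set of services they have used.
    """
    mine = usage[target]
    scores = {}
    for developer, services in usage.items():
        if developer == target:
            continue
        similarity = jaccard(mine, services)
        if similarity == 0:
            continue  # ignore developers with nothing in common
        for service in services - mine:
            scores[service] = scores.get(service, 0.0) + similarity
    # Highest accumulated similarity first; ties break by service name.
    return sorted(scores, key=lambda s: (-scores[s], s))[:top_k]


# Hypothetical usage data.
usage = {
    "alice": {"maps", "twitter"},
    "bob": {"maps", "twitter", "flickr"},
    "carol": {"youtube", "lastfm"},
}
suggestions = recommend("alice", usage)
```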

Face Detection using Orientation(In-Plane Rotation) Invariant Facial Region Segmentation and Local Binary Patterns(LBP)

Hee-Jae Lee, Ha-Young Kim, David Lee, Sang-Goog Lee

http://doi.org/10.5626/JOK.2017.44.7.692

Face detection using the LBP-based feature descriptor has the limitation that it cannot represent spatial information between the facial shape and facial components such as the eyes, nose, and mouth. To address this, previous research divided a facial image into a number of square sub-regions. However, this approach has three problems: because the sub-regions vary in number and size, the division criteria suitable for the database used in an experiment are ambiguous; the dimension of the LBP histogram increases in proportion to the number of sub-regions; and as the number of sub-regions increases, sensitivity to facial orientation rotation increases significantly. In this paper, we present a novel facial region segmentation method that addresses both the in-plane rotation issues associated with LBP-based feature descriptors and the number of dimensions of the feature descriptors. In experiments, the proposed method showed a detection accuracy of 99.0278% on a single facial image rotated in orientation.
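
For readers unfamiliar with LBP, the basic 3x3 operator and its 256-bin histogram can be sketched as below. This shows only the standard descriptor. The paper's rotation-invariant facial region segmentation is not reproduced, and the function names, the neighbor ordering, and the sample images are assumptions for illustration.

```python
# Neighbor offsets, clockwise from the top-left pixel (the ordering is a convention).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """8-bit LBP code of pixel (r, c): a bit is set when the neighbor >= center."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels of a 2D image."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


# Hypothetical 3x3 grayscale patches.
flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
corner = [[9, 1, 1], [1, 5, 1], [1, 1, 1]]
flat_code = lbp_code(flat, 1, 1)      # every neighbor equals the center
corner_code = lbp_code(corner, 1, 1)  # only the top-left neighbor is brighter
```

Computing one such histogram per sub-region and concatenating them is what makes the descriptor dimension grow with the number of sub-regions, the issue the abstract raises.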

Impact of Diverse Document-evaluation Measure-based Searching Methods in Big Data Search Accuracy

Ji young Kim, DaHyeon Han, Jongkwon Kim

http://doi.org/

With the rapid growth of Big Data, research on extracting meaningful information is being pursued by both academia and industry. In particular, data characteristics derived from analysis and researcher intention are key factors for search algorithms to obtain accurate output. Therefore, properly reflecting both data characteristics and researcher intention is the ultimate goal of data analysis research. Properly analyzed data can help increase users' loyalty to the service provided by a company and help them utilize information more effectively and efficiently. In this paper, we explore various document-evaluation methods to improve the accuracy of article search, one of the most frequently used types of search in real life. We also analyze the experimental results and suggest appropriate ways to use the various methods.

A Traffic-Classification Method Using the Correlation of the Network Flow

YoungHoon Goo, Kyuseok Shim, Sungho Lee, Baraka D. Sija, MyungSup Kim

http://doi.org/

Presently, the ubiquitous emergence of high-speed network environments has led to a rapid increase in applications, making network traffic increasingly complex. To manage networks efficiently, classifying traffic by specific units is essential. While various traffic-classification methods have been studied, a method for the complete classification of network traffic has not yet been developed. In this paper, a correlation model of the network flow is defined, and a traffic-classification method that uses this model is proposed. The proposed network-correlation model for traffic classification consists of a similarity model and a connectivity model. The effectiveness of the proposed method is demonstrated in terms of accuracy and completeness through experiments.


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr