Search : [ author: Seok-Jun Bu ] (6)

Phishing Webpage Detection using URL and HTML Graphs based on a Multimodal AutoEncoder Ensemble

Jun-Ho Yoon, Seok-Hun Choi, Hae-Jung Kim, Seok-Jun Bu

http://doi.org/10.5626/JOK.2025.52.6.461

As the internet continues to evolve, phishing attacks are increasingly targeting users, highlighting the need for effective detection methods. Traditional approaches focus on analyzing URL character sequences; however, phishing URLs often mimic legitimate patterns and have a short lifespan, limiting detection accuracy. To address this, we propose a multimodal ensemble-based phishing detection method that leverages both URL strings and HTML graph data. Character-level URL sequences are processed using a Convolutional AutoEncoder (CAE), while HTML DOM structures are converted into graph formats and analyzed with a Graph Convolutional AutoEncoder (GCAE). The extracted latent vectors are integrated via a Transformer layer to classify phishing webpages. The proposed model improves detection performance by up to 18.91 percentage points in F1 Score compared to existing methods, and case analysis reveals the interrelationship between URL and HTML features.
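The two input representations named in the abstract can be illustrated with a minimal sketch. It assumes a fixed printable-ASCII vocabulary for the URL characters and a plain parent-child edge list for the DOM graph; the names `encode_url`, `DomGraphBuilder`, and `html_to_graph` are hypothetical, and the paper's actual preprocessing, the CAE, the GCAE, and the Transformer fusion are not reproduced here.

```python
import string
from html.parser import HTMLParser

# Assumed vocabulary: printable ASCII, ids start at 1; 0 doubles as padding/unknown.
VOCAB = {ch: i + 1 for i, ch in enumerate(string.printable)}

# Void elements never get a closing tag, so they are not kept open on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


def encode_url(url, max_len=32):
    """Map a URL to a fixed-length list of character ids (truncate or zero-pad)."""
    ids = [VOCAB.get(ch, 0) for ch in url[:max_len]]
    return ids + [0] * (max_len - len(ids))


class DomGraphBuilder(HTMLParser):
    """Collects DOM elements as nodes and parent-child links as edges."""

    def __init__(self):
        super().__init__()
        self.nodes = []   # nodes[i] is the tag name of node i
        self.edges = []   # (parent_index, child_index) pairs
        self._stack = []  # indices of the currently open elements

    def _add_node(self, tag):
        index = len(self.nodes)
        self.nodes.append(tag)
        if self._stack:
            self.edges.append((self._stack[-1], index))
        return index

    def handle_starttag(self, tag, attrs):
        index = self._add_node(tag)
        if tag not in VOID_TAGS:
            self._stack.append(index)

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: a leaf node that is never left open.
        self._add_node(tag)

    def handle_endtag(self, tag):
        # Close the most recent matching open element; ignore stray end tags.
        for pos in range(len(self._stack) - 1, -1, -1):
            if self.nodes[self._stack[pos]] == tag:
                del self._stack[pos:]
                break


def html_to_graph(html):
    """Return (nodes, edges) for the element tree of an HTML string."""
    builder = DomGraphBuilder()
    builder.feed(html)
    builder.close()
    return builder.nodes, builder.edges
```

In this sketch the node list and edge list would serve as input to a graph encoder, and the integer sequence as input to a character-level encoder. Text nodes and attributes are ignored for brevity.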

Adaptive Database Intrusion Detection based on Michigan-style Deep Learning Classifier System

Seok-Jun Bu, Sung-Bae Cho

http://doi.org/10.5626/JOK.2023.50.10.891

In a role-based access control (RBAC) environment, database intrusion detection can be achieved by designing a role classifier for query transactions and flagging an intrusion when the predicted role differs from the role actually performed. Current query-role classifier designs use deep learning models, but they have struggled to achieve high accuracy and adaptability to changing patterns at the same time. To solve this problem, this study proposes a Michigan-style Deep Learning Classifier System (MDLCS). The method applies a divide-and-conquer strategy that partitions the input space into patterns and assigns an optimal classifier to each, combining the evolutionary computation principle of a Michigan-style learning classifier system with a deep learning classifier to adapt to patterns that change in real time and improve detection performance. The proposed MDLCS method provides strong adaptability and robustness compared to existing intrusion detection methods such as anomaly detection, signature-based detection, and behavior-based detection. MDLCS was evaluated on a commercial database following the TPC-E schema and improved detection performance by 26.81 percentage points over existing methods under realistic conditions in which new patterns emerge sequentially.
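The detection rule in the abstract (flag a query when the predicted role differs from the role that issued it) can be sketched as follows. The token-frequency classifier is only a toy stand-in for the deep learning classifier and is not MDLCS; the role names, queries, and function names are hypothetical.

```python
from collections import Counter


def train_role_profiles(history):
    """Build per-role token counts from (role, query) pairs."""
    profiles = {}
    for role, query in history:
        profiles.setdefault(role, Counter()).update(query.lower().split())
    return profiles


def predict_role(profiles, query):
    """Pick the role whose token profile best matches the query.

    The score is the share of the role's observed tokens that also occur
    in the query, so roles with longer histories are not favored.
    """
    tokens = query.lower().split()

    def score(role):
        total = sum(profiles[role].values())
        return sum(profiles[role][t] for t in tokens) / total if total else 0.0

    # sorted() makes tie-breaking deterministic.
    return max(sorted(profiles), key=score)


def is_intrusion(profiles, query, performed_role):
    """Flag the query when the predicted role differs from the performing role."""
    return predict_role(profiles, query) != performed_role


# Hypothetical training history, loosely modeled on a brokerage schema.
HISTORY = [
    ("broker", "select * from trade where broker_id = 1"),
    ("broker", "update trade set status = 'done'"),
    ("customer", "select * from customer_account where id = 2"),
    ("customer", "select balance from customer_account"),
]
PROFILES = train_role_profiles(HISTORY)
```

With these profiles, `is_intrusion(PROFILES, "update trade set status = 'x'", "customer")` is flagged because the query looks like broker activity, while a balance lookup performed by a customer is not.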

Learning Functional Characteristics of Malware Attacks with Graph Transformer based on Control Flow

Seok-Jun Bu, Sung-Bae Cho

http://doi.org/10.5626/JOK.2023.50.8.633

To minimize false negatives in malware classification, it is important to capture local characteristics of a program, such as the control flow between operation blocks and memory-register addresses. However, existing methods that optimize the loss function of a classifier without considering the functional characteristics of malware have limitations in recall due to new attack paths and complex control flow graphs. In this paper, we propose a method that explicitly samples and embeds the control flow graphs to learn functional characteristics, such as API calls, rootkit DLL installation, and specific virtual memory access, and thereby improves recall. To model the functional patterns of malware from the control flow graphs, we sample attack paths from the control flow of the malware and classify the types of malware using a graph embedding function based on the transformer. We evaluate the proposed method using a real-world malware benchmark dataset, Microsoft Challenge. By explicitly learning the control flow of the malware, we achieved a recall of 97.89% and significantly improved the accuracy (99.45%) compared to the classification accuracy of the latest state-of-the-art method (97.89%).
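The path-sampling step can be sketched with uniform random walks over a control flow graph. The abstract does not state the sampling strategy, so the random walk, the adjacency-dict representation, and the block names below are assumptions for illustration only; the transformer-based embedding is not shown.

```python
import random


def sample_paths(cfg, entry, num_paths, max_len, seed=0):
    """Sample paths from a control flow graph by uniform random walks.

    cfg maps each basic block to the list of its successor blocks.
    A walk stops at a block without successors or after max_len blocks.
    """
    rng = random.Random(seed)  # seeded so the sampled paths are reproducible
    paths = []
    for _ in range(num_paths):
        node = entry
        path = [entry]
        while len(path) < max_len and cfg.get(node):
            node = rng.choice(cfg[node])
            path.append(node)
        paths.append(path)
    return paths


# Hypothetical control flow graph with made-up block names.
EXAMPLE_CFG = {
    "entry": ["check", "decrypt"],
    "check": ["decrypt", "exit"],
    "decrypt": ["inject"],
    "inject": ["exit"],
    "exit": [],
}
```

Each sampled path is a sequence of basic blocks; in a pipeline like the one described, such sequences would be embedded and passed to the classifier.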

Integrating Domain Knowledge with Graph Convolution based on a Semantic Network for Elderly Depression Prediction

Seok-Jun Bu, Kyoung-Won Park, Sung-Bae Cho

http://doi.org/10.5626/JOK.2023.50.3.243

Depression in the elderly is a global problem that accounts for 300 million patients and 800,000 suicides every year, so early detection of daily activity patterns closely related to mobility is critical. Although graph convolutional neural networks based on sensing logs have been promising, they still need to represent high-level behaviors extracted from complex sequences of sensing information. In this paper, a semantic network that structures the daily activity patterns of the elderly is constructed using additional domain knowledge, and a graph convolution model is proposed that uses it to complement low-level sensing log graphs. Cross-validation with 800 hours of data from 69 senior citizens provided by DNX, Inc. showed improved prediction performance for the proposed method compared to the most recent deep learning model. In particular, inference over the semantic network by the graph convolution model was justified by a performance improvement of 28.86% over the conventional model.

An Autism Spectrum Disorder Detection System Based on Learning Dynamic Connectivity of the Superior Temporal Sulcus

Kyoung-Won Park, Seok-Jun Bu, Sung-Bae Cho

http://doi.org/10.5626/JOK.2022.49.5.354

Given the hypothesis that abnormalities in the superior temporal sulcus (STS) connected with visual cortex regions can be a critical sign of autism spectrum disorder (ASD), a model is required that exploits the brain functional connectivity between the STS and visual cortex to reinforce the neurobiological evidence. This paper proposes a deep learning model comprising attention and convolutional recurrent neural networks that can select and extract the time-series pattern of dynamic connectivity between the two regions within the brain based on observations. By integrating the autism-related features extracted from dynamic connectivity through attention with a structure containing interlayer connections that prevents the loss of functional connectivity information within the neural network, the model extracts the connectivity between the STS and visual cortex, leading to an increase in generalization performance. A 10-fold cross-validation shows that the proposed model outperforms the state-of-the-art models, achieving an improvement of 4.90% in ASD classification. Additionally, we use the proposed method to diagnose ASD by visualizing the dynamic brain connectivity in the neural network layers.

Learning Disentangled Representation of Web Addresses via Convolutional-Recurrent Triplet Network for Phishing URL Classification

Seok-Jun Bu, Hae-Jung Kim

http://doi.org/10.5626/JOK.2021.48.2.147

With the explosive growth of social media services that reinforce personal connections, automated classification of phishing URLs propagated through hyperlinks has become critical. Deep learning models for phishing URL classification based on convolutional-recurrent neural networks have yielded the best accuracy by modeling character-level and word-level features. However, deep learning-based classifiers that focus on fitting a given task to accumulated URLs are limited by class imbalance, because phishing URLs are generated and discarded almost immediately. We address the class imbalance issue as a deep learning-based URL feature space generation task. We propose a modified triplet network structure that explicitly learns the similarity between URLs based on Euclidean distance to alleviate the limitations of existing deep phishing classifiers. Experiments on a real-world dataset of 60,000 collected URLs showed the highest performance among the latest deep learning methods, despite the hostile class imbalance. We also demonstrate that the URL feature space generated by the proposed method improved recall by 45.85% compared to the existing methods.
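The idea of learning URL similarity through Euclidean distance can be illustrated with the standard triplet margin loss. This is the textbook formulation, not the modified structure the paper proposes; the margin value and the two-dimensional embeddings are assumptions for the example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss.

    The loss is zero once the anchor is closer to the positive (same class)
    than to the negative (other class) by at least the margin.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)
```

For instance, with the anchor at the origin, a same-class embedding at distance 5, and an other-class embedding at distance 1, the loss is 5 - 1 + 1 = 5. Swapping the two gives 1 - 5 + 1 < 0, so the loss is clipped to 0 and the triplet no longer contributes to training.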


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr