Journal of KIISE

Search : [ keyword: prediction ] (66)

Optimizing Throughput Prediction Models Based on Feature Category Contribution in 4G/5G Network Environments

http://doi.org/10.5626/JOK.2024.51.11.961

The acceleration in 5G technology adoption due to increased network data consumption and limitations of 4G has led to the establishment of a heterogeneous network environment comprising both 4G and limited 5G. Consequently, throughput prediction has become important for network quality of service (QoS) and resource optimization. Traditional throughput prediction research mainly relies on single attributes or on attributes extracted through correlation analysis. However, these approaches have limitations, including the potential exclusion of variables with nonlinear relationships and the arbitrariness and inconsistency of correlation coefficient thresholds. To overcome these limitations, this paper proposes a new approach based on Feature Importance. The method calculates the relative importance of the features used in the network and assigns contribution scores to attribute categories. These scores are then used to enhance throughput prediction. The approach was applied and tested on four open network datasets. Experiments demonstrated that the proposed method successfully derived an optimal category combination for throughput prediction, reduced model complexity, and improved prediction accuracy compared to using all categories.
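
The idea of turning per-feature importances into per-category contribution scores can be illustrated with a minimal sketch. It assumes the importances have already been obtained from some trained model. The feature names, category names, and values are hypothetical, and summing then normalizing is only one plausible aggregation, not necessarily the one used in the paper.

```python
from collections import defaultdict


def category_contributions(importances, feature_to_category):
    """Sum per-feature importances into per-category scores that add up to 1."""
    totals = defaultdict(float)
    for feature, score in importances.items():
        totals[feature_to_category[feature]] += score
    grand_total = sum(totals.values())
    return {category: score / grand_total for category, score in totals.items()}


# Hypothetical importances, e.g. taken from a tree-based model (assumption).
importances = {"rsrp": 0.30, "rsrq": 0.20, "speed": 0.10, "cell_load": 0.40}
# Hypothetical assignment of each feature to an attribute category (assumption).
feature_to_category = {
    "rsrp": "signal",
    "rsrq": "signal",
    "speed": "mobility",
    "cell_load": "network",
}

scores = category_contributions(importances, feature_to_category)
# Categories ordered from highest to lowest contribution.
ranked = sorted(scores, key=scores.get, reverse=True)
```

A ranking like `ranked` could then guide which category combinations to try first when training the throughput predictor.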

Generating Relation Descriptions with Large Language Model for Link Prediction

Hyunmook Cha, Youngjoong Ko

http://doi.org/10.5626/JOK.2024.51.10.908

The Knowledge Graph is a network consisting of entities and the relations between them. It is used for various natural language processing tasks. One specific task related to the Knowledge Graph is Knowledge Graph Completion, which involves reasoning with known facts in the graph and automatically inferring missing links. In order to tackle this task, studies have been conducted on both link prediction and relation prediction. Recently, there has been significant interest in a dual-encoder architecture that utilizes textual information. However, the dataset for link prediction only provides descriptions for entities, not for relations. As a result, the model heavily relies on descriptions for entities. To address this issue, we utilized a large language model called GPT-3.5-turbo to generate relation descriptions. This allows the baseline model to be trained with more comprehensive relation information. Moreover, the relation descriptions generated by our proposed method are expected to improve the performance of other language model-based link prediction models. The evaluation results for link prediction demonstrate that our proposed method outperforms the baseline model on various datasets, including Korean ConceptNet, WN18RR, FB15k-237, and YAGO3-10. Specifically, we observed improvements of 0.34%p, 0.11%p, 0.12%p, and 0.41%p in terms of Mean Reciprocal Rank (MRR), respectively.
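
For readers unfamiliar with the metric, Mean Reciprocal Rank averages 1/rank of the correct entity over all test queries. The sketch below uses hypothetical ranks and is not taken from the paper's evaluation code.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank, where rank is the 1-based position of the correct entity."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1.0 / rank for rank in ranks) / len(ranks)


# Hypothetical ranks of the correct tail entity for three test triples.
example_ranks = [1, 2, 4]
mrr = mean_reciprocal_rank(example_ranks)  # (1 + 1/2 + 1/4) / 3
```

An improvement reported in percentage points (%p) is the absolute difference between two such scores once each is expressed as a percentage.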

Expected Addressee and Target Utterance Prediction for Construction of Multi-Party Dialogue Systems

Yoonjin Jang, Keunha Kim, Youngjoong Ko

http://doi.org/10.5626/JOK.2024.51.10.918

As the number of communication channels between people has increased in recent years, there has been a rise in both multi-party conversations and one-to-one conversations. Research on analyzing multi-party conversations has also been active. In the past, models for analyzing such dialogues typically predicted the addressee of the final response based on the previous responses. However, this differs from the task of generating multi-party dialogue responses, which requires the speaker to select the addressee to whom they will respond. In this paper, we propose a new task for predicting the addressee of a multi-party dialogue that does not rely on response information. Our task aims to predict and match the expected target utterance with the expected addressee in a real multi-party dialogue. To accomplish this, we introduce a model that uses a transformer encoder-based masked token prediction learning method. This model predicts the expected target utterance and the expected addressee of the current speaker based on the previous dialogue context, without considering the final response. The proposed model achieves an accuracy of 82% in predicting the expected addressee and 68% in predicting the expected target utterance on the Ubuntu IRC dataset. These results demonstrate the potential of our model for use in a multi-party dialogue system, as it can accurately predict the target utterance to which a response should be directed. Moving forward, we plan to expand our research by creating additional datasets for multi-party dialogues and applying them to real-world multi-party dialogue response generation systems.

A Study on Sales Prediction Model Based on BiLSTM-GAT Using Credit Card Transaction Data

Wonseok Jung, Dohyung Kim, Young Ik Eom

http://doi.org/10.5626/JOK.2024.51.9.807

Sales prediction using credit card transaction data is essential for understanding consumer buying patterns and market trends. However, traditional statistical and machine learning models have limitations when it comes to analyzing temporal features and the relationships between different variables, such as geographical data and sales information by service types, population, and transaction times. This paper proposes two models that can simultaneously analyze the relationships based on commercial district features and sales time-series features. To evaluate the performance of these models, we constructed graphs based on the distances and sales similarity of features between commercial districts. We then compared the performance of the proposed models with traditional time-series models, namely LSTM and BiLSTM. The results of the experiment showed that the GAT-BiLSTM model improved prediction accuracy by approximately 15% compared to the BiLSTM model, while the BiLSTM-GAT model improved it by about 29% over the BiLSTM model, as measured by RMSE.
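
As a rough illustration of building a graph from distances between commercial districts, the sketch below connects two districts when they lie within a threshold distance. The coordinates, the Euclidean metric, and the threshold are all assumptions made for this example. The abstract does not specify how distance or sales similarity was turned into edges.

```python
import math


def distance_graph(coords, max_distance):
    """Return undirected edges between districts at most max_distance apart."""
    names = sorted(coords)
    edges = set()
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if math.dist(coords[first], coords[second]) <= max_distance:
                edges.add((first, second))
    return edges


# Hypothetical district coordinates on a flat plane (assumption).
coords = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (10.0, 0.0)}
edges = distance_graph(coords, max_distance=6.0)
```

An edge set like this could serve as the adjacency input to a graph attention layer, with a sales-similarity graph built the same way from a different pairwise measure.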

Predicting of the Number of Diners in School Cafeteria; Including COVID-19 Pandemic Period Data

Chae-eun Baek, Yesl Kwon, Jangmin Oh

http://doi.org/10.5626/JOK.2024.51.7.634

Accurately predicting the number of diners in institutional food service is essential for efficient operations, reducing leftovers, and ensuring customer satisfaction. University cafeterias, in particular, face additional challenges in making these predictions due to various environmental factors and changes in class formats caused by the COVID-19 pandemic. To tackle this issue, this study utilized specialized data collected during the pandemic period in university cafeteria environments. The data was used to train and compare the performance of five different models. The three best-performing ensemble tree-based models -- RandomForest, LightGBM, and XGBoost -- were averaged to obtain a final prediction with a Mean Absolute Error (MAE) of 30.96. By regularly providing prediction results to on-campus cafeterias using this final model, practical support can be offered to optimize operations. This study presents an effective methodology for accurately predicting the number of diners, even in abnormal situations such as the COVID-19 pandemic.
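
Averaging several models' outputs and scoring the result with MAE can be sketched as follows. The prediction values and actual diner counts are hypothetical, and an unweighted mean is assumed, since the abstract only says the three models were averaged.

```python
def average_predictions(*model_predictions):
    """Element-wise unweighted mean of several models' prediction lists."""
    return [sum(values) / len(values) for values in zip(*model_predictions)]


def mean_absolute_error(y_true, y_pred):
    """Mean of absolute differences between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical daily diner predictions from three models (assumption).
rf_pred = [100, 210, 290]
lgbm_pred = [110, 190, 310]
xgb_pred = [90, 200, 300]
actual = [105, 195, 300]

ensemble_pred = average_predictions(rf_pred, lgbm_pred, xgb_pred)
ensemble_mae = mean_absolute_error(actual, ensemble_pred)
```

On this scale, an MAE of 30.96 would mean the averaged forecast misses the actual headcount by about 31 diners on average.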

Improving Prediction of Chronic Hepatitis B Treatment Response Using Molecular Embedding

Jihyeon Song, Soon Sun Kim, Ji Eun Han, Hyo Jung Cho, Jae Youn Cheong, Charmgil Hong

http://doi.org/10.5626/JOK.2024.51.7.627

Chronic hepatitis B patients who do not receive timely treatment are at a high risk of developing complications such as liver cirrhosis and hepatocellular carcinoma (liver cancer). As a result, various antiviral agents for hepatitis B have been developed, and because these agents differ in their components, treatment responses can vary among patients. Therefore, selecting the appropriate medication that leads to a favorable treatment response is considered crucial. In this study, in addition to the patient's blood test results and electronic medical records indicating drug prescriptions, information about the components of the hepatitis B antiviral agents was incorporated for learning. The aim was to enhance the prediction of treatment response one year after the start of treatment in chronic hepatitis B patients. The molecular embeddings of the antiviral agents included both fixed molecular embeddings and embeddings generated through an end-to-end structure utilizing a graph neural network model. Comparison with the baseline model confirmed that drug molecule embedding contributes to improved performance.

Prediction of Cancer Prognosis Using Patient-Specific Cancer Driver Gene Information

Dohee Lee, Jaegyoon Ahn

http://doi.org/10.5626/JOK.2024.51.6.574

Accurate prediction of cancer prognosis is crucial for effective treatment. Consequently, numerous studies on cancer prognosis have been conducted, with recent research leveraging various machine learning techniques such as deep learning. In this paper, we first constructed a gene network for each patient, then selected patient-specific cancer driver genes, considering the heterogeneity of cancer. We propose a deep neural architecture that can predict the prognosis more accurately using patient-specific cancer driver gene information. When our method was applied to gene expression data for 11 types of cancer, it demonstrated a significantly higher prediction accuracy compared to the existing methods.

Graph Structure Learning-Based Neural Network for ETF Price Movement Prediction

Hyeonsoo Jo, Jin-gee Kim, Taehun Kim, Kijung Shin

http://doi.org/10.5626/JOK.2024.51.5.473

Exchange-Traded Funds (ETFs) are index funds that mirror particular market indices and are usually associated with low risk and low expense ratios for individual investors. Various methods have emerged for accurately predicting ETF price movements, and recently, AI-based technologies have been developed. One representative method involves using time-series-based neural networks to predict the price movement of ETFs. This approach effectively incorporates past price information of ETFs, allowing the prediction of their movement. However, it is limited in that it only utilizes historical information of individual ETFs and does not account for the relationships and interactions between different ETFs. To address this issue, we propose a model that can capture relationships between ETFs. The proposed model uses graph structure learning to infer a graph representing relationships between ETFs. Based on this graph, a graph neural network predicts the ETF price movement. The proposed model demonstrates superior performance compared to time-series-based deep-learning models that only use individual ETF information.

Cross-Project Defect Prediction for Ansible Projects

Sungu Lee, Sunjae Kwon, Duksan Ryu, Jongmoon Baik

http://doi.org/10.5626/JOK.2024.51.3.229

Infrastructure-as-Code (IaC) refers to automating overall infrastructure management, such as creation and deployment, through code. IaC is used by many companies due to its efficiency, and many within-project defect prediction techniques have been proposed targeting Ansible, one of the IaC tools. Recently, a study on the applicability of cross-project defect prediction to Ansible has been conducted. Therefore, this study applied a cross-project defect prediction technique to Ansible and analyzed its effectiveness. Experimental results showed that the F1 score of cross-project defect prediction ranged from 0.3 to 0.5, and that it could be used as an alternative to within-project defect prediction. It is therefore anticipated that this will be put to use in support of Ansible’s software quality assurance activities.
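
To make the reported F1 range concrete, the sketch below computes F1 for binary defect labels, with 1 meaning defective. The labels are hypothetical and chosen only to illustrate the metric. They are not data from the study.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive (defective) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no correct defect predictions, so F1 is defined as 0 here
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for eight Ansible scripts (assumption).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 0, 0]
f1 = f1_score(y_true, y_pred)  # precision 1/2, recall 1/3
```

With one of three defective scripts found and one false alarm, the score comes out at 0.4, inside the 0.3 to 0.5 range.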

Machine Learning-Based Approach for Predicting Drug-Induced Liver Injury of Chemical Compounds

Soyeon Lee, Sunyong Yoo

http://doi.org/10.5626/JOK.2023.50.9.777

Drug-induced liver injury (DILI) is one of the factors constraining the distribution of investigational products on the market. Therefore, the DILI risk of compounds should be assessed in advance. Although in vivo and in vitro methods can be used to test drug safety, both are labor-intensive, time-consuming, and expensive. In this study, we proposed random forest, light gradient boosting machine, and logistic regression models to overcome these problems. These models take molecular structure and physicochemical features as input and predict DILI as output. The best model was random forest, which performed well across the evaluation metrics. The proposed model is expected to help the drug development process by identifying potential DILI of drug candidates in advance.

Journal of KIISE

ISSN : 2383-630X(Print)
ISSN : 2383-6296(Electronic)
KCI Accredited Journal

Editorial Office

Tel. +82-2-588-9240
Fax. +82-2-521-1352
E-mail. chwoo@kiise.or.kr