Developing a Testability Prediction Model for High Complexity Software using Regression Analysis
http://doi.org/10.5626/JOK.2023.50.2.162
Testability is the degree to which software supports testing in a given test context. Early prediction of testability can help developers identify software components that require substantial effort to ensure software quality, plan testing activities, and recognize the need for refactoring to reduce testing effort. Existing studies predict testability by performing regression analysis on software metrics and code coverage. These studies used training data with a large proportion of simple software structures. However, prediction models trained on such imbalanced data may predict testability poorly for high-complexity software. We used training data generated according to the metric acceptance criteria of industry domain standards to build a prediction model that accounts for high-complexity software. Using three regression analyses, we constructed a testability prediction model with a branch coverage error of about 4.4% and a coefficient of determination of 0.86.
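The two figures quoted in the abstract above, a prediction error and a coefficient of determination, can be illustrated with a minimal sketch. The abstract does not say which error measure was used, so mean absolute error is an assumption here, and the coverage values are made-up toy numbers, not data from the paper.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical branch coverage values (fractions), for illustration only.
actual_coverage = [0.90, 0.75, 0.60, 0.40]
predicted_coverage = [0.85, 0.80, 0.55, 0.45]

mae = mean_absolute_error(actual_coverage, predicted_coverage)
r2 = r_squared(actual_coverage, predicted_coverage)
print(f"MAE = {mae:.3f}, R^2 = {r2:.3f}")
```

A coefficient of determination close to 1 means the model explains most of the variance in the observed coverage, while the error term reports how far individual predictions are off on average.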
Dovetail Usage Prediction Model for Resource-Efficient Virtual Machine Placement in Cloud Computing Environment
Hyeongbin Kang, Hyeon-Jin Yu, Jungbin Kim, Heeseok Jeong, Jae-Hyuck Shin, Seo-Young Noh
http://doi.org/10.5626/JOK.2023.50.12.1041
As IT services have migrated to the cloud, efficient resource management in cloud computing environments has become an important issue. Consequently, research has been conducted on virtual machine placement (VMP), which can increase resource efficiency without the need for additional equipment in data centers. This paper proposes the use of a usage prediction model as a method for selecting and deploying hosts suitable for virtual machine placement. The dovetail usage prediction model, which addresses the shortcomings of existing usage prediction models, measures indicators such as the CPU, disk, and memory usage of virtual machines running on hosts, converts them into time series data, and extracts features using a deep learning model. By utilizing this approach in virtual machine placement, hosts can be used efficiently while ensuring appropriate load balancing of the virtual machines.
Applying Multitopic Analysis of Bug Reports and CNN algorithm to Bug Severity Prediction
Eontae Kim, Geunseok Yang, Inhong Jung
http://doi.org/10.5626/JOK.2023.50.11.954
Bugs are common in software development. Depending on their severity, bugs can be classified as major or minor errors. In addition, the severity of a bug can be selected by the bug reporter. However, the bug reporter may apply subjective judgment, which can lead to errors in the severity assessment. To resolve this problem, in this study, we predict bug severity by applying topic-based Severe and Non-Severe extraction with convolutional neural network (CNN) learning. First, using the properties of the bug report, the prediction process is divided into Global, Product, Component, and Priority topics, and bug reports are extracted from each topic based on Severe and Non-Severe. The Severe and Non-Severe features are extracted from the Global topics, and severity features are extracted from the Product, Component, and Priority topics in the same way. The extracted features are combined and fed into the CNN as the input layer, and the model is trained. To evaluate the efficiency of our model, a comparison between the proposed model and the baselines was conducted on the Eclipse, Mozilla, Apache, and KDE open-source projects. Our model showed improved performance. The results showed 97% for Eclipse, 96% for Mozilla, 95% for Apache, and 99% for KDE, an average performance improvement of about 24.59% compared to the baseline, with a statistically significant difference.
Improvement Study on Active Learning-based Cross-Project Defect Prediction System
http://doi.org/10.5626/JOK.2023.50.11.931
This study proposes a practical improvement method for an active learning-based system for cross-project defect prediction. A previous study applied active learning techniques to practically improve the performance of cross-project defect prediction, but it used a traditional machine learning model that took hand-made features as input for active learning target selection and defect prediction. As a result, feature extraction was expensive and performance was limited. In addition, the problem of performance deviation according to the selection of the input project remained. In this study, the following methods were proposed to overcome these limitations. First, we used a deep learning model that can take the source code as input to lower the model building cost and improve prediction performance. Second, a Bayesian convolutional neural network is applied to select an active learning target using a deep learning model. Third, instead of considering a single source project, we applied a method that automatically extracts a training data set from multiple projects. Applying the system proposed in this study to 7 open source projects improved the average prediction performance by 13.58% compared to the most recent prior study.
Machine Learning-Based Approach for Predicting Drug-Induced Liver Injury of Chemical Compounds
http://doi.org/10.5626/JOK.2023.50.9.777
Drug-induced liver injury (DILI) is one of the factors constraining the distribution of investigational products on the market. Therefore, the DILI risk of compounds should be assessed in advance. Although in vivo and in vitro methods can be used to test drug safety, both methods are labor-intensive, time-consuming, and expensive. In this study, we propose random forest, light gradient boosting machine, and logistic regression models to overcome these problems. These models use molecular structure and physicochemical features as input to predict DILI as output. The optimal model was random forest, which performed well across the evaluation metrics overall. The proposed model is expected to help the drug development process by identifying the potential DILI of drug candidates in advance.
Approach for Learning Intention Prediction Model based on Recurrent Neural Network
Sung-hyuk Bang, Seok-Hyun Bae, Hyun-Kyu Park, Myung-Joong Jeon, Je-Min Kim, Young-Tack Park
http://doi.org/10.5626/JOK.2018.45.4.360
Several studies have been conducted on human intention prediction with the help of machine learning models. However, these studies have revealed a fundamental shortcoming of such models: they are unable to reflect a long span of past information. To overcome this limitation, this paper proposes a human intention prediction model based on a recurrent neural network (RNN). To make predictions, the RNN model classifies the patterns of time-series data by reflecting the previous sequence patterns of that data. For intention prediction, an RNN model was trained to classify predefined intentions by using attributes such as time, location, activity, and detected objects in a house. Each RNN node is composed of a long short-term memory cell to solve the long-term dependency problem. To evaluate the proposed intention prediction model, a data generator based on a weighted-graph structure was developed for generating data on a daily basis. Using 23,000 data instances for training and testing the proposed intention prediction model, a prediction accuracy of 90.52% was achieved.
A Feature Selection Technique in the Neural Network for Demand Forecasting of Mobile Payment System
Ho-Joon Kim, Yun-Seok Cho, Kyungmi Kim
http://doi.org/10.5626/JOK.2018.45.4.370
In this paper, we present a time series prediction technique based on a neural network as a methodology for forecasting the service demand of a mobile payment system. We propose a two-stage neural network model for the feature selection process and the prediction process. Three types of fuzzy membership functions were adopted for the representation of feature data, and a hyperbox-based neural network model is used for the evaluation of the feature relevance factor. The proposed feature selection technique reduces the amount of computation and eliminates erroneous feature data in the learning data set. We evaluated the usefulness of the proposed method through experiments using two years of data obtained from actual smart campus systems.
Identification of Heterogeneous Prognostic Genes and Prediction of Cancer Outcome using PageRank
http://doi.org/10.5626/JOK.2018.45.1.61
The identification of genes that contribute to the prediction of prognosis in patients with cancer is one of the challenges in providing appropriate therapies. To find the prognostic genes, several classification models using gene expression data have been proposed. However, the prediction accuracy of cancer prognosis is limited due to the heterogeneity of cancer. In this paper, we integrate microarray data with biological network data using a modified PageRank algorithm to identify prognostic genes. We also predict the prognosis of patients with 6 cancer types (including breast carcinoma) using the K-Nearest Neighbor algorithm. Before we apply the modified PageRank, we separate samples by K-Means clustering to address the heterogeneity of cancer. The proposed algorithm showed better performance than traditional algorithms for prognosis. We were also able to identify cluster-specific biological processes using GO enrichment analysis.
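The abstract above ranks genes with a modified PageRank over a biological network. The modification itself is not described here, so the sketch below shows only the standard PageRank power iteration on a toy directed graph. The node names, the damping factor of 0.85, and the graph itself are illustrative assumptions, not details from the paper.

```python
def pagerank(graph, damping=0.85, iterations=100, tol=1e-10):
    """Standard PageRank by power iteration.

    graph maps every node to a list of nodes it links to;
    each node must appear as a key, even if it has no out-links.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread evenly over all nodes.
        dangling = sum(rank[v] for v in nodes if not graph[v])
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            if graph[v]:
                share = damping * rank[v] / len(graph[v])
                for w in graph[v]:
                    new_rank[w] += share
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical gene interaction graph, for illustration only.
toy_network = {
    "geneA": ["geneB", "geneC"],
    "geneB": ["geneC"],
    "geneC": ["geneA"],
    "geneD": ["geneC"],
}

scores = pagerank(toy_network)
for gene, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{gene}: {score:.4f}")
```

In this toy graph the node with the most incoming links ends up with the highest score, which is the intuition behind using a PageRank-style score to prioritize genes in a network.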
A Model for Nowcasting Commodity Price based on Social Media Data
Jaewoo Kim, Meeyoung Cha, Jong Gun Lee
http://doi.org/10.5626/JOK.2017.44.12.1258
Capturing real-time daily information on food prices is invaluable for helping policymakers and development organizations address food security problems and improve public welfare. This study analyses the possible use of large-scale online data, available due to growing Internet connectivity in developing countries, to provide updates on the food security landscape. We conduct a case study of Indonesia to develop a time-series prediction model that nowcasts daily food prices for four types of food commodities that are essential in the region: beef, chicken, onion, and chilli. By using Twitter price quotes, we demonstrate the capability of social data to function as an affordable and efficient proxy for traditional offline price statistics.
Group Emotion Prediction System based on Modular Bayesian Networks
http://doi.org/10.5626/JOK.2017.44.11.1149
Recently, with the development of communication technology, it has become possible to collect various sensor data that indicate the environmental stimuli within a space. In this paper, we propose a group emotion prediction system using a modular Bayesian network that was designed considering the psychological impact of environmental stimuli. A Bayesian network can compensate for the uncertain and incomplete characteristics of the sensor data through probabilistic consideration of the evidence for reasoning. In addition, modularizing the Bayesian network enables a flexible response to, and efficient reasoning about, fluctuations in environmental stimuli within the space. To verify the performance of the system, we predict public emotion based on the brightness, volume, temperature, humidity, color temperature, sound, smell, and group emotion data collected in a kindergarten. Experimental results show that the accuracy of the proposed method is 85% greater than that of other classification methods. Using quantitative and qualitative analyses, we explore the possibilities and limitations of probabilistic methodology for predicting group emotion.
Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal
Editorial Office
- Tel. +82-2-588-9240
- Fax. +82-2-521-1352
- E-mail. chwoo@kiise.or.kr