Vol. 50, No. 12, Dec. 2023
A Comparative Analysis of the Motion Recognition Rate by Direction of Push-up Activity Using ELM Algorithm
Sangwoong Kim, Jaeyeong Ryu, Jiwoo Jeong, Dongyeong Kim, Youngho Chai
http://doi.org/10.5626/JOK.2023.50.12.1031
In this paper, we propose a motion recognition system for each direction of push-up activity using the ELM algorithm. In the proposed system, motion recognition consists of three parts. The first part reads the motion data: the data acquired from the motion capture system is loaded into the system's memory. Next, the system extracts a feature vector from the motion data. The 3D position data converted from the quaternion values of the motion data is projected onto the X-Y, Y-Z, and Z-X planes of the system, and the projected values are used as the final feature vector. The feature vectors projected on each plane train a separate ELM, so a total of three ELMs are trained. Finally, test data is input to each trained ELM to derive the final recognition result. First, before obtaining motion data, general push-ups performed in the correct posture were selected as the data set to be trained. The second motion is a push-up in which the upper chest does not go down all the way. The third is one in which only the buttocks come up when bending and lifting. The fourth is one in which the elbows move away from the upper chest when bending. Finally, these motions are mixed to build a test dataset.
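The plane-projection step described in this abstract can be illustrated with a minimal sketch. The abstract does not specify how the projected values are laid out, so the flattening into one list per plane, the function name `project_to_planes`, and the sample coordinates are all assumptions made for illustration; the quaternion-to-position conversion and the ELM training are not shown.

```python
from typing import Dict, List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def project_to_planes(points: Sequence[Point3D]) -> Dict[str, List[float]]:
    """Build one flat feature vector per coordinate plane.

    Hypothetical layout: each 3D point contributes two values to each
    plane's vector (the coordinate orthogonal to the plane is dropped).
    """
    features: Dict[str, List[float]] = {"xy": [], "yz": [], "zx": []}
    for x, y, z in points:
        features["xy"].extend([x, y])  # X-Y plane: drop Z
        features["yz"].extend([y, z])  # Y-Z plane: drop X
        features["zx"].extend([z, x])  # Z-X plane: drop Y
    return features


# Two made-up joint positions standing in for one frame of motion data.
frame = [(0.1, 1.2, 0.3), (0.4, 0.9, 0.2)]
planes = project_to_planes(frame)
for name, vector in planes.items():
    print(name, vector)
```

Under this layout, each of the three vectors would feed its own classifier, which matches the abstract's description of one ELM per plane.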
Dovetail Usage Prediction Model for Resource-Efficient Virtual Machine Placement in Cloud Computing Environment
Hyeongbin Kang, Hyeon-Jin Yu, Jungbin Kim, Heeseok Jeong, Jae-Hyuck Shin, Seo-Young Noh
http://doi.org/10.5626/JOK.2023.50.12.1041
As IT services have migrated to the cloud, efficient resource management in cloud computing environments has become an important issue. Consequently, research has been conducted on virtual machine placement (VMP), which can increase resource efficiency without the need for additional equipment in data centers. This paper proposes the use of a usage prediction model as a method for selecting and deploying hosts suitable for virtual machine placement. The dovetail usage prediction model, which addresses the shortcomings of the existing usage prediction models, measures indicators such as CPU, disk, and memory usage of virtual machines running on hosts, converts them into time series data, and extracts features using a deep learning model. By utilizing this approach in virtual machine placement, hosts can be used efficiently while ensuring appropriate load balancing of the virtual machines.
Performance Analysis of Instruction Priority Functions using a List Scheduling Simulator
Changhoon Chung, Soo-Mook Moon
http://doi.org/10.5626/JOK.2023.50.12.1048
Instruction scheduling is an important compiler optimization technique for reducing the execution time of a program through parallel processing. However, existing scheduling techniques show limited performance because they rely on heuristics. This study examines the effect of instruction priority functions on list scheduling through simulation. The results show that a priority function based on the overall structure of the dependency graph can reduce schedule length by up to 4% compared to a priority function based on the original instruction order. Furthermore, the results indicate which input features should be used when implementing a reinforcement learning-based scheduling model.
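To make the role of the priority function concrete, here is a minimal sketch of a greedy list scheduler. It is not the simulator from the paper: the single-issue machine, the latencies, the four-instruction dependency graph, and the two priority functions (program order versus critical-path length) are assumptions chosen for illustration.

```python
from typing import Callable, Dict, List


def critical_path_lengths(latency: Dict[int, int],
                          succs: Dict[int, List[int]]) -> Dict[int, int]:
    """Longest latency-weighted path from each instruction to any sink."""
    memo: Dict[int, int] = {}

    def visit(n: int) -> int:
        if n not in memo:
            tail = max((visit(s) for s in succs.get(n, [])), default=0)
            memo[n] = latency[n] + tail
        return memo[n]

    for n in latency:
        visit(n)
    return memo


def list_schedule(latency: Dict[int, int], succs: Dict[int, List[int]],
                  priority: Callable[[int], float], issue_width: int = 1) -> int:
    """Greedy list scheduling; returns the schedule length in cycles."""
    preds_left = {n: 0 for n in latency}
    for targets in succs.values():
        for s in targets:
            preds_left[s] += 1
    ready_at = {n: 0 for n in latency}  # cycle at which operands are available
    ready = [n for n in latency if preds_left[n] == 0]
    cycle = finish = scheduled = 0
    while scheduled < len(latency):
        # Among instructions whose operands are available, issue the
        # highest-priority ones first.
        candidates = sorted((n for n in ready if ready_at[n] <= cycle),
                            key=priority, reverse=True)
        for n in candidates[:issue_width]:
            ready.remove(n)
            scheduled += 1
            done = cycle + latency[n]
            finish = max(finish, done)
            for s in succs.get(n, []):
                ready_at[s] = max(ready_at[s], done)
                preds_left[s] -= 1
                if preds_left[s] == 0:
                    ready.append(s)
        cycle += 1
    return finish


# Hypothetical block: 0 and 1 are short independent instructions,
# 2 is a long-latency instruction, and 3 consumes the result of 2.
latency = {0: 1, 1: 1, 2: 3, 3: 1}
succs = {2: [3]}

cp = critical_path_lengths(latency, succs)
len_order = list_schedule(latency, succs, priority=lambda n: -n)    # program order
len_cp = list_schedule(latency, succs, priority=lambda n: cp[n])    # graph structure
print(len_order, len_cp)
```

In this toy block the critical-path priority issues the long-latency instruction first and hides its latency behind the independent work, which is the kind of effect a graph-aware priority function is meant to capture. The size of the gain here says nothing about the 4% figure reported in the abstract.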
Predicting Significant Blood Marker Values for Pressure Ulcer Forecasting Utilizing Feature Minimization and Selection
Yeonhee Kim, Hoyoul Jung, Jang-Hwan Choi
http://doi.org/10.5626/JOK.2023.50.12.1054
Pressure ulcers are difficult to treat once they occur, and huge economic costs are incurred during the treatment process. Therefore, predicting the occurrence of pressure ulcers is important in terms of patient suffering and economics. In this study, the correlation between the lab codes (features) obtained from blood tests of patients with spinal cord injury and pressure ulcers was analyzed to provide meaningful characteristic information for the prediction of pressure ulcers. We compare and analyze the Pearson, Spearman, and Kendall's tau correlation coefficients, which are mainly used in feature selection methods. In addition, the importance of features is calculated using XGBoost and LightGBM, which are machine learning methods based on gradient boosting. To verify the performance of this model, we use a long short-term memory (LSTM) model to predict other features from the top-5 features by importance. In this way, unnecessary features can be minimized in diagnosing pressure ulcers and guidelines can be provided to medical personnel.
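For readers unfamiliar with the three correlation coefficients named in this abstract, the following sketch computes them in plain Python. The sample values are made up and do not come from the study; the Kendall function implements the simple tau-a variant and assumes the inputs contain no ties, which may differ from the variant the authors used.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: covariance divided by the product of spreads."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau_a(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)


# Made-up values: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]
print(pearson(x, y), spearman(x, y), kendall_tau_a(x, y))
```

The example shows why the three coefficients can disagree: the rank-based measures see a perfectly monotonic relationship, while Pearson is pulled down by the nonlinearity.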
Gender Classification Model Based on Colloquial Text in Korean for Author Profiling of Messenger Data
Jihye Kang, Minho Kim, Hyuk-Chul Kwon
http://doi.org/10.5626/JOK.2023.50.12.1063
With the explosive growth of social network services (SNS), a large amount of text data has been generated through messenger services. In addition, various applications such as sentiment analysis, abusive text detection, and chatbots have been developed and provided thanks to recent advances in natural language processing. However, there has not been an attempt to classify author characteristics such as the gender and age of speakers in Korean colloquial texts. In this study, we propose a gender classification model for author profiling using Korean colloquial texts. For gender classification of speakers based on Kakao Talk data, domain adaptation is carried out by additionally training KcBERT (Korean Comments BERT), which is pre-trained on Korean comments, on ‘Nate Pan’ data. Experiments with a model that also incorporates external lexical information showed improved performance, achieving an accuracy of approximately 95%. In this study, the self-collected ‘Nate Pan’ data and the ‘daily conversation’ data provided by the National Institute of the Korean Language were used for domain adaptation, and the ‘Korean SNS’ data of AI HUB was used for model training and evaluation.
Type-Checking-based Refinement for the Analysis of Uncaught Exceptions in Digital Forensic Software
Seowoo Lee, Dongwon Lee, Sehoon Kim
http://doi.org/10.5626/JOK.2023.50.12.1071
This paper designs an uncaught exception detection scheme for digital forensic software written in Python, aiming at enhancing the reliability of the forensic process. Inherited from the legacy set-constraint-based analysis method, the proposed scheme identifies potential uncaught exceptions in the target forensic software. Next, with the help of Pyright, a Python-specific static type checker, it is possible to eliminate meaningless alarms inevitably created during the analysis process, such as key errors in list types or out-of-range index errors in dictionary types. In addition, we remove duplicated detections based on the dependency tree which traces the inclusion relationship between each component or point of a given module. The experiment results, obtained by applying our static analyzer to nine benchmarks of digital forensic software, demonstrate that the proposed scheme successfully finds 10 locations of three exception patterns, including dictionary key errors, out-of-range index errors, and division by zero errors, which could not be located before. Furthermore, the analysis achieves an average of 84% and a maximum of 89% reduction in false alarms for each benchmark.
Exploring Neural Network Models for Road Classification in Personal Mobility Assistants: A Comparative Study on Accuracy and Computational Efficiency
Gwanghee Lee, Sangjun Moon, Kyoungson Jhang
http://doi.org/10.5626/JOK.2023.50.12.1083
With the increasing use of personal mobility devices, the frequency of traffic accidents has also risen, with most accidents resulting from collisions with cars or pedestrians. Notably, compliance with traffic rules on the roads is low. Auxiliary systems that recognize and provide information about roads could help reduce the number of accidents. Since road images have distinct material characteristics, models studied in the field of image classification are suitable for application. In this study, we compared the performance of various road image classification models with parameter counts ranging from 2 million to 30 million, enabling the selection of an appropriate model for a given situation. The majority of the models achieved an accuracy of over 95%, with most models surpassing 99% in top-2 accuracy. Of the models, MobileNet v2 had the fewest parameters while still exhibiting excellent performance, and EfficientNet had stable accuracy across all classes, surpassing 90% accuracy.
Improving Counterexample-Guided Bidirectional Inductive Synthesis by an Incremental Approach
Yongho Yoon, Woosuk Lee, Kwangkeun Yi
http://doi.org/10.5626/JOK.2023.50.12.1091
One of the sources of inefficiency in counterexample-guided inductive synthesis algorithms is the fresh restart of inductive synthesis in each iteration. In this paper, we propose an incremental approach for the generalized counterexample-guided bidirectional inductive synthesis algorithm. The incremental algorithm reuses knowledge from the previous iteration, thereby reducing the search space and making the remaining search faster. We applied our approach to the state-of-the-art bidirectional inductive synthesis algorithm, Simba, which is based on iterative forward-backward abstract interpretation. We implemented our approach and evaluated it on a set of benchmarks from the Simba paper. The experimental results showed that, on average, our approach reduces synthesis time to 74.2% of the original without any loss in quality.
Robust Korean Table Machine Reading Comprehension across Various Domains
Sanghyun Cho, Hye-Lynn Kim, Hyuk-chul Kwon
http://doi.org/10.5626/JOK.2023.50.12.1102
Unlike regular text data, tabular data has structural features that allow it to represent compressed information. This has led to its use in a variety of domains, and machine reading comprehension of tables has become an increasingly important aspect of Machine Reading Comprehension (MRC). However, the structure of tables and the knowledge required differ from domain to domain, and when a language model is trained for a single domain, its evaluation performance in other domains is likely to drop, resulting in poor generalization performance. To overcome this, it is important to build datasets of various domains and to apply various techniques rather than simply using pre-trained models. In this study, we design a language model that learns cross-domain invariant linguistic features to improve domain generalization performance. We applied adversarial training to improve performance on the evaluation datasets in each domain and modified the structure of the model by adding an embedding layer and a transformer layer specialized for tabular data. When applying adversarial learning, we found that the model with a structure that does not add table-specific embeddings improves performance. On the other hand, adding a table-specific transformer layer and having the added layer receive additional table-specific embeddings as input shows the best performance on data from all domains.
A Proposal for Lightweight Human Action Recognition Model with Video Frame Selection for Residential Area
http://doi.org/10.5626/JOK.2023.50.12.1111
Residential area closed-circuit televisions (CCTVs) need human action recognition (HAR) to predict accidents and other critical problems. An HAR model must be not only accurate but also light and fast to be applied in the real world. Therefore, in this paper, a cross-modal PoseC3D model with a frame selection method is proposed. The proposed cross-modal PoseC3D model integrates multi-modality inputs (i.e., RGB image and human skeleton data) and trains them in a single model. Thus, the proposed model is lighter and faster than previous works such as two-pathway PoseC3D. Moreover, we apply the frame selection method to use only the meaningful frames, chosen based on differences between frames, instead of using all frames of a video. An AI Hub open dataset was used to verify the performance of the proposed method. The experimental results showed that the proposed method achieves similar or better performance and is much lighter and faster than the methods in previous works.
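Frame selection based on differences between frames can be sketched as follows. The abstract does not give the exact criterion, so this sketch assumes one plausible rule: keep a frame when its mean absolute pixel difference from the last kept frame exceeds a threshold. The function name `select_frames`, the threshold value, and the tiny four-pixel frames are hypothetical.

```python
from typing import List, Sequence


def select_frames(frames: Sequence[Sequence[float]], threshold: float) -> List[int]:
    """Return indices of frames that differ enough from the last kept frame.

    Each frame is a flat sequence of pixel intensities of equal length.
    The first frame is always kept as the initial reference.
    """
    if not frames:
        return []
    kept = [0]
    for i in range(1, len(frames)):
        reference = frames[kept[-1]]
        diff = sum(abs(a - b) for a, b in zip(frames[i], reference)) / len(reference)
        if diff > threshold:
            kept.append(i)
    return kept


# Made-up video: near-duplicate frames at indices 1 and 3.
video = [
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [10, 10, 10, 10],
    [10, 10, 10, 11],
    [0, 0, 0, 0],
]
selected = select_frames(video, threshold=2.0)
print(selected)
```

Comparing against the last kept frame rather than the immediately preceding one avoids dropping slow, gradual motion whose per-frame change stays under the threshold; whether the paper makes the same choice is not stated in the abstract.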
Hierarchical Latent Representation-based Framework for Automatic Detection of Cybercrime Slang
http://doi.org/10.5626/JOK.2023.50.12.1121
Cybercriminals constantly produce and use slang by adding criminal meanings to existing words or replacing them with similar words for communication. Continuous monitoring and manual work are required to respond to this, and a large amount of labeled training data is required when using deep learning. However, the ability to collect a large amount of training data is limited because direct labeling by a person requires a lot of time and money and proceeds secretly due to the nature of cybercrime. Thus, we develop a framework based on an autoencoder and propose a method to effectively detect contextual cybercrime slang and neologisms through hierarchical latent vector similarity comparisons to address these limitations. Experiments using a cybercrime post dataset showed that the framework had an accuracy of up to 99.1% at a similarity threshold of 0.5.
Improvement of Machine Learning-Based Event-Related Desynchronization Accuracy
http://doi.org/10.5626/JOK.2023.50.12.1131
The biometrics field is known for providing fast and accurate identity verification. Recently, motor imagery (MI) brainwaves have gained prominence, accompanied by event-related desynchronization (ERD) signals. The purpose of this study is to optimize existing ERD models to enhance inter-user classification accuracy. We used the well-known common spatial pattern (CSP) and ERD as representative features for MI, and classified them using naïve Bayes (NB). To evaluate the reliability of the binary classification results of the SVM, equal error rate (EER) and area under the curve (AUC) were used. The proposed ERD model exhibited superior accuracy compared to CSP and traditional ERD, achieving classification accuracies of 86.4%, 86.3%, and 63%, respectively. Based on these results, the proposed ERD method is presented as a suitable future biometric marker.
Prediction of Toothbrushing Position Based on Gyro Sensor Data and its Validation Using Unsupervised Learning-based Clustering
DoYoon Kim, MinWook Kwon, SeungJu Baek, HyeRin Yoon, DaeYeon Lim, Eunah Jo, Seungjae Ryu, Young Wook Kim, Jin Hyun Kim
http://doi.org/10.5626/JOK.2023.50.12.1143
Oral health is an important health indicator that is directly related to longevity. For this reason, oral health has become a key component of public health, from infants to the elderly. The foundation of good oral health is good brushing habits. However, the recommended correct brushing method is not easy to adopt, and this harms oral health. This paper proposes a method to distinguish brushing zones using low-cost IMU sensors to track the correct brushing method. We evaluated the accuracy of the brushing zone estimation method using clustering algorithms in machine learning. Specifically, we propose a method for determining the brushing area based on toothbrush posture alone, using only the gyro sensor of an IMU sensor. We showed that relatively inexpensive 6-axis IMU gyro sensor data could be used to estimate the user’s brushing area with an accuracy of 80.6%. In addition, we applied a clustering algorithm to these data and trained a logistic regression model using the clustered data to estimate the brushing area. This yielded an accuracy of 86.7%, showing that clustering was effective and that the toothbrush posture-based brushing area estimation proposed in this paper was effective. In conclusion, it is expected that the brushing zone estimation algorithm can be implemented as a function of a relatively low-cost toothbrush and that it can help to maintain oral health by analyzing and improving personal brushing habits.
Deep k-Means Node Clustering Based on Graph Neural Networks
http://doi.org/10.5626/JOK.2023.50.12.1153
Recently, graph node clustering techniques using graph neural networks (GNNs) have been actively studied. Notably, most of these studies use a GNN to embed each node into a low-dimensional vector and then cluster the embedding vectors using the existing clustering algorithms. However, since this approach does not consider the final goal of clustering when training the GNN, it is difficult to say that it produces optimal clustering results. Therefore, in this paper, we propose a deep k-means clustering method that iteratively trains a GNN considering the final goal of k-means clustering and performs k-means clustering on the embedding vectors generated by the trained GNN. The proposed method considers both the similarity between nodes and the loss of k-means clustering when training a GNN. Experimental results using real datasets confirmed that the proposed method improves the quality of k-means clustering results compared to the existing methods.
Measuring Anonymized Data Utility through Correlation Indicator
Yongki Hong, Gihyuk Ko, Heedong Yang, Chanho Ryu, Seung Hwan Ryu
http://doi.org/10.5626/JOK.2023.50.12.1163
As we transition into an artificial intelligence-driven society, data collection and utilization are actively progressing. Consequently, technologies and privacy models are emerging that convert original data into anonymized data while ensuring that it does not violate privacy guidelines. Notably, privacy models including k-anonymity, l-diversity, and t-closeness are actively being used. Depending on the purpose of the data, the situation, and the degree of privacy, it's crucial to choose the appropriate models and parameters. Ideally, the best scenario would be maximizing data utility while meeting privacy conditions. This process is called Privacy-Preserving Data Publishing (PPDP). To derive this ideal scenario, it is essential to consider both utility and privacy indicators. This paper introduces a new utility indicator, the Effect Size Average Cost, which can help privacy administrators efficiently create anonymized data. This indicator pertains to the correlation change between quasi-identifiers and sensitive attributes. In this study, we conducted experiments to compute and compare this indicator on tables to which k-anonymity, l-diversity, and t-closeness were applied respectively. The results identified significant differences in the Effect Size Average Costs for each case, indicating the potential of this indicator as a valid basis for determining which privacy model to adopt.
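As background for the privacy models named in this abstract, the sketch below measures the k-anonymity and distinct l-diversity of a small table. It does not implement the Effect Size Average Cost indicator, whose definition is not given in the abstract. The column names, the generalized values, and the helper function names are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, str]


def equivalence_classes(rows: Sequence[Row],
                        quasi_identifiers: Sequence[str]) -> Dict[Tuple[str, ...], List[Row]]:
    """Group rows that share the same quasi-identifier values."""
    groups: Dict[Tuple[str, ...], List[Row]] = defaultdict(list)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        groups[key].append(row)
    return dict(groups)


def k_anonymity(rows: Sequence[Row], quasi_identifiers: Sequence[str]) -> int:
    """Largest k for which the table is k-anonymous (smallest class size)."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return min(len(group) for group in classes.values())


def distinct_l_diversity(rows: Sequence[Row], quasi_identifiers: Sequence[str],
                         sensitive: str) -> int:
    """Smallest number of distinct sensitive values in any equivalence class."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return min(len({row[sensitive] for row in group}) for group in classes.values())


# Made-up anonymized table with generalized age and ZIP code.
table = [
    {"age": "20-29", "zip": "130**", "disease": "flu"},
    {"age": "20-29", "zip": "130**", "disease": "cold"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
]
k = k_anonymity(table, ["age", "zip"])
l = distinct_l_diversity(table, ["age", "zip"], "disease")
print(k, l)
```

The second equivalence class illustrates why k-anonymity alone can be insufficient: every record in it carries the same sensitive value, so the table is 2-anonymous but only 1-diverse.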
Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal
Editorial Office
- Tel. +82-2-588-9240
- Fax. +82-2-521-1352
- E-mail. chwoo@kiise.or.kr