Journal of KIISE

Search: [keyword: clustering] (25)

Prediction of Toothbrushing Position Based on Gyro Sensor Data and its Validation Using Unsupervised Learning-based Clustering

DoYoon Kim, MinWook Kwon, SeungJu Baek, HyeRin Yoon, DaeYeon Lim, Eunah Jo, Seungjae Ryu, Young Wook Kim, Jin Hyun Kim

http://doi.org/10.5626/JOK.2023.50.12.1143

Oral health is an important health indicator that is directly related to longevity. For this reason, oral health has become a key component of public health, from infants to the elderly. The foundation of good oral health is good brushing habits. However, the recommended brushing method is not easy to adopt, and this harms oral health. This paper proposes a method to distinguish brushing zones using low-cost IMU sensors so that correct brushing can be tracked, and evaluates the accuracy of the brushing zone estimation using machine-learning clustering algorithms. Specifically, we propose a method for determining the brushing area from toothbrush posture alone, using only the gyro sensor of an IMU. We show that relatively inexpensive 6-axis IMU gyro sensor data can be used to estimate the user's brushing area with an accuracy of 80.6%. In addition, we applied a clustering algorithm to these data and trained a logistic regression model on the clustered data to estimate the brushing area. This yielded an accuracy of 86.7%, showing that both the clustering and the proposed toothbrush posture-based brushing area estimation were effective. In conclusion, we expect that the brushing zone estimation algorithm can be implemented as a function of a relatively low-cost toothbrush and that it can help maintain oral health by analyzing and improving personal brushing habits.

Deep k-Means Node Clustering Based on Graph Neural Networks

Hyesoo Shin, Ki Yong Lee

http://doi.org/10.5626/JOK.2023.50.12.1153

Recently, graph node clustering techniques using graph neural networks (GNNs) have been actively studied. Notably, most of these studies use a GNN to embed each node into a low-dimensional vector and then cluster the embedding vectors using the existing clustering algorithms. However, since this approach does not consider the final goal of clustering when training the GNN, it is difficult to say that it produces optimal clustering results. Therefore, in this paper, we propose a deep k-means clustering method that iteratively trains a GNN considering the final goal of k-means clustering and performs k-means clustering on the embedding vectors generated by the trained GNN. The proposed method considers both the similarity between nodes and the loss of k-means clustering when training a GNN. Experimental results using real datasets confirmed that the proposed method improves the quality of k-means clustering results compared to the existing methods.
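The idea of training with both a node-similarity term and a k-means term can be illustrated with a minimal sketch. The abstract does not give the exact loss, so the forms below are assumptions: the similarity term penalizes distance between embeddings of connected nodes, the k-means term is the usual sum of squared distances to the nearest centroid, and `weight` is a hypothetical balancing parameter. The GNN itself is omitted; `embeddings` stands in for its output.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_loss(embeddings, centroids):
    """Sum of squared distances from each embedding to its nearest centroid."""
    return sum(min(sq_dist(z, c) for c in centroids) for z in embeddings)

def similarity_loss(embeddings, edges):
    """Assumed form: connected nodes should have nearby embeddings."""
    return sum(sq_dist(embeddings[i], embeddings[j]) for i, j in edges)

def joint_loss(embeddings, centroids, edges, weight=1.0):
    """Combined objective; `weight` is a hypothetical trade-off parameter."""
    return similarity_loss(embeddings, edges) + weight * kmeans_loss(embeddings, centroids)

# Toy data: four nodes, two edges, two centroids (all values illustrative).
embeddings = [[0.0, 0.0], [0.0, 1.0], [4.0, 4.0], [5.0, 4.0]]
centroids = [[0.0, 0.5], [4.5, 4.0]]
edges = [(0, 1), (2, 3)]
total = joint_loss(embeddings, centroids, edges, weight=0.5)
```

In an actual training loop, a loss of this kind would be minimized with respect to the GNN parameters, alternating with centroid updates; that loop is outside the scope of this sketch.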

A Time-Course Multi-Clustering Method for Single-Cell Trajectory Inference

Jaeyeon Jang, Inuk Jung

http://doi.org/10.5626/JOK.2022.49.10.838

From time-series single-cell transcriptome data, gene expression information can be generated to observe the timing of significant cell differentiation changes while accounting for important biological phenomena in relation to experimental conditions. Due to the recent surge of time-series single-cell transcriptome data, studies on various dynamic variations in cells, such as the cell cycle and cell differentiation, have been actively conducted. In particular, time-series analysis of cell differentiation at the single-cell level is advantageous for biological interpretation compared to analysis at a single time point, because changes along the time axis can be observed. In this paper, we propose a multi-clustering method that infers cell trajectories by considering time information at the genetic level of time-series single-cell transcriptome data. Analysis of gene expression data on human neuron cell differentiation using this method showed results similar to the biological findings of a previous study.

SVD-based Cross-Domain Recommendation Using K-means Clustering

Tae-Hoon Kim, Sung Kwon Kim

http://doi.org/10.5626/JOK.2022.49.5.360

Cross-domain recommendation is a method that shares related user information and item data across different domains. It is mainly used in online shopping malls with many users or in multimedia content services such as YouTube or Netflix. In this study, K-means clustering is performed on user data and ratings to create embeddings. After the result is learned by a multi-layer neural network, user satisfaction is predicted. Then, items suitable for the user are recommended using matrix factorization, a collaborative filtering technique. Experiments showed that this approach can make predictions under the cold-start problem at a lower time cost and increase user satisfaction.

Ensemble of Sentence Interaction and Graph Based Models for Document Pair Similarity Estimation

Seonghwan Choi, Donghyun Son, Hochang Lee

http://doi.org/10.5626/JOK.2021.48.11.1184

Deriving the similarity between two documents, such as news articles, is one of the most important factors in clustering documents. Sequence similarity models, one of the existing deep-learning-based approaches to document clustering, do not reflect the entire context of documents. To address this issue, this paper uses interaction-based and graph-based approaches to construct document pair similarity models suitable for news clustering. This paper proposes four interaction-based models that measure the similarity between two documents by aggregating similarity information from the interactions between sentences. The experimental results demonstrated that two of these four proposed models outperformed SVM and HAN. Ablation studies were conducted on the graph-based model through experiments on the depth of the model's neural network and its input features. Through error analysis and an ensemble of interaction-based and graph-based models, this paper showed that the two approaches can be complementary because of the differences in their prediction tendencies.

An Efficient Document Clustering Method using Space Transformation based on LDA and WMD

Yongdam Kim, Sungwon Jung

http://doi.org/10.5626/JOK.2021.48.9.1052

Existing TF-IDF-based document clustering methods do not properly exploit the contextual information of documents, i.e., co-occurrence and word order, and their performance tends to degrade due to the curse of dimensionality. To overcome these problems, techniques such as a weighted average of word embedding vectors or Word Mover's Distance (WMD) have been proposed. These techniques perform well at document classification, but not at document clustering, which needs to group documents. In this study, we use LDA to define a topic document, the representative document of a document group, and solve the existing problem by calculating the WMD based on the topic documents. However, since WMD requires a large amount of computation, we propose a space transformation method that achieves good performance while reducing the computation cost by mapping each document to a low-dimensional space in which each axis represents the WMD value from one topic document.
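The space transformation can be sketched as follows: each document becomes a vector with one coordinate per topic document, where the coordinate is the distance to that topic document. This is only an illustration. Computing real WMD needs word embeddings and an optimal-transport solver, so `placeholder_distance` (Jaccard distance on token sets) is a hypothetical stand-in, and the topic documents, which the paper derives with LDA, are hard-coded here.

```python
def placeholder_distance(doc_a, doc_b):
    """Stand-in for WMD (assumption): Jaccard distance between token sets."""
    a, b = set(doc_a), set(doc_b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def transform(documents, topic_documents, distance=placeholder_distance):
    """Map each document to a vector of distances, one axis per topic document."""
    return [[distance(doc, topic) for topic in topic_documents] for doc in documents]

# Hypothetical topic documents (the paper obtains these via LDA).
topic_documents = [["market", "stock", "price"], ["match", "team", "goal"]]
documents = [["stock", "price", "rise"], ["team", "goal", "win"]]

vectors = transform(documents, topic_documents)
# Each document now has len(topic_documents) coordinates and can be passed
# to any standard clustering algorithm in this low-dimensional space.
```

With k topic documents, only k distance computations per document are needed, instead of one per document pair.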

Incorrect Triple Detection Using Knowledge Graph Embedding and Adaptive Clustering

Won-Chul Shin, Jea-Seung Roh, Young-Tack Park

http://doi.org/10.5626/JOK.2020.47.10.958

Recently, as the amount of information has increased with the development of the Internet, research using large-scale knowledge graphs has been actively conducted. Additionally, as knowledge graphs are used for various research and services, there is a need to secure high-quality knowledge graphs. However, there is a lack of research on detecting errors within knowledge graphs to obtain such quality. Previous studies using embedding and clustering for error triple detection showed good performance. However, in the cluster optimization process, applying the same threshold to all clusters meant that the characteristics of each cluster could not be taken into account. In this paper, to resolve this problem, we propose an adaptive clustering model for error triple detection in which knowledge graph embeddings are clustered by finding and applying the optimum threshold for each cluster. To evaluate the proposed method, comparative experiments against existing error triple detection studies were conducted on three datasets, DBpedia, Freebase and WiseKB, and an average performance improvement of 5.3% in F1-score was confirmed.

An Efficient and Differentially Private K-Means Clustering Algorithm Using the Voronoi Diagram

Daeyoung Hong, Kyuseok Shim

http://doi.org/10.5626/JOK.2020.47.9.879

Studies have recently been conducted on preventing the leakage of personal information from data analysis results. Among them, differential privacy is a widely studied standard since it guarantees rigorous and provable privacy preservation. In this paper, we propose an algorithm based on the Voronoi diagram to publish the results of K-means clustering for 2D data while guaranteeing differential privacy. Existing algorithms have the disadvantage that it is difficult to select the number of samples for the data, since the running time and the accuracy of the clustering results may change according to the number of samples. The proposed algorithm, however, can quickly provide an accurate clustering result without requiring such a parameter. We also demonstrate the performance of the proposed algorithm through experiments using real-life data.

Cascading Behavior and Information Diffusion in Overlapping Clusters

Woojung Lee, Joyce Jiyoung Whang

http://doi.org/10.5626/JOK.2020.47.4.422

Information diffusion models formulate and explain cascading behavior in networks where a small set of initial adopters is assumed to acquire new information and the new information is propagated to the other nodes in the network. Most existing information diffusion models assume that a node in a network belongs to only one cluster, and based on this assumption, it has been shown that clusters are obstacles to cascades. However, in many real-world networks, a node can belong to multiple clusters, i.e., clusters can overlap. In this paper, we study cascading behavior in a network when clusters overlap. We show that clusters are not obstacles to cascades if the initial adopters are placed in the overlapped region between the clusters or if we allow compatibility. We verify our theorems and models on four real-world datasets.
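As background for the claim that clusters can block cascades, here is a minimal simulation sketch. It assumes the standard threshold model, in which a node adopts once at least a fraction `q` of its neighbors have adopted; the abstract does not spell out the model, so this choice, the toy graph, and the value of `q` are all illustrative assumptions. The example covers only the non-overlapping baseline, not the paper's overlapping-cluster results.

```python
def run_cascade(neighbors, initial_adopters, q):
    """Repeat until stable: a node adopts when >= q of its neighbors have adopted."""
    adopted = set(initial_adopters)
    changed = True
    while changed:
        changed = False
        for node, nbrs in neighbors.items():
            if node in adopted or not nbrs:
                continue
            fraction = sum(1 for n in nbrs if n in adopted) / len(nbrs)
            if fraction >= q:
                adopted.add(node)
                changed = True
    return adopted

# Two triangles (clusters {0, 1, 2} and {3, 4, 5}) joined by the single edge 2-3.
neighbors = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2, 4, 5],
    4: [3, 5],
    5: [3, 4],
}

final = run_cascade(neighbors, initial_adopters={0, 1}, q=0.5)
# Node 2 adopts (2 of 3 neighbors), but node 3 sees only 1 of 3 adopting
# neighbors, so the cascade stops at the boundary of the second cluster.
```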

Anomaly Detection Analysis using Repository based on Inverted Index

Jumi Park, Weduke Cho, Kangseok Kim

http://doi.org/10.5626/JOK.2018.45.3.294

With the emergence of new service industries driven by the development of information and communication technology, cyberspace risks such as personal information infringement and leakage of industrial secrets have diversified, and security has emerged as a critical issue. In this paper, we propose a behavior-based anomaly detection method suited to real-time, large-volume data analysis. We show that, on large-capacity user log data related to in-company personal information abuse and internal information leakage, the proposed detection method is superior to existing signature-based security countermeasures. Because the proposed method requires a technique for processing large amounts of data, we use Elasticsearch, a real-time search engine based on an inverted index. In addition, statistical frequency analysis and preprocessing were performed for data analysis, and the DBSCAN algorithm, a density-based clustering method, was applied to classify abnormal data, with an example visualized for easy analysis. Unlike existing anomaly detection systems, the proposed behavior-based technique, which was designed from a statistical perspective, is promising because it enables anomaly detection analysis without the need to set a threshold value separately.
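How DBSCAN separates abnormal points from dense normal behavior can be shown with a compact sketch. This is a plain textbook DBSCAN, not the authors' pipeline: the points stand in for hypothetical per-user frequency features, and `eps` and `min_pts` are illustrative values that do not come from the paper. Points that end up labeled `NOISE` are the anomaly candidates.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, p in enumerate(points) if math.dist(points[i], p) <= eps]

def dbscan(points, eps, min_pts):
    """Return one cluster id per point; NOISE marks points in no dense region."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # core point: keep expanding
                queue.extend(j_neighbors)
        cluster += 1
    return labels

# Hypothetical 2D features: four similar users and one far-away outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
labels = dbscan(points, eps=1.5, min_pts=3)
anomalies = [p for p, label in zip(points, labels) if label == NOISE]
```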

Journal of KIISE

ISSN : 2383-630X(Print)
ISSN : 2383-6296(Electronic)
KCI Accredited Journal

Editorial Office

Tel. +82-2-588-9240
Fax. +82-2-521-1352
E-mail. chwoo@kiise.or.kr