Search : [ author: Hwan-Gue Cho ] (7)

Proposal of a Graph Based Chat Message Analysis Model for Messenger User Verification

Da-Young Lee, Hwan-Gue Cho

http://doi.org/10.5626/JOK.2022.49.9.696

As crimes and accidents involving messengers increase, the need to verify messenger users is growing. In this study, we propose two graph-based messenger user verification models that apply the traditional author verification problem to chat text. First, the graph random walk model builds an n-gram transition graph from previous chat messages and verifies the user by learning how a message of unknown authorship traverses that graph. It achieved an accuracy of 86% on 10,000 chat conversations. Second, the graph volume model verifies the user from the way the size of the transition graph grows over time, and it achieved an accuracy of 87% on 1,000 chat conversations. When the density of chat messages was calculated from their transmission times, both graph models guaranteed more than 80% accuracy at a chat density of 15 or more.
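
The n-gram transition graph mentioned in this abstract can be illustrated with a minimal sketch. It assumes character bigrams and uses a simple coverage score (the fraction of a new message's transitions already present in the graph) as a stand-in for the learned random-walk features; the function names, the choice of n, and the score are illustrative assumptions, not the authors' model.

```python
from collections import defaultdict


def char_ngrams(text, n=2):
    """Return the sequence of character n-grams in text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_transition_graph(messages, n=2):
    """Count transitions between consecutive n-grams over all messages."""
    graph = defaultdict(lambda: defaultdict(int))
    for msg in messages:
        grams = char_ngrams(msg, n)
        for a, b in zip(grams, grams[1:]):
            graph[a][b] += 1
    return graph


def walk_coverage(graph, message, n=2):
    """Fraction of the message's n-gram transitions found in the graph.

    Hypothetical score: a high value means the message follows paths
    that the user's earlier messages already walked.
    """
    grams = char_ngrams(message, n)
    steps = list(zip(grams, grams[1:]))
    if not steps:
        return 0.0
    hits = sum(1 for a, b in steps if b in graph.get(a, {}))
    return hits / len(steps)


history = ["hello there", "hello again"]
graph = build_transition_graph(history)
known_score = walk_coverage(graph, "hello")
unknown_score = walk_coverage(graph, "zzzz")
```

A real verifier would learn a decision rule from such traversal statistics rather than compare a single ratio.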

A Cross-Texting Prevention System Using Syntactic Characteristics of Chat Messages

Da-Young Lee, Hwan-Gue Cho

http://doi.org/10.5626/JOK.2021.48.6.639

Cross-texting refers to accidentally sending a message to an unintended person. It occurs frequently when users chat with multiple counterparts at the same time. Messengers typically provide a function for canceling a sent message, but this is only an after-the-fact remedy, and users find it difficult to prevent mistakes in advance. In this paper, we propose a cross-texting detection model that analyzes the syntactic characteristics of chat sentences. It models the previous chat messages of a specific user by extracting honorific and completeness features from them, and it detects cross-texting by determining whether a target sentence is consistent with that user's chat message model. This approach is significant because, by modeling the consistency of the user's chat attitude, it solves the cross-texting detection problem with syntactic characteristics alone, without semantic analysis. The proposed model detects cross-texting cases with an accuracy of 85.5% on data generated automatically from a real messenger dialogue corpus.

Design of Photovoltaic Power Generation Prediction Model with Recurrent Neural Network

Hanho Kim, Haesung Tak, Hwan-gue Cho

http://doi.org/10.5626/JOK.2019.46.6.506

The Smart Grid predicts the amount of power generated from renewable energy and enables efficient power generation and consumption. Existing studies on PV power generation prediction have rarely applied and compared recurrent neural network techniques, which are well suited to time-series data. Furthermore, the reported studies do not consider the length of the past data used for learning, which lowers the prediction performance of the model. In this study, we used embedded variable selection techniques to find the factors that influence PV power generation. We then carried out experiments that fed past data of various lengths into recurrent neural networks (RNN, LSTM, GRU). From the outcomes of these experiments, we identified the optimal prediction factors and designed a prediction model. The designed PV power generation prediction model shows better prediction performance than other factor settings. In addition, it achieves a better prediction rate than existing studies.
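
The effect of past data length can be made concrete with a small sketch of how a time series might be cut into training pairs for a recurrent model. The helper name `make_windows` and the sample values are hypothetical; the abstract does not state how the authors prepared their inputs.

```python
def make_windows(series, past_len):
    """Split a series into (past window, next value) training pairs.

    past_len is the number of past observations the model sees
    before predicting the next one.
    """
    pairs = []
    for i in range(len(series) - past_len):
        pairs.append((series[i:i + past_len], series[i + past_len]))
    return pairs


# Hypothetical hourly generation readings.
readings = [1, 2, 3, 4, 5]
short_history = make_windows(readings, 2)
long_history = make_windows(readings, 3)
```

A longer window gives the model more context per example but yields fewer examples from the same series, which is one reason the window length is worth tuning experimentally.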

Keyword Network Visualization for Text Summarization and Comparative Analysis

Kyeong-rim Kim, Da-yeong Lee, Hwan-Gue Cho

http://doi.org/

Most of the information on the Internet is textual. One of the main problems in the large-scale document analysis required in the “big data” era is therefore the development of systems that understand textual data automatically, and automated keyword extraction for text summarization and abstraction is a typical research problem. However, a simple list of a few keywords is insufficient to reveal the complex semantic structure of general text. In this paper, we develop a text-visualization method that constructs a graph by computing the degree of relation among keywords selected from the target text. Two models are proposed for computing these relation degrees, which define the edges: the influence-interval model and the word-distance model. The graph visualized from the keyword-derived edge relations is more flexible and useful for displaying the meaning structure of the target text, and this abstract graph enables fast and easy understanding of the text. Our experiment showed that the proposed abstract-graph model is superior to a keyword list for gaining a semantic and comparative understanding of text.

Detecting Road Intersections using Partially Similar Trajectories of Moving Objects

Bokuk Park, Jinkwan Park, Taeyong Kim, Hwan-Gue Cho

http://doi.org/

Automated road map generation has become a significant research challenge now that GPS-based navigation systems are common in most vehicles. This paper proposes an automated method for detecting intersection points from GPS vehicle trajectory data without any background digital map information. The proposed method exploits the fact that trajectories generally split into several branches at an intersection point. One problem with previous work on intersection detection is that those approaches require stopping points and direction changes for every test vehicle. Our approach, however, does not require such complex auxiliary information. Our method is based on partial trajectory matching, since a set of incoming trajectories splits into separate trajectory cluster branches at the intersection point. We tested our method on a real GPS data set with 1266 vehicles in Gangnam District, Seoul. Our experiment showed that the proposed method works well at some of the larger intersection points in Gangnam. Our system scored 75% sensitivity and 78% specificity on the test data. We believe that more GPS trajectory data would make our system more reliable and applicable in practice.

An Efficient Clustering Algorithm for Massive GPS Trajectory Data

Taeyong Kim, Bokuk Park, Jinkwan Park, Hwan-Gue Cho

http://doi.org/

Digital road map generation is primarily based on satellite imagery or on-site manual survey work. These map generation procedures therefore require a lot of time and a large budget to create and update road maps. Consequently, people have tried to develop automated map generation systems using GPS trajectory data sets obtained from public vehicles. A fundamental problem in this road generation procedure is the extraction of representative trajectories such as main roads. Extracting a representative trajectory requires a base data set of piecewise line segments (GPS trajectories) that have close starting and ending points. Geometrically similar trajectories are therefore clustered before one representative trajectory is extracted from among them. This paper proposes a new divide-and-conquer approach that partitions the whole map region into regular grid sub-spaces. We then find similar trajectories by sweeping. We also apply the Fréchet distance measure to compute the similarity between a pair of trajectories. We conducted experiments using a set of real GPS data with more than 500 vehicle trajectories obtained from Gangnam-gu, Seoul. The experiments show that our grid partitioning approach is fast and stable and can be used in real applications for vehicle trajectory clustering.
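
The two ingredients named in this abstract, grid partitioning and the Fréchet distance, can be sketched as follows. This is an illustration under assumptions: it uses the discrete Fréchet distance over the sample points (the abstract does not say which variant the authors used), and the helper names and cell size are hypothetical.

```python
import math


def grid_cell(point, cell_size):
    """Map an (x, y) point to the index of its regular grid cell."""
    return (int(point[0] // cell_size), int(point[1] // cell_size))


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two polylines of (x, y) points."""
    n, m = len(p), len(q)
    # ca[i][j] holds the distance for the prefixes p[:i+1] and q[:j+1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]


# Two parallel trajectories one unit apart.
track_a = [(0, 0), (1, 0), (2, 0)]
track_b = [(0, 1), (1, 1), (2, 1)]
distance = discrete_frechet(track_a, track_b)
cell = grid_cell((2.5, 7.1), 2.0)
```

Restricting pairwise comparisons to trajectories that share grid cells avoids computing the quadratic-time distance for every pair in the data set, which is the usual motivation for a grid-based divide-and-conquer step.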

Multi-Level Sequence Alignment: An Adaptive Control Method Between Speed and Accuracy for Document Comparison

Jong-kyu Seo, Haesung Tak, Hwan-Gue Cho

http://doi.org/

Fingerprinting and sequence alignment are well-known approaches to document similarity comparison. Fingerprinting is simple and fast, but it cannot locate the specific regions that are similar. String alignment identifies regions of similarity by aligning the sequences of strings; it can locate specific similar regions but takes more computing time. Multi-Level Alignment (MLA) is a new method designed to combine the advantages of both. MLA divides the input documents into uniform-length blocks, extracts fingerprints from each block, and calculates the similarity of block pairs by comparing their fingerprints, producing a similarity table. Finally, sequence alignment is used to identify the longest similar regions in the similarity table. MLA lets users change the block size to control the balance between the fingerprinting algorithm and sequence alignment. Because a document is divided into several blocks, similar regions are also fragmented across two or more blocks. To solve this fragmentation problem, we propose a united block method. Experiments show that computing document similarity with united blocks is more accurate than the original MLA method, with a minor time cost.
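
The first stage described in this abstract (uniform blocks, per-block fingerprints, and a block-pair similarity table) can be sketched as below. The sketch assumes character k-gram sets as fingerprints and Jaccard similarity as the comparison; both are illustrative choices, since the abstract does not specify them, and the final sequence alignment over the table and the united block method are not shown.

```python
def split_blocks(text, block_size):
    """Divide text into consecutive blocks of block_size characters."""
    return [text[i:i + block_size] for i in range(0, len(text), block_size)]


def fingerprints(block, k=3):
    """Fingerprint a block as its set of character k-grams (an assumption)."""
    return {block[i:i + k] for i in range(len(block) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_table(doc_a, doc_b, block_size=8, k=3):
    """Table of fingerprint similarities for every pair of blocks."""
    fa = [fingerprints(b, k) for b in split_blocks(doc_a, block_size)]
    fb = [fingerprints(b, k) for b in split_blocks(doc_b, block_size)]
    return [[jaccard(x, y) for y in fb] for x in fa]


table = similarity_table("abcdefgh12345678", "abcdefghzzzzzzzz", block_size=8)
```

Here the first blocks of both documents are identical and the second blocks share nothing, so only the top-left entry of the table is high. A smaller `block_size` gives a finer table and more work for the alignment stage, while a larger one leans on the cheaper fingerprint comparison.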


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr