Digital Library[ Search Result ]
An Automatic Method of Generating a Large-Scale Train Set for Bi-LSTM based Sentiment Analysis
http://doi.org/10.5626/JOK.2019.46.8.800
Sentiment analysis using deep learning requires a large-scale training set labeled with sentiment. However, manual sentiment labeling is constrained by time and cost, and it is not easy to collect the data required for sentiment analysis from a large pool of data. To address these problems, the present work uses an existing sentiment lexicon to assign sentiment scores; when a sentiment transformation element is present, the score is reset through dependency parsing and morphological analysis, so that a large-scale training set labeled with sentiment is generated automatically. The top-k data with the highest sentiment scores are then extracted. Sentiment transformation elements include sentiment reversal, sentiment activation, and sentiment deactivation. Our experimental results show that a large-scale training set can be generated in less time than manual labeling, and that deep learning performance improves as the training set grows. The accuracy of the model using only the sentiment lexicon was 80.17%, and the accuracy of the proposed model, which includes natural language processing technology, was 89.17%, an improvement of 9 percentage points.
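The labeling idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy lexicon, the cue-word lists, and the rule that a cue affects the next sentiment word are all assumptions made for illustration, standing in for the dependency parsing and morphological analysis the abstract mentions. Sentiment activation is left out because the abstract does not define it.

```python
import heapq

# Hypothetical toy lexicon and cue words (assumptions, not the paper's resources).
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
REVERSAL = {"not", "never"}      # assumed: flips the polarity of the next sentiment word
DEACTIVATION = {"if", "wish"}    # assumed: cancels the next sentiment word


def score(tokens):
    """Sum lexicon scores, resetting a word's score when a cue precedes it."""
    total, flip, mute = 0.0, False, False
    for tok in tokens:
        if tok in REVERSAL:
            flip = True
        elif tok in DEACTIVATION:
            mute = True
        elif tok in LEXICON:
            s = LEXICON[tok]
            if mute:
                s = 0.0          # deactivated: contributes nothing
            elif flip:
                s = -s           # reversed: polarity flipped
            total += s
            flip = mute = False  # a cue applies to one sentiment word only
    return total


def top_k(sentences, k):
    """Return the k (score, sentence) pairs with the largest absolute score."""
    scored = [(score(s.lower().split()), s) for s in sentences]
    return heapq.nlargest(k, scored, key=lambda pair: abs(pair[0]))
```

Ranking by absolute score keeps strongly positive and strongly negative sentences alike, which is one plausible reading of "high sentiment score"; the abstract does not say how the ranking is done.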
A Robust Three-Factor User Authentication Scheme based on Elliptic Curve Cryptography and Fuzzy Extractor
Trung Thanh Ngo, Tae-Young Choe
http://doi.org/10.5626/JOK.2019.46.6.587
Three-factor user authentication is appropriate when a high degree of assurance is required. Fan and Lin proposed a typical three-factor authentication scheme, which requires a token, a password, and a fingerprint. The scheme does not allow authentication in the absence of any of the three factors. Unfortunately, Fan and Lin's scheme is vulnerable to insider attacks, stolen-verifier attacks, and message modification attacks. Yeh et al. proposed a three-factor user authentication scheme that overcomes these pitfalls and improves security and performance using elliptic curve cryptography. We found that Yeh et al.'s scheme is still vulnerable to user impersonation attacks and server masquerading attacks. We propose a robust three-factor authentication scheme entailing server smart cards, elliptic curve cryptography, and a fuzzy extractor that addresses the foregoing flaws and enhances security. The proposed scheme is resistant to various attacks and improves system performance. BAN logic is used to prove that the scheme establishes a secure channel.
Mobile Gamer Categorization with Archetypal Analysis and Cognitive-Psychological Features from Log Data
Jihoon Jeon, Dumim Yoon, Seongil Yang, Kyungjoong Kim
http://doi.org/10.5626/JOK.2018.45.3.234
Classifying gamer types and analyzing the characteristics of gamers are topics of interest for data analysis researchers, and much research has been done on gamer categorization and gamer analysis. However, most studies use surveys or bio-signals, which is not practical because large amounts of such data are difficult to obtain. Even when game logs are used, it is difficult to analyze the psychology of the gamer, because gamers are categorized and analyzed by extracting only statistical values. If cognitive-psychological information about the gamer can be extracted from the basic game log, the gamer can be analyzed more intuitively and easily. In this paper, we extracted eight cognitive-psychological features representing the behavior and psychological information of gamers from the game log of Crazy Dragon, a mobile role-playing game (RPG). In addition, we classified gamers based on these eight features and used the same features to analyze them. As a result, most gamers were highly correlated with one or two types.
Automated Modelling of Ontology Schema for Media Classification
Nam-Gee Lee, Hyun-Kyu Park, Young-Tack Park
With the growth of personal media through channels such as UCC and SNS, many studies on media analysis and recognition have been carried out, improving the level of object recognition. These studies focus on classifying media based on the recognized objects rather than on the title, tag, and script information. The media-classification task, however, consumes considerable time and energy because human experts need to model the underlying media ontology. This paper therefore proposes an automated approach to modeling the media-classification ontology schema: OWL-DL axioms are derived from the frequency of the objects recognized in the media, and the automation of the ontology modeling is described. The authors conducted media-classification experiments across 15 YouTube video categories and measured the classification accuracy obtained with the automated ontology-modeling approach. The experimental results are promising: 1500 actions from 15 media events were classified with 86% accuracy.
Sequence-to-sequence based Morphological Analysis and Part-Of-Speech Tagging for Korean Language with Convolutional Features
Jianri Li, EuiHyeon Lee, Jong-Hyeok Lee
Traditional Korean morphological analysis and POS tagging methods usually consist of two steps: (1) generate hypotheses of all possible combinations of morphemes for the given input, and (2) perform POS tagging to search for the optimal result. These methods require additional resource dictionaries, and errors in the first step can propagate to the second step. In this paper, we tried to solve this problem in an end-to-end fashion using a sequence-to-sequence model with convolutional features. Experimental results on the Sejong corpus show that our approach achieved a 97.15% F1-score at the morpheme level, and 95.33% and 60.62% precision at the word and sentence levels, respectively; [...] a 96.91% F1-score at the morpheme level, and 95.40% and 60.62% precision at the word and sentence levels, respectively.
Automatic Correction of Errors in Annotated Corpus Using Kernel Ripple-Down Rules
Annotated corpora are important for natural language understanding with machine learning methods. In this paper, we propose a new method for automating error reduction in annotated corpora. We use Ripple-Down Rules (RDR) to reduce errors and a kernel to extend RDR for NLP. We applied our system to errors in Korean Wikipedia and blog corpora to identify the types of errors in the annotated corpora. Experimental results from various perspectives on the Korean Wikipedia and blog corpora are reported to evaluate the effectiveness and efficiency of the proposed approach. The proposed approach can be used to reduce errors in large corpora.
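For readers unfamiliar with Ripple-Down Rules, the following Python sketch shows the basic single-classification RDR structure: each rule has an exception branch (followed when the rule fires) and an alternative branch (followed when it does not), and the last rule that fired gives the conclusion. The example rules, the case fields, and the tag names are hypothetical, and the kernel extension described in the abstract is not modeled here.

```python
class Rule:
    """One node of a single-classification Ripple-Down Rules tree."""

    def __init__(self, cond, conclusion):
        self.cond = cond            # predicate over a case (a dict here)
        self.conclusion = conclusion
        self.except_rule = None     # tried next when this rule fires
        self.else_rule = None       # tried next when this rule does not fire


def classify(rule, case, default=None):
    """Return the conclusion of the last rule that fired along the path."""
    result = default
    while rule is not None:
        if rule.cond(case):
            result = rule.conclusion
            rule = rule.except_rule
        else:
            rule = rule.else_rule
    return result


# Hypothetical rules: keep the annotated tag by default, but correct a
# capitalized token tagged as a common noun (NNG) to a proper noun (NNP).
root = Rule(lambda c: True, "keep")
root.except_rule = Rule(
    lambda c: c["tag"] == "NNG" and c["token"][:1].isupper(), "NNP"
)
```

A new correction is added as an exception to the rule that produced the wrong conclusion, so earlier behavior on other cases is left unchanged.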
Probabilistic Segmentation and Tagging of Unknown Words
Processing unknown words, such as proper nouns and newly coined words, is important for a morphological analyzer that handles documents in various domains. In this study, a segmentation and tagging method for unknown Korean words is proposed for 3-step probabilistic morphological analysis. To guess unknown words, it uses the rich set of suffixes that attach to open-class words such as general nouns and proper nouns. We propose a method that learns suffix patterns from a morpheme-tagged corpus and calculates their probabilities for segmenting and tagging unknown open-class words in the probabilistic morphological analysis model. Experimental results showed that unknown-word processing performance improves greatly on documents containing many unregistered words.
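The suffix-pattern idea can be sketched in Python as follows. This is only an illustration under assumptions: the training triples are invented, the probability is a simple relative frequency of the stem tag given the suffix, and the longest-suffix-first lookup is a simplification rather than the paper's 3-step probabilistic model.

```python
from collections import Counter, defaultdict

# Hypothetical training triples: (open-class stem, stem tag, attached suffix).
TRAIN = [
    ("서울", "NNP", "에서"),
    ("학교", "NNG", "에서"),
    ("부산", "NNP", "에서"),
    ("책", "NNG", "을"),
]


def learn(triples):
    """Estimate P(stem tag | suffix) by relative frequency."""
    counts = defaultdict(Counter)
    for _stem, tag, suffix in triples:
        counts[suffix][tag] += 1
    return {
        suffix: {tag: n / sum(tags.values()) for tag, n in tags.items()}
        for suffix, tags in counts.items()
    }


def guess(word, model):
    """Split an unknown word at the longest known suffix and tag the stem."""
    for i in range(1, len(word)):   # keep at least one character for the stem
        suffix = word[i:]
        if suffix in model:
            tag, prob = max(model[suffix].items(), key=lambda kv: kv[1])
            return word[:i], tag, suffix, prob
    return word, None, "", 0.0      # no known suffix: leave the word unsplit
```

With this toy data, `guess("대전에서", learn(TRAIN))` splits the word into the stem "대전" and the suffix "에서" and tags the stem NNP.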
Syllable-based Korean POS Tagging Based on Combining a Pre-analyzed Dictionary with Machine Learning
Chung-Hee Lee, Joon-Ho Lim, Soojong Lim, Hyun-Ki Kim
This study is directed toward the design of a hybrid algorithm for syllable-based Korean POS tagging. Previous syllable-based works on Korean POS tagging have relied on a sequence labeling method and mostly used only a machine learning method. We present a new algorithm integrating a machine learning method and a pre-analyzed dictionary. We used a Sejong tagged corpus for training and evaluation. While the machine learning engine achieved eojeol precision of 0.964, the proposed hybrid engine achieved eojeol precision of 0.990. In a Quiz domain test, the machine learning engine and the proposed hybrid engine obtained 0.961 and 0.972, respectively. This result indicates our method to be effective for Korean POS tagging.
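One plausible way to combine a pre-analyzed dictionary with a machine learning tagger is a lookup with fallback, sketched below in Python. The abstract does not say how the two components are integrated, so the dictionary-first order, the dictionary entry, and the `ml_tagger` stand-in are all assumptions.

```python
# Hypothetical pre-analyzed dictionary: eojeol -> list of (morpheme, tag).
PRE_ANALYZED = {
    "나는": [("나", "NP"), ("는", "JX")],
}


def ml_tagger(eojeol):
    """Stand-in for a trained syllable-based sequence labeler (assumption)."""
    return [(eojeol, "NNG")]


def hybrid_tag(sentence):
    """Use the dictionary analysis when available, else fall back to the tagger."""
    result = []
    for eojeol in sentence.split():
        analysis = PRE_ANALYZED.get(eojeol)
        if analysis is None:
            analysis = ml_tagger(eojeol)
        result.append((eojeol, analysis))
    return result
```

In this arrangement, frequent eojeols whose analyses are already known never reach the statistical model.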
Ontology Modeling and Rule-based Reasoning for Automatic Classification of Personal Media
Hyun-Kyu Park, Chi-Seung So, Young-Tack Park
Recently, as smart devices have become widespread, personal media have been produced in a variety of ways, and services that use these data are in demand. Research on media analysis and recognition technology has therefore been actively conducted, making it possible to recognize meaningful objects in media. Existing systems based on a media ontology rely on the video title, tags, and script information, and therefore cannot classify media by what actually appears in the video. In this paper, we propose a system that automatically classifies video using the objects shown in the media data. To do this, we use description logic-based reasoning together with rule-based inference for processing events, whose order may vary. The description logic-based reasoning system represents the relations among objects in the media as an activity ontology. We also describe how a second, rule-based reasoning system defines an event according to the order of the inferred activities and automatically classifies the event into the appropriate category. To evaluate the proposed approach, we conducted an experiment using media data that were classified into valid categories through an analysis of YouTube videos.
Error Correction in Korean Morpheme Recovery using Deep Learning
Korean morphological analysis is a difficult process. Because Korean is an agglutinative language, morpheme recovery is one of the most important steps in morphological analysis. Methods based on heuristic rules and pre-analyzed partial words have been examined for this step, but their performance is limited because they do not use contextual information. In this study, we built a Korean morpheme recovery system using deep learning; the system uses word embeddings to exploit contextual information. In recovering the morphemes ‘들/VV’ and ‘듣/VV’, the system achieved 97.97% accuracy, outperforming a support vector machine (SVM), which achieved 96.22%.
Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal
Editorial Office
- Tel. +82-2-588-9240
- Fax. +82-2-521-1352
- E-mail. chwoo@kiise.or.kr