Journal of KIISE

Search : [ author: Jeong-Won Cha ] (5)

Relation Extraction based on Neural-Symbolic Structure

Deep learning continues to demonstrate excellent performance in natural language processing. However, it requires enormous amounts of training data and long training times to perform well. Herein, we propose a neural-symbolic method for relation extraction that outperforms deep learning when little training data is available. We designed a structure that exploits the inconsistency between rule results and deep learning results. In addition, we propose logical rule filtering to improve convergence speed and add context to improve the performance of the rules. The proposed method performed well with a small amount of training data, and we confirmed that its performance converged quickly.

Image Caption Generation using Object Attention Mechanism

Da-Sol Park, Jeong-Won Cha

http://doi.org/10.5626/JOK.2019.46.4.369

The explosive growth of image data has prompted studies of image caption generation, which expresses images in natural language. Current technologies for generating Korean image captions produce errors associated with object concurrence, which are attributed to datasets translated from English. In this paper, we propose an image caption generation model that employs attention as a new loss function, using nouns extracted from the images' reference captions. The proposed method achieved BLEU1 0.686, BLEU2 0.557, BLEU3 0.456, and BLEU4 0.372, indicating that it helps resolve high-frequency word-pair errors. We also show that it improves on previous studies and reduces redundancy in the generated sentences. As a result, the proposed method can be used to generate a caption corpus effectively.
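
For readers unfamiliar with the BLEU1 to BLEU4 figures quoted above, the following is a minimal sketch of cumulative BLEU-N with uniform weights, clipped n-gram precision, and a brevity penalty. It is not the authors' evaluation script: the abstract does not state the tokenization, number of references, or smoothing used, and the function names and example sentences here are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it appears in any single reference."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n):
    """Cumulative BLEU-N without smoothing (returns 0.0 if any precision is 0)."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    # Use the reference whose length is closest to the candidate length.
    ref_len = min((abs(len(r) - len(candidate)), len(r)) for r in references)[1]
    brevity = 1.0 if len(candidate) > ref_len else math.exp(1 - ref_len / len(candidate))
    return brevity * math.exp(sum(math.log(p) for p in precisions) / max_n)


# Hypothetical example: a degenerate caption that repeats one frequent word.
reference = "the cat is on the mat".split()
repeated = "the the the the the the the".split()
print(bleu(repeated, [reference], 1))
```

Clipping is what keeps a caption that repeats a frequent word from scoring well, which is why the metric is sensitive to the redundancy mentioned in the abstract.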

Semi-Supervised Learning for Detecting of Abusive Sentence on Twitter using Deep Neural Network with Fuzzy Category Representation

Da-Sol Park, Jeong-Won Cha

http://doi.org/10.5626/JOK.2018.45.11.1185

The number of people suffering harm from hate speech on social network services (SNS) is increasing rapidly. In this paper, we propose a detection method that uses semi-supervised learning and a deep neural network on a large corpus to determine whether the implied meaning of a Twitter sentence is abusive, going beyond hate speech detection by comparison with a simple dictionary. Most existing methods judge whether a sentence is hate speech by comparing it with a blacklist of hate speech words. However, these methods cannot identify skillful and subtle expressions of hate speech. We therefore created a corpus of Korean Twitter sentences labeled as hate speech or not. The training corpus comprised 44,000 sentences and the test corpus comprised 13,082 sentences. For explicitly abusive sentences, the system achieved an F1 score of 86.13% with the model using a 1-layer syllable CNN and sequence vector. For implicitly abusive sentences, it achieved an F1 score of 25.53% with the model using a 1-layer syllable CNN, a 2-layer syllable CNN, and sequence vector. The proposed method can be used to detect cyber-bullying.

Assignment Semantic Category of a Word using Word Embedding and Synonyms

Da-Sol Park, Jeong-Won Cha

http://doi.org/10.5626/JOK.2017.44.9.946

Semantic role decision determines the semantic relationship between a predicate and its arguments in natural language processing (NLP) tasks. It requires both semantic role information and semantic category information. The Sejong Electronic Dictionary contains frame information that is used to determine semantic roles. In this paper, we propose a method to extend the Sejong Electronic Dictionary using word embedding and synonyms. The same experiment is performed using existing word-embedding vectors and retrofitted vectors. For words that do not appear in the Sejong Electronic Dictionary, the system performance with word embedding is 32.19% for semantic category assignment and 51.14% for extended semantic category assignment. With retrofitted vectors, the corresponding figures are 33.33% and 53.88%. We also show that assigning semantic categories to new words that have none helps extend the semantic category vocabulary of the Sejong Electronic Dictionary.
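
The abstract does not spell out how a category is chosen for a word missing from the dictionary. One plausible reading is a nearest-neighbor vote in embedding space, sketched below under that assumption. The toy vectors, the category labels, and the function names are all hypothetical and are not taken from the Sejong Electronic Dictionary or the paper.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def assign_category(word, embeddings, dictionary, k=3):
    """Assign the majority category among the k dictionary words
    whose embeddings are most similar to the target word."""
    target = embeddings[word]
    scored = sorted(
        ((cosine(target, embeddings[w]), w)
         for w in dictionary if w in embeddings and w != word),
        reverse=True,
    )
    votes = Counter(dictionary[w] for _, w in scored[:k])
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical 2-dimensional embeddings and category labels.
embeddings = {
    "apple": [0.90, 0.10], "pear": [0.80, 0.20], "banana": [0.85, 0.15],
    "car": [0.10, 0.90], "bus": [0.20, 0.80],
    "mango": [0.88, 0.12], "truck": [0.15, 0.85],
}
dictionary = {
    "apple": "FRUIT", "pear": "FRUIT", "banana": "FRUIT",
    "car": "VEHICLE", "bus": "VEHICLE",
}

print(assign_category("mango", embeddings, dictionary))
print(assign_category("truck", embeddings, dictionary))
```

Retrofitted vectors would slot into the same sketch by replacing the `embeddings` table; the voting step itself would not change.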

Automatic Correction of Errors in Annotated Corpus Using Kernel Ripple-Down Rules

Tae-Ho Park, Jeong-Won Cha

http://doi.org/

Annotated corpora are important for understanding natural language with machine learning methods. In this paper, we propose a new method to automate error reduction in annotated corpora. We use Ripple-Down Rules (RDR) to reduce errors and a kernel to extend RDR for NLP. We applied our system to Korean Wikipedia and blog corpora to identify the types of annotation errors. We report experimental results from various views on the Korean Wikipedia and blog corpora to evaluate the effectiveness and efficiency of the proposed approach. The proposed approach can be used to reduce errors in large corpora.
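
As background on the Ripple-Down Rules mentioned above, here is a minimal sketch of a single-classification RDR tree, in which each rule has an exception branch (followed when the rule fires) and an alternative branch (followed when it does not). It does not cover the kernel extension, which the abstract does not describe. The tag names, rule conditions, and class names are hypothetical English examples, not rules from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    condition: Callable[[dict], bool]
    conclusion: str
    if_true: Optional["Rule"] = None   # exception to this rule
    if_false: Optional["Rule"] = None  # alternative tried when this rule fails


def evaluate(root, case, default=None):
    """Return the conclusion of the last rule that fired along the path."""
    result = default
    node = root
    while node is not None:
        if node.condition(case):
            result = node.conclusion
            node = node.if_true
        else:
            node = node.if_false
    return result


# Hypothetical correction rules for part-of-speech annotations.
noun_after_determiner = Rule(
    condition=lambda c: c["tag"] == "VB" and c["prev_tag"] == "DT",
    conclusion="NN",
)
verb_after_to = Rule(
    condition=lambda c: c["tag"] == "NN" and c["prev_tag"] == "TO",
    conclusion="VB",
    if_false=noun_after_determiner,
)
# The root always fires and keeps the annotation unless an exception applies.
root = Rule(condition=lambda c: True, conclusion="keep", if_true=verb_after_to)

print(evaluate(root, {"word": "run", "tag": "NN", "prev_tag": "TO"}))
```

New corrections are added as exceptions at the point where an existing rule gave the wrong conclusion, so earlier behavior on other cases is left unchanged.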

Journal of KIISE

ISSN : 2383-630X(Print)
ISSN : 2383-6296(Electronic)
KCI Accredited Journal

Editorial Office

Tel. +82-2-588-9240
Fax. +82-2-521-1352
E-mail. chwoo@kiise.or.kr
