Digital Library [Search Result]
A Knowledge Graph Embedding-based Ensemble Model for Link Prediction
http://doi.org/10.5626/JOK.2020.47.5.473
Knowledge bases often suffer from limited applicability due to missing information about their entities and relations. Link prediction has been investigated to complete this missing information and make a knowledge base more useful. Existing studies on link prediction often rely on knowledge graph embedding and have shown trade-offs in their performance. In this paper, we propose an ensemble model for knowledge graph embedding to improve the quality of link prediction. The proposed model combines multiple knowledge graph embeddings, each with unique characteristics. In this way, the ensemble model is able to consider various aspects of the entries within a knowledge base and reduce the variation in accuracy caused by hyper-parameters. Our experiments show that the proposed model outperforms other knowledge graph embedding methods by 13.5% on the WN18 and FB15K datasets.
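As an illustration of score-level ensembling for link prediction, the following is a minimal sketch that averages z-normalized plausibility scores from several embedding models. The choice of TransE and DistMult as constituents is an assumption for illustration; the abstract does not name the embeddings the paper combines.

```python
import numpy as np

def transe_score(h, r, t):
    # TransE plausibility: -||h + r - t||_1, negated so that higher is
    # better, matching DistMult's convention below.
    return -np.linalg.norm(h + r - t, ord=1, axis=-1)

def distmult_score(h, r, t):
    # DistMult plausibility: trilinear product <h, r, t>.
    return np.sum(h * r * t, axis=-1)

def ensemble_rank(models, head, rel, candidates):
    """Rank candidate tail entities (a numpy index array) by the average
    of z-normalized scores from several models.

    models: list of (score_fn, entity_emb, relation_emb) triples;
    each model keeps its own embedding tables.
    """
    total = np.zeros(len(candidates))
    for score_fn, E, R in models:
        s = score_fn(E[head], R[rel], E[candidates])
        total += (s - s.mean()) / (s.std() + 1e-8)  # put models on one scale
    return candidates[np.argsort(-total)]           # most plausible first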
Single Sentence Summarization with an Event Word Attention Mechanism
Ian Jung, Su Jeong Choi, Seyoung Park
http://doi.org/10.5626/JOK.2020.47.2.155
The purpose of summarization is to generate short text that preserves the important information in the source sentences. There are two approaches to the summarization task: extractive and abstractive. The extractive approach determines whether each word in a source sentence is retained or not. The abstractive approach generates a summary of a given source sentence using neural networks such as the sequence-to-sequence model and the pointer-generator. However, these approaches can omit important information such as event words. This paper proposes an event word attention mechanism for sentence summarization. Event words carry the key meaning of a given source sentence, since they express what occurs in it. The event word attention weights are calculated from the event information of each word in the source sentence and then combined with the global attention mechanism. For evaluation, we used English and Korean datasets. Experimental results show that the model adopting event word attention outperforms the existing models.
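The following is a minimal sketch of how an event-word signal might be folded into a global (dot-product) attention step of a sequence-to-sequence decoder. The additive term `gamma * event_mask` is an assumption for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def event_attention(dec_state, enc_states, event_mask, gamma=1.0):
    """dec_state: (d,) decoder hidden state; enc_states: (n, d) encoder
    states; event_mask: (n,) with 1.0 at source positions that are event words."""
    # Global (dot-product) attention scores over source positions.
    scores = enc_states @ dec_state          # (n,)
    # Boost positions carrying event information before normalizing,
    # so event words receive more attention mass.
    scores = scores + gamma * event_mask
    attn = F.softmax(scores, dim=0)          # (n,) attention weights
    context = attn @ enc_states              # (d,) context vector
    return attn, context
```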
News Stream Summarization for an Event based on Timeline
Ian Jung, Su Jeong Choi, Seyoung Park
http://doi.org/10.5626/JOK.2019.46.11.1140
This paper explores the summarization task for news streams, which are continuously produced and sequential in nature. Timeline-based summarization is widely adopted for news streams because a timeline can represent events sequentially. However, previous work relies on the collection dates of news articles and therefore cannot cover dates outside the collection period. In addition, previous work has given little consideration to conciseness, informativeness, and coherence. To address these problems, we propose a news stream summarization model with an expanded timeline. The model builds the expanded timeline from the time points referenced in the given news articles and selects sentences that are concise, informative, and consistent with neighboring time points. First, we construct the expanded timeline by selecting dates from all time points identified in the news articles. Then, we extract summary sentences considering keyword-based informativeness for each time point, coherence between consecutive time points, and continuity of named entities, while excluding overly long sentences. Experimental results show that the proposed model generates higher-quality summaries than previous work.
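As a rough illustration of the selection step, the sketch below picks one sentence per timeline date by trading off keyword-based informativeness against word overlap with the previous date's sentence, while excluding overly long sentences. The bag-of-words scoring is an assumption for illustration; the paper's actual measures may differ.

```python
from collections import Counter

def select_summary(timeline, sents_by_date, keywords_by_date, max_len=40):
    """Pick one sentence per date, balancing informativeness and coherence.

    timeline: dates (sorted) from all time points mentioned in the articles;
    sents_by_date / keywords_by_date: candidate sentences and keywords per date.
    """
    summary, prev_words = [], set()
    for date in timeline:
        best, best_score = None, float("-inf")
        for sent in sents_by_date[date]:
            words = sent.lower().split()
            if len(words) > max_len:            # exclude overly long sentences
                continue
            counts = Counter(words)
            informativeness = sum(counts[k] for k in keywords_by_date[date])
            coherence = len(set(words) & prev_words)  # link to previous date
            score = informativeness + coherence
            if score > best_score:
                best, best_score = sent, score
        if best is not None:
            summary.append((date, best))
            prev_words = set(best.lower().split())
    return summary
```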
Solving for Redundant Repetition Problem of Generating Summarization using Decoding History
Jaehyun Ryu, Yunseok Noh, Su Jeong Choi, Seyoung Park, Seong-Bae Park
http://doi.org/10.5626/JOK.2019.46.6.535
Neural attentional sequence-to-sequence models have achieved great success in abstractive summarization. However, such models face several challenges, including repetitive generation of words, phrases, and sentences during decoding. Many studies have attempted to address this problem by modifying the model structure. Although considering the actual history of word generation is crucial to reducing repetition, these methods do not consider the decoding history of the generated sequence. In this paper, we propose a new loss function, called 'Repeat Loss', to avoid repetition. The Repeat Loss directly discourages repetitive generation by penalizing the generation probability of words already produced in the decoding history. Since the proposed Repeat Loss does not require a special network structure, it is applicable to any existing sequence-to-sequence model. In experiments, we applied the Repeat Loss to a number of sequence-to-sequence summarization systems and trained them on both Korean and CNN/Daily Mail summarization datasets. The results demonstrate that the proposed method reduces repetition and produces high-quality summaries.
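A minimal sketch of a repeat-penalty loss of the kind described here: at each decoding step it accumulates the probability mass the model assigns to tokens already emitted earlier in the sequence. The exact form and weighting used in the paper may differ.

```python
import torch

def repeat_loss(log_probs, generated_ids):
    """log_probs: (T, V) per-step log-probabilities from the decoder;
    generated_ids: (T,) tokens actually generated at each step."""
    T, V = log_probs.shape
    penalty = log_probs.new_zeros(())
    seen = torch.zeros(V, dtype=torch.bool)
    for t in range(T):
        if seen.any():
            # Probability the model puts on already-generated words at step t;
            # differentiable w.r.t. log_probs, so it trains like any loss term.
            penalty = penalty + log_probs[t].exp()[seen].sum()
        seen[generated_ids[t]] = True
    return penalty / T

# Added to the usual objective, e.g.:
# total_loss = nll_loss + lambda_repeat * repeat_loss(log_probs, output_ids)
```

Because the penalty only touches the output distribution, it can be bolted onto any existing sequence-to-sequence model without structural changes, which is the property the abstract highlights.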
Scene Generation from a Sentence by Learning Object Relation
Yongmin Shin, Su Jeong Choi, Seong-Bae Park, Seyoung Park
http://doi.org/10.5626/JOK.2019.46.5.431
In communication between humans and machines, location information is crucial, yet it is sometimes omitted. While humans can infer the omitted information, machines cannot, which causes problems when generating scenes from sentences. To solve this problem, previous studies first found explicit relations in a sentence and then inferred implicit relations using prior probabilities. However, such methods are not well suited to Korean because of its morphological productivity. In this paper, we propose a scene-generation method for Korean. First, we find explicit relations using an RNN-based neural network. Then, to infer implicit information, we use the prior probabilities of relations. Finally, we build a scene tree from the obtained information and generate a scene from that tree. To evaluate scene generation, we measure the accuracy of the relation model and collect human scores for the generated scenes. The results show that the method is effective in both relation accuracy and human evaluation.
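The sketch below illustrates the implicit-relation step: relations found explicitly (produced by the paper's RNN classifier, stubbed out here as an input dictionary) are completed with the most probable relation from a prior table. The `PRIOR` entries and object names are hypothetical examples.

```python
# P(relation | subject, object): hypothetical prior probabilities
# estimated from a corpus of annotated scenes.
PRIOR = {
    ("cup", "table"): {"on": 0.90, "under": 0.05, "next_to": 0.05},
    ("cat", "sofa"):  {"on": 0.70, "next_to": 0.30},
}

def infer_relations(objects, explicit):
    """objects: object names in the sentence; explicit: {(subj, obj): relation}
    pairs already recovered by the relation classifier."""
    relations = dict(explicit)
    for subj in objects:
        for obj in objects:
            if subj == obj or (subj, obj) in relations:
                continue
            prior = PRIOR.get((subj, obj))
            if prior:
                # Fill the omitted relation with the most probable one.
                relations[(subj, obj)] = max(prior, key=prior.get)
    return relations

# infer_relations(["cup", "table"], {})  ->  {("cup", "table"): "on"}
```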
Improving the Performance of Triple Generation Based on Distant Supervision by Using Semantic Similarity
Hee-Geun Yoon, Su Jeong Choi, Seong-Bae Park
Existing pattern-based triple generation systems built on distant supervision can be flawed by the assumption distant supervision makes. To mitigate the flaw arising from this overly strong assumption, previous studies have commonly used statistical information to measure the confidence of patterns. In this study, we propose a more accurate confidence measure based on the semantic similarity between patterns and properties. Word embedding, an unsupervised learning method, and WordNet-based similarity measures were adopted to learn word meanings and measure semantic similarity. To resolve the language discordance between patterns and properties, we adopted CCA to align bilingual word embedding models and a translation-based approach for the WordNet-based measure. Our experimental results indicate that the accuracy of triples filtered by the semantic similarity-based confidence measure is 16% higher than that of the statistics-based approach. These results suggest that a semantic similarity-based confidence measure is more effective than a statistics-based approach for generating high-quality triples.
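As an illustration of the embedding-based variant of such a confidence measure, the sketch below scores a pattern against a property by the cosine similarity of their averaged word vectors, assuming the bilingual embeddings have already been aligned into one space (e.g., via CCA). The function names and the threshold-based filtering note are illustrative assumptions.

```python
import numpy as np

def phrase_vec(words, emb):
    # Average the word vectors of in-vocabulary words; None if none exist.
    vecs = [emb[w] for w in words if w in emb]
    return np.mean(vecs, axis=0) if vecs else None

def confidence(pattern_words, property_words, emb):
    """Cosine similarity in the shared (CCA-aligned) embedding space."""
    p = phrase_vec(pattern_words, emb)
    q = phrase_vec(property_words, emb)
    if p is None or q is None:
        return 0.0
    return float(p @ q / (np.linalg.norm(p) * np.linalg.norm(q) + 1e-8))

# Triples whose pattern falls below a confidence threshold would be
# filtered out before being added to the knowledge base.
```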