Document Summarization Using TextRank Based on Sentence Embedding
Seok-won Jeong, Jintae Kim, Harksoo Kim
http://doi.org/10.5626/JOK.2019.46.3.285
Document summarization creates a shorter version of a document that preserves the main content of the original. Extractive summarization has been actively studied because, by copying text directly from the original document, it guarantees a basic level of grammaticality and a high level of accuracy. TextRank, a typical extractive summarization method, computes graph edge weights from word frequencies, which makes it difficult to account for the meaning of sentences. To address this drawback, we propose a new TextRank that uses sentence embeddings. Experiments confirmed that the proposed method considers sentence meaning better than the existing method.
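The idea in this abstract can be illustrated with a minimal sketch, under assumptions not stated in the source: sentence embeddings are taken as precomputed vectors, edges are weighted by cosine similarity (negative values clipped to zero), and ranking uses the standard damped PageRank iteration. The function names `cosine`, `textrank`, and `summarize` are hypothetical, and the paper's actual embedding model and edge definition may differ.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def textrank(embeddings, damping=0.85, iterations=50):
    """Score sentences by weighted PageRank over an embedding-similarity graph."""
    n = len(embeddings)
    if n == 0:
        return []
    # Edge weight = cosine similarity between sentence embeddings
    # (assumption: negatives clipped to 0, no self-loops).
    weights = [[max(cosine(embeddings[i], embeddings[j]), 0.0) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Each neighbor j passes on its score in proportion to the edge weight.
            rank = sum(weights[j][i] / out_sum[j] * scores[j]
                       for j in range(n) if out_sum[j] > 0)
            updated.append((1 - damping) / n + damping * rank)
        scores = updated
    return scores

def summarize(sentences, embeddings, k=2):
    """Return the k highest-scoring sentences in their original order."""
    scores = textrank(embeddings)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

In practice the embeddings would come from a trained sentence encoder; the only change relative to frequency-based TextRank in this sketch is how the edge weights are computed.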
Effective Generative Chatbot Model Trainable with a Small Dialogue Corpus
Jintae Kim, Hyeon-gu Lee, Harksoo Kim
http://doi.org/10.5626/JOK.2019.46.3.246
Unlike popular retrieval-based chatbot models, generative chatbot models do not depend on predefined responses; instead, they generate new responses using well-trained neural networks. However, they require a large training corpus of query-response pairs. If the training corpus is insufficient, they make grammatical errors caused by out-of-vocabulary or data sparseness problems, mostly in longer sentences. To overcome this challenge, we propose a chatbot model based on a sequence-to-sequence neural network that uses a mixture of words and syllables as encoding-decoding units. We also propose a two-step training procedure: pre-training on a large non-dialogue corpus, then retraining on a smaller dialogue corpus. In an experiment with a small dialogue corpus (47,089 query-response pairs for training and 3,000 query-response pairs for evaluation), the proposed encoding-decoding units reduced the out-of-vocabulary problem, and the two-step training method improved performance measures such as BLEU and ROUGE.
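The abstract does not specify how words and syllables are mixed, so the following is only one plausible reading, offered as a sketch: words seen often enough in the training corpus are kept as whole-word units, and all other words are split into syllables, which for Korean text means one Hangul character per unit. The names `build_vocab` and `mixed_tokenize`, the `min_count` threshold, and the whitespace tokenization are all assumptions for illustration.

```python
from collections import Counter

def build_vocab(corpus, min_count=2):
    """Collect words that occur at least min_count times (hypothetical threshold)."""
    counts = Counter(word for sentence in corpus for word in sentence.split())
    return {word for word, count in counts.items() if count >= min_count}

def mixed_tokenize(sentence, vocab):
    """Keep in-vocabulary words whole; split every other word into syllables."""
    tokens = []
    for word in sentence.split():
        if word in vocab:
            tokens.append(word)
        else:
            # Each Hangul character is one syllable block, so iterating over
            # the string yields syllable-level units.
            tokens.extend(word)
    return tokens

corpus = ["나는 학교에 간다", "나는 집에 간다"]
vocab = build_vocab(corpus)
tokens = mixed_tokenize("나는 도서관에 간다", vocab)
```

Under this reading, an unseen word never maps to an unknown token as long as its syllables have been seen, which is consistent with the reduction in out-of-vocabulary problems reported above.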
Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal
Editorial Office
- Tel. +82-2-588-9240
- Fax. +82-2-521-1352
- E-mail. chwoo@kiise.or.kr