Effective Generative Chatbot Model Trainable with a Small Dialogue Corpus
Jintae Kim, Hyeon-gu Lee, Harksoo Kim
http://doi.org/10.5626/JOK.2019.46.3.246
Unlike popular retrieval-based chatbot models, generative chatbot models do not depend on predefined responses; instead, they generate new responses using well-trained neural networks. However, they require a large training corpus of query-response pairs. If the training corpus is insufficient, they make grammatical errors caused by out-of-vocabulary or sparse-data problems, mostly in longer sentences. To overcome this challenge, we propose a chatbot model based on a sequence-to-sequence neural network that uses a mixture of words and syllables as encoding-decoding units. Moreover, we propose a two-step training procedure: pre-training on a large non-dialogue corpus, followed by retraining on a smaller dialogue corpus. In an experiment with a small dialogue corpus (47,089 query-response pairs for training and 3,000 query-response pairs for evaluation), the proposed encoding-decoding units reduced the out-of-vocabulary problem, while the two-step training method improved performance on measures such as BLEU and ROUGE.
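As an illustration of mixed word and syllable units, the following is a minimal sketch and not the authors' implementation. The abstract does not say how words and syllables are combined, so the sketch assumes a simple frequency rule: words seen at least `min_count` times stay whole, and all other words are split into syllables. The function names, the `min_count` threshold, and the `_` boundary marker are hypothetical. For Hangul text in NFC form each character is one syllable block, so splitting by character approximates syllable segmentation.

```python
import unicodedata
from collections import Counter


def build_word_vocab(sentences, min_count=2):
    """Collect whitespace-delimited words seen at least min_count times.

    Assumption: a frequency threshold decides which words stay whole.
    """
    counts = Counter(
        word
        for sentence in sentences
        for word in unicodedata.normalize("NFC", sentence).split()
    )
    return {word for word, count in counts.items() if count >= min_count}


def mixed_units(sentence, word_vocab, boundary="_"):
    """Emit in-vocabulary words whole and split other words into syllables.

    The first syllable of a split word carries the (hypothetical) boundary
    marker so that word boundaries can be restored after decoding.
    """
    units = []
    for word in unicodedata.normalize("NFC", sentence).split():
        if word in word_vocab:
            units.append(word)
        else:
            units.extend(
                boundary + ch if i == 0 else ch for i, ch in enumerate(word)
            )
    return units
```

With a vocabulary built from a training corpus, a rare word in a query is never mapped to an unknown token as long as its syllables were seen in training; it is simply represented by more, smaller units.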
Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal