Search: [ keyword: Sequence-to-Sequence ] (14)

Response-Considered Query Token Importance Weight Calculator with Potential Response for Generating Query-Relevant Responses

So-Eon Kim, Choong Seon Hong, Seong-Bae Park

http://doi.org/10.5626/JOK.2022.49.8.601

The conversational response generator (CRG) has made great progress with the sequence-to-sequence model, but it often produces an over-general response that could answer any query, or a response that is inappropriate to the query. Some studies have modified the traditional loss function to address over-generality, and others have reduced query-irrelevant responses by compensating for the CRG's lack of background knowledge, but none has solved both problems. This paper proposes a query token importance calculator, on the grounds that both unrelated and overly general responses arise because the CRG fails to capture the core of the query. In addition, based on the theory that a questioner designs an utterance to induce a specific response from the listener, this paper proposes using the golden response to grasp the core meaning of the query. A qualitative evaluation confirmed that a response generator using the proposed model generated more query-relevant responses than one without it.
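
As a rough illustration, such a calculator could be realized as an attention module in which the (training-time) golden response scores each query token; the sketch below is an assumption about the form, not the authors' implementation, and all names (QueryTokenImportance, proj) are hypothetical.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class QueryTokenImportance(nn.Module):
        # Weights query tokens by their relevance to the golden response.
        def __init__(self, hidden_dim):
            super().__init__()
            self.proj = nn.Linear(hidden_dim, hidden_dim)

        def forward(self, query_states, response_summary):
            # query_states: (batch, q_len, hidden) encoder states of the query
            # response_summary: (batch, hidden) pooled golden-response encoding
            scores = torch.bmm(self.proj(query_states),
                               response_summary.unsqueeze(2)).squeeze(2)
            weights = F.softmax(scores, dim=1)          # (batch, q_len)
            return query_states * weights.unsqueeze(2)  # re-weighted states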

A Deep Learning-based Two-Steps Pipeline Model for Korean Morphological Analysis and Part-of-Speech Tagging

Jun Young Youn, Jae Sung Lee

http://doi.org/10.5626/JOK.2021.48.4.444

Recent studies on Korean morphological analysis with artificial neural networks have usually performed morpheme segmentation and part-of-speech tagging as the first step, restoring the original forms of morphemes with a dictionary as a postprocessing step. In this study, we divide morphological analysis into two steps: the original form of each morpheme is restored first by a sequence-to-sequence model, and then morpheme segmentation and part-of-speech tagging are performed by BERT. Pipelining these two steps showed performance comparable to other approaches, even without a morpheme restoration dictionary, which requires rules or compound-tag processing.
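
A minimal sketch of the two-step pipeline as described (restorer and tagger are hypothetical placeholders, not the authors' code):

    def analyze(sentence, restorer, tagger):
        # Step 1: a sequence-to-sequence model restores the original
        # (canonical) form of each morpheme in the raw sentence.
        restored = restorer.generate(sentence)
        # Step 2: BERT performs morpheme segmentation and
        # part-of-speech tagging on the restored sequence.
        return tagger.tag(restored)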

Dual-Use Encoder-Decoder Model for License Plate Recognition

Chunduck Park, Baeksop Kim

http://doi.org/10.5626/JOK.2021.48.1.51

Owing to the rapid, continuous development of machine learning, neural network models show high performance in license plate recognition. The most important factors in machine learning performance are the data and the model. Most license plate datasets provide only character sequence labels, in which case an encoder-decoder model is typically used to recognize the character sequences. Detection-based models outperform encoder-decoder models, but they can only be used when the dataset includes character bounding box labels, which are expensive to annotate. In this paper, we propose an encoder-decoder model that can be used regardless of the presence or absence of character bounding box labels. It combines a ResNet [1] encoder with a Transformer [2] decoder. The proposed model not only achieves high recognition performance in the absence of character bounding box labels, but also improves further by exploiting bounding box labels when they are available. On the Taiwanese AOLP [3] dataset, which contains character sequence labels only, the proposed model achieves 99.55% accuracy, higher than conventional methods. On the Korean KarPlate [4] dataset, which additionally includes character bounding box labels, the proposed model reaches 98.82% accuracy, still slightly higher than conventional methods; notably, the accuracy improves to 99.25% when data without character bounding box labels are added.
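
The ResNet-encoder / Transformer-decoder pairing might look roughly as follows in PyTorch; the layer counts, dimensions, and resnet18 backbone are assumptions for illustration, not the paper's configuration.

    import torch
    import torch.nn as nn
    import torchvision.models as models

    class PlateRecognizer(nn.Module):
        def __init__(self, vocab_size, d_model=256):
            super().__init__()
            resnet = models.resnet18(weights=None)
            # Drop the average-pooling and FC head; keep the feature map.
            self.backbone = nn.Sequential(*list(resnet.children())[:-2])
            self.reduce = nn.Conv2d(512, d_model, kernel_size=1)
            layer = nn.TransformerDecoderLayer(d_model, nhead=8,
                                               batch_first=True)
            self.decoder = nn.TransformerDecoder(layer, num_layers=3)
            self.embed = nn.Embedding(vocab_size, d_model)
            self.out = nn.Linear(d_model, vocab_size)

        def forward(self, image, tgt_tokens):
            feat = self.reduce(self.backbone(image))  # (B, d_model, H, W)
            memory = feat.flatten(2).transpose(1, 2)  # (B, H*W, d_model)
            tgt = self.embed(tgt_tokens)              # (B, T, d_model)
            mask = nn.Transformer.generate_square_subsequent_mask(
                tgt.size(1))
            return self.out(self.decoder(tgt, memory, tgt_mask=mask))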

Performance Analysis of Korean Morphological Analyzer based on Transformer and BERT

Yongseok Choi, Kong Joo Lee

http://doi.org/10.5626/JOK.2020.47.8.730

This paper introduces a Korean morphological analyzer using the Transformer, one of the most popular sequence-to-sequence deep neural models. The Transformer comprises an encoder and a decoder: the encoder compresses a raw input sentence into a fixed-size vector, and the decoder generates the morphological analysis result from that vector. We also replace the encoder with BERT, a pre-trained language representation model. An attention mechanism and a copying mechanism are integrated into the decoder. The processing units of the encoder and the decoder are eojeol-based WordPieces and morpheme-based WordPieces, respectively. Experimental results showed that the Transformer with fine-tuned BERT outperforms a randomly initialized Transformer by 2.9% in F1 score. We also investigated the effect of WordPiece embeddings on morphological analysis when they are not fully updated during training.
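
The copying mechanism can be read as a pointer-generator-style mixture of generating from the vocabulary and copying source WordPieces; the sketch below is a generic formulation under that assumption, not the authors' exact code.

    import torch
    import torch.nn.functional as F

    def copy_generate(vocab_logits, attn_weights, src_token_ids, p_gen):
        # vocab_logits: (batch, vocab) decoder generation scores
        # attn_weights: (batch, src_len) attention over source WordPieces
        # src_token_ids: (batch, src_len) ids of the source tokens
        # p_gen: (batch, 1) learned probability of generating vs. copying
        gen_dist = p_gen * F.softmax(vocab_logits, dim=1)
        copy_dist = torch.zeros_like(gen_dist)
        copy_dist.scatter_add_(1, src_token_ids, (1 - p_gen) * attn_weights)
        return gen_dist + copy_dist  # final distribution over the vocabulary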

A Product Review Summarization Considering Additional Information

Jaeyeun Yoon, Ig-hoon Lee, Sang-goo Lee

http://doi.org/10.5626/JOK.2020.47.2.180

Automatic document summarization is the task of generating, from an existing document, a condensed form suited to a particular user or purpose. As Internet use grows, textual and other data are exploding, and the value of summarization technology is growing with them. While the latest deep learning-based models show reliable performance in document summarization, their performance depends on the quantity and quality of the training data. For example, it is difficult for existing models to generate reliable summaries from online shopping mall product reviews, which are riddled with typing errors and grammatically wrong sentences; online malls and portal web services are struggling with this problem. To generate adequate summaries despite the poor quality and quantity of product review training data, this study proposes a model that generates product review summaries using additional information. Experiments showed that the proposed model improved both relevance and readability over the existing model for product review summarization.

Single Sentence Summarization with an Event Word Attention Mechanism

Ian Jung, Su Jeong Choi, Seyoung Park

http://doi.org/10.5626/JOK.2020.47.2.155

The purpose of summarization is to generate a short text that preserves the important information in the source sentences. There are two approaches to the summarization task: an extractive approach and an abstractive approach. The extractive approach determines whether each word in a source sentence is retained or not. The abstractive approach generates a summary of a given source sentence using neural networks such as the sequence-to-sequence model and the pointer-generator. However, these approaches often omit important information such as event words. This paper proposes an event word attention mechanism for sentence summarization. Event words carry the key meaning of a source sentence, since they express what occurs in it. The event word attention weights are calculated from the event information of each word in the source sentence and are then combined with a global attention mechanism. For evaluation, we used English and Korean datasets. Experimental results show that the model adopting event word attention outperforms existing models.
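
One plausible reading of the combination step is sketched below, under the assumption that event information arrives as per-word binary flags; the paper's exact formula may differ.

    import torch
    import torch.nn.functional as F

    def event_attention(dec_state, enc_states, event_flags):
        # dec_state: (batch, hidden); enc_states: (batch, src_len, hidden)
        # event_flags: (batch, src_len), 1.0 where a source word is an event word
        scores = torch.bmm(enc_states, dec_state.unsqueeze(2)).squeeze(2)
        global_attn = F.softmax(scores, dim=1)  # standard global attention
        # Bias the global attention toward event words, then renormalize.
        combined = global_attn * (1.0 + event_flags)
        return combined / combined.sum(dim=1, keepdim=True)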

Korean Morphological Analyzer for Neologism and Spacing Error based on Sequence-to-Sequence

Byeongseo Choe, Ig-hoon Lee, Sang-goo Lee

http://doi.org/10.5626/JOK.2020.47.1.70

To analyze text data from Korean Internet communities, a morphological analyzer must work accurately even on sentences with spacing errors and must adequately restore the original form of out-of-vocabulary (OOV) input. However, existing Korean morphological analyzers often rely on dictionaries and complicated preprocessing for this restoration. In this paper, we propose a Korean morphological analyzer based on the sequence-to-sequence model that effectively handles both the spacing problem and the OOV problem. The model uses syllable bigrams and graphemes as additional input features, requires no dictionary, and minimizes rule-based preprocessing. In experiments on the Sejong corpus, the proposed model outperformed other dictionary-free morphological analyzers; it also performed better on a dataset with spaces removed and on a sample dataset collected from the Internet.
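
The additional features could enter the encoder as concatenated embeddings, roughly as sketched below (vocabulary sizes, dimensions, and one grapheme id per position are simplifying assumptions):

    import torch
    import torch.nn as nn

    class InputFeatures(nn.Module):
        def __init__(self, n_syllable, n_bigram, n_grapheme, dim=100):
            super().__init__()
            self.syllable = nn.Embedding(n_syllable, dim)
            self.bigram = nn.Embedding(n_bigram, dim)
            self.grapheme = nn.Embedding(n_grapheme, dim)

        def forward(self, syllable_ids, bigram_ids, grapheme_ids):
            # Each id tensor: (batch, seq_len). Concatenation lets the
            # encoder see syllable, bigram, and grapheme evidence jointly.
            return torch.cat([self.syllable(syllable_ids),
                              self.bigram(bigram_ids),
                              self.grapheme(grapheme_ids)], dim=2)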

Solving for Redundant Repetition Problem of Generating Summarization using Decoding History

Jaehyun Ryu, Yunseok Noh, Su Jeong Choi, Seyoung Park, Seong-Bae Park

http://doi.org/10.5626/JOK.2019.46.6.535

Neural attentional sequence-to-sequence models have achieved great success in abstractive summarization. However, such models are limited by several challenges, including the repetitive generation of words, phrases, and sentences in the decoding step. Many studies have attempted to address the problem by modifying the model structure, but although considering the actual history of word generation is crucial to reducing repetition, these methods do not consider the decoding history of the generated sequence. In this paper, we propose a new loss function, called ‘Repeat Loss’, to avoid repetition. The Repeat Loss directly discourages repetitive generation by imposing a loss penalty on the generation probability of words already produced in the decoding history. Since the proposed Repeat Loss needs no special network structure, it is applicable to any existing sequence-to-sequence model. In experiments, we applied the Repeat Loss to a number of sequence-to-sequence summarization systems and trained them on both Korean and CNN/Daily Mail summarization datasets. The results demonstrate that the proposed method reduces repetition and produces high-quality summaries.
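
The core idea admits a compact sketch: at every decoding step, sum the probability mass assigned to tokens already in the history and add it, scaled, to the training loss. The exact weighting below is an assumption, not the paper's definition.

    import torch

    def repeat_loss(step_probs, generated_ids):
        # step_probs: (batch, vocab) softmax output at the current step
        # generated_ids: (batch, t) token ids produced at steps 0..t-1
        penalty = step_probs.gather(1, generated_ids)  # probs of past words
        return penalty.sum(dim=1).mean()

    # total_loss = nll_loss + lambda_repeat * repeat_loss(step_probs, history)
    # where lambda_repeat is a hypothetical weighting hyperparameter.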

Resolution of Answer-Repetition Problems in a Generative Question-Answering Chat System

Sihyung Kim, Harksoo Kim

http://doi.org/10.5626/JOK.2018.45.9.925

A question-answering (QA) chat system is a chatbot that responds to simple factoid questions by retrieving information from knowledge bases. Recently, many chat systems based on sequence-to-sequence neural networks have been implemented and have shown new possibilities for generative models. However, generative chat systems suffer from word repetition problems, in which the same words within a response are generated repeatedly. A QA chat system has a similar problem: the same answer expressions for a given question are generated repeatedly. To resolve this answer-repetition problem, we propose a new sequence-to-sequence model that incorporates a coverage mechanism and an adaptive control of attention (ACA) mechanism in the decoder. In addition, we propose a repetition loss function that reflects the number of unique words in a response. In the experiments, the proposed model outperformed various baseline models on all metrics: accuracy, BLEU, ROUGE-1, ROUGE-2, ROUGE-L, and Distinct-1.
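
A repetition loss "reflecting the number of unique words" could be as simple as one minus the unique-word ratio; this is one plausible reading of the description, not the paper's definition.

    def repetition_loss(response_tokens):
        # response_tokens: list of generated token ids for one response
        if not response_tokens:
            return 0.0
        unique_ratio = len(set(response_tokens)) / len(response_tokens)
        return 1.0 - unique_ratio  # 0.0 when every word is unique

    # repetition_loss([5, 9, 5, 5]) == 0.5 -- the repeated id 5 is penalized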

Regularizing Korean Conversational Model by Applying Denoising Mechanism

Tae-Hyeong Kim, Yunseok Noh, Seong-Bae Park, Se-Yeong Park

http://doi.org/10.5626/JOK.2018.45.6.572

A conversation system responds appropriately to input utterances. Recently, the sequence-to-sequence framework has been widely used as a conversation-learning model. However, a conversation model trained this way often generates safe, dull responses that carry no useful information or sophisticated meaning, and it is also brittle against input utterances that appear in varied forms, such as changed ending words or changed word order. To solve these problems, we propose a response generation model that applies a denoising mechanism. By injecting noise into the original input, the proposed method makes the model stochastically experience, during training, new inputs composed of items not present in the original data. This data augmentation effect regularizes the model and yields a robust model. We evaluate our model on 90k utterance-response pairs from Korean conversation data. The proposed model achieves better results than a baseline model on both ROUGE F1 score and qualitative evaluation by human annotators.
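
The noise-injection step might look like the sketch below; the specific noise types (random word drop and local swaps, echoing the changed-word-order example) and rates are assumptions.

    import random

    def add_noise(tokens, drop_p=0.1, swap_p=0.1):
        # Randomly drop words, then locally swap adjacent words, so the
        # model trains on input variants absent from the original data.
        out = [t for t in tokens if random.random() > drop_p]
        for i in range(len(out) - 1):
            if random.random() < swap_p:
                out[i], out[i + 1] = out[i + 1], out[i]
        return out if out else tokens  # never return an empty utterance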

