Search : [ keyword: 사전학습 (pre-training) ] (13)

Korean Dependency Parsing Using Sequence Labeling

Keunha Kim, Youngjoong Ko

http://doi.org/10.5626/JOK.2024.51.12.1053

Dependency parsing is a crucial step in language analysis: it identifies the relationships between words within a sentence. Recently, many models based on pre-trained transformers have shown impressive performance across natural language processing research, and they have also been applied to dependency parsing. Traditional approaches to dependency parsing with pre-trained models generally consist of two main stages: 1) merging the token-level embeddings produced by the pre-trained model into word-level embeddings; and 2) analyzing dependency relations by comparing or classifying the merged embeddings. However, due to the large number of parameters and the additional layers required for embedding construction, comparison, and classification, these models can be inefficient in terms of time and memory usage. This paper proposes a dependency parsing technique based on sequence labeling that improves the efficiency of training and inference by defining dependency parsing units and simplifying the model layers. The proposed model eliminates the word-level embedding merging step by using special tokens to define parsing units, and it effectively reduces the number of parameters by simplifying the model layers. As a result, training and inference time is significantly shortened, while the model maintains meaningful dependency parsing performance.
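
As a rough illustration of the labeling view described above, the sketch below (PyTorch; the class name, label inventory, and layer sizes are assumptions, not the authors' implementation) classifies the encoder state at each word-delimiting special token into a head offset and a dependency relation, so no token-to-word merging layer is needed.

    # A minimal sketch of dependency parsing as sequence labeling: each parsing-unit
    # special token receives a relative head offset and a dependency relation label.
    import torch
    import torch.nn as nn

    class SeqLabelingParserHead(nn.Module):
        def __init__(self, hidden_size=768, num_offsets=41, num_relations=32):
            super().__init__()
            # Single linear layers keep the head lightweight, in the spirit of the
            # abstract's emphasis on reducing parameters and extra layers.
            self.head_offset = nn.Linear(hidden_size, num_offsets)  # e.g., offsets -20..+20
            self.relation = nn.Linear(hidden_size, num_relations)   # dependency relation labels

        def forward(self, encoder_states, unit_positions):
            # encoder_states: [batch, seq_len, hidden] from a pre-trained encoder
            # unit_positions: [batch, num_words] indices of the special tokens that
            # mark each parsing unit, so no word-level embedding merging is required.
            index = unit_positions.unsqueeze(-1).expand(-1, -1, encoder_states.size(-1))
            units = torch.gather(encoder_states, 1, index)
            return self.head_offset(units), self.relation(units)

    # Toy usage with random "encoder" outputs.
    states = torch.randn(2, 50, 768)
    positions = torch.randint(0, 50, (2, 7))
    offset_logits, rel_logits = SeqLabelingParserHead()(states, positions)
    print(offset_logits.shape, rel_logits.shape)  # (2, 7, 41) (2, 7, 32)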

SCA: Improving Document Grounded Response Generation based on Supervised Cross-Attention

Hyeongjun Choi, Seung-Hoon Na, Beomseok Hong, Youngsub Han, Byoung-Ki Jeon

http://doi.org/10.5626/JOK.2024.51.4.326

Document-grounded response generation is the task of generating conversational responses by “grounding” them in factual evidence from a task-specific domain, such as consumer consultation or insurance planning, where the evidence is obtained from documents retrieved in response to the user’s question under the current dialogue context. In this study, we propose supervised cross-attention (SCA) to enhance the ability of the response generation model to find and incorporate “response-salient snippets” (i.e., spans or contents), the parts of the retrieved document that should be included and maintained in the generated answer. SCA adds a supervised loss that focuses cross-attention weights on the response-salient snippets; this attention supervision enables the decoder to generate responses in a “saliency-grounded” manner by attending strongly to the important parts of the retrieved document. Experimental results on MultiDoc2Dial show that SCA combined with additional performance-improvement methods increases F1 by 1.13 over the existing SOTA, with SCA alone accounting for an increase of 0.25 in F1.
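
A minimal sketch of the attention-supervision idea (PyTorch; the loss form, mask construction, and weighting are assumptions rather than the paper's exact formulation): given decoder cross-attention weights and a binary mask over response-salient source tokens, the auxiliary loss rewards attention mass placed on the salient span.

    import torch

    def supervised_cross_attention_loss(attn, salient_mask, eps=1e-8):
        # attn: [batch, tgt_len, src_len] cross-attention weights (rows sum to 1)
        # salient_mask: [batch, src_len], 1 for response-salient source tokens
        salient_mass = (attn * salient_mask.unsqueeze(1)).sum(-1)  # [batch, tgt_len]
        return -(salient_mass + eps).log().mean()

    attn = torch.softmax(torch.randn(2, 5, 9), dim=-1)
    mask = torch.zeros(2, 9)
    mask[:, 2:5] = 1.0  # tokens 2..4 are marked as salient
    aux = supervised_cross_attention_loss(attn, mask)
    print(aux)  # the total loss would be generation_loss + lambda * aux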

Multi-task Learning Based Re-ranker for External Knowledge Retrieval in Document-grounded Dialogue Systems

Honghee Lee, Youngjoong Ko

http://doi.org/10.5626/JOK.2023.50.7.606

Document-grounded dialogue systems retrieve external passages related to the dialogue and use them to generate an appropriate response to the user's utterance. However, retrievers based on the dual-encoder architecture show low performance in finding relevant passages, and the re-ranker intended to complement the retriever is not sufficiently optimized. In this paper, to solve these problems and perform effective external passage retrieval, we propose a re-ranker based on multi-task learning. The proposed model is a cross-encoder that simultaneously learns contrastive-learning-based ranking, Masked Language Modeling (MLM), and Posterior Differential Regularization (PDR) during fine-tuning, enhancing the language understanding ability and robustness of the model through the auxiliary MLM and PDR tasks. Evaluation results on the MultiDoc2Dial dataset show that the proposed model outperforms the baseline model in Recall@1, Recall@5, and Recall@10.
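
The multi-task objective can be pictured as a weighted sum of the three losses named above; the sketch below (PyTorch; the loss weights, negative-sampling layout, and the symmetric-KL form of PDR are assumptions) shows one way the fine-tuning loss could be assembled.

    import torch
    import torch.nn.functional as F

    def ranking_loss(scores):
        # scores: [batch, 1 + num_negatives]; column 0 is the relevant passage,
        # so contrastive ranking reduces to cross-entropy against index 0.
        targets = torch.zeros(scores.size(0), dtype=torch.long)
        return F.cross_entropy(scores, targets)

    def pdr_loss(logits_clean, logits_noised):
        # Posterior Differential Regularization: keep the output distribution
        # stable under input perturbation (symmetric KL is one possible choice).
        p = F.log_softmax(logits_clean, dim=-1)
        q = F.log_softmax(logits_noised, dim=-1)
        return 0.5 * (F.kl_div(p, q, reduction="batchmean", log_target=True)
                      + F.kl_div(q, p, reduction="batchmean", log_target=True))

    scores = torch.randn(4, 8)                # 1 positive + 7 negative passages per query
    clean, noised = torch.randn(4, 2), torch.randn(4, 2)
    mlm = torch.tensor(2.3)                   # MLM loss from the usual masked-token head
    total = ranking_loss(scores) + 0.5 * mlm + 0.5 * pdr_loss(clean, noised)
    print(total)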

Solving Korean Math Word Problems Using the Graph and Tree Structure

Kwang Ho Bae, Sang Yeop Yeo, Yu Chul Jung

http://doi.org/10.5626/JOK.2022.49.11.972

Previous studies have made various efforts to solve math word problems posed in English. Many of them achieved improved performance by introducing structures such as trees and graphs, going beyond Sequence-to-Sequence approaches. However, for math word problems posed in Korean, there are no reported models that use tree or graph structures. In this paper, we therefore examine the feasibility of solving Korean math word problems with models that combine a tree structure, a graph structure, and Korean pre-trained language models. Our experimental results show that introducing the graph and tree structures improves accuracy by approximately 20% compared to a Seq2seq model. Additionally, using a Korean pre-trained language model showed an accuracy improvement of 4.66%-5.96%.
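
As a reminder of what a tree-structured decoder targets, the small sketch below (plain Python; the example equation is invented, not from the paper's dataset) builds and evaluates a math equation in prefix order as an expression tree rather than as a flat token sequence.

    import operator

    OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

    def eval_prefix(tokens):
        """Evaluate a prefix-order equation such as ['*', '+', '3', '5', '2']."""
        token = tokens.pop(0)
        if token in OPS:
            left, right = eval_prefix(tokens), eval_prefix(tokens)
            return OPS[token](left, right)
        return float(token)

    print(eval_prefix(["*", "+", "3", "5", "2"]))  # (3 + 5) * 2 = 16.0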

Entity Graph Based Dialogue State Tracking Model with Data Collection and Augmentation for Spoken Conversation

Haeun Yu, Youngjoong Ko

http://doi.org/10.5626/JOK.2022.49.10.891

As part of a task-oriented dialogue system, dialogue state tracking is the task of understanding the dialogue and extracting the user's needs in slot-value form. Recently, the Dialog System Technology Challenge (DSTC) 10 Track 2 initiated a challenge to measure the robustness of dialogue state tracking models in a spoken-conversation setting. The released evaluation dataset has three characteristics: a new multiple-value scenario, three times more entities, and utterances produced by an automatic speech recognition module. In this paper, to ensure robust performance, we introduce an extraction-based dialogue state tracking model with an entity graph. We also propose a data collection and template-based data augmentation method. Evaluation results show that the proposed method improves the performance of the extraction-based dialogue state tracking model by 1.7% in JGA and 0.57% in slot accuracy compared to the baseline model.
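
A minimal sketch of the template-based augmentation idea (plain Python; the templates, slot names, and entity lists are invented placeholders, not the released data): utterance templates are filled with sampled entity values so the tracker is exposed to many more entities, including multiple-value cases.

    import random

    templates = [
        ("can you book {hotel} for me", {"hotel-name": "{hotel}"}),
        ("i want a table at {restaurant} or maybe {restaurant2}",
         {"restaurant-name": "{restaurant}|{restaurant2}"}),  # multiple-value scenario
    ]
    entities = {"hotel": ["acorn guest house", "alexander b&b"],
                "restaurant": ["golden wok", "pizza hut city centre"]}

    def augment(n=4, seed=0):
        random.seed(seed)
        samples = []
        for _ in range(n):
            text, slots = random.choice(templates)
            fill = {"hotel": random.choice(entities["hotel"]),
                    "restaurant": random.choice(entities["restaurant"]),
                    "restaurant2": random.choice(entities["restaurant"])}
            samples.append((text.format(**fill),
                            {k: v.format(**fill) for k, v in slots.items()}))
        return samples

    for utterance, state in augment():
        print(utterance, "->", state)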

Quality Estimation of Machine Translation using Dual-Encoder Architecture

Dam Heo, Wonkee Lee, Jong-Hyeok Lee

http://doi.org/10.5626/JOK.2022.49.7.521

Quality estimation (QE) is the task of estimating the quality of given machine translations (MTs) without their reference translations. A recent research trend is to apply transfer learning in QE by fine-tuning a pre-trained Transformer-encoder model with a parallel corpus. In this paper, we propose a dual-encoder architecture that first learns a monolingual representation of each language in its own encoder and then learns a cross-lingual representation in cross-attention networks, thereby overcoming the limitations of a single-encoder architecture on cross-lingual tasks such as QE. We show that the dual-encoder architecture is structurally more advantageous than the single-encoder architecture, and we further improve the performance and stability of the dual-encoder model in QE by applying pre-trained language models to it. Experiments were conducted on the WMT20 QE data for the En-De pair. As pre-trained models, our model applies English BERT (Bidirectional Encoder Representations from Transformers) and German BERT to the respective encoders and achieves the best performance.
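
The dual-encoder structure can be summarized as two monolingual encoders followed by cross-attention and a regression head; the sketch below (PyTorch; the layer counts and dimensions are assumptions, and generic Transformer encoders stand in for the English and German BERT models) shows the data flow.

    import torch
    import torch.nn as nn

    class DualEncoderQE(nn.Module):
        def __init__(self, hidden=256, heads=4):
            super().__init__()
            src_layer = nn.TransformerEncoderLayer(hidden, heads, batch_first=True)
            mt_layer = nn.TransformerEncoderLayer(hidden, heads, batch_first=True)
            self.src_encoder = nn.TransformerEncoder(src_layer, num_layers=2)  # stands in for English BERT
            self.mt_encoder = nn.TransformerEncoder(mt_layer, num_layers=2)    # stands in for German BERT
            self.cross_attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
            self.score = nn.Linear(hidden, 1)

        def forward(self, src_emb, mt_emb):
            src = self.src_encoder(src_emb)           # monolingual representation of the source
            mt = self.mt_encoder(mt_emb)              # monolingual representation of the MT output
            fused, _ = self.cross_attn(mt, src, src)  # cross-lingual representation
            return self.score(fused.mean(dim=1)).squeeze(-1)  # sentence-level quality score

    model = DualEncoderQE()
    print(model(torch.randn(2, 12, 256), torch.randn(2, 15, 256)).shape)  # torch.Size([2])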

Effective Transfer Learning in Text Classification with the Label-Based Discriminative Feature Learning

Gyunyeop Kim, Sangwoo Kang

http://doi.org/10.5626/JOK.2022.49.3.214

The performance of natural language processing with the transfer learning methodology has improved by pre-training language models on a large amount of general data and applying them to downstream tasks. However, because the data used in pre-training is irrelevant to the downstream tasks, the models learn general features rather than features specific to those tasks. This paper proposes a novel learning method for the embeddings of pre-trained models that learns features specific to the downstream tasks. The proposed method learns the label features of the downstream tasks through contrastive learning with label embeddings and sampled data pairs. To demonstrate the performance of the proposed method, we conducted experiments on sentence classification datasets and evaluated whether features of the downstream tasks had been learned through PCA (principal component analysis) and clustering on the embeddings.
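
One way to read "contrastive learning with label embedding" is sketched below (PyTorch; the temperature, pooling, and use of cosine similarity are assumptions): each sentence embedding is pulled toward the embedding of its own label and pushed away from the other label embeddings.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class LabelContrastiveLoss(nn.Module):
        def __init__(self, num_labels=2, hidden=768, temperature=0.1):
            super().__init__()
            self.label_emb = nn.Embedding(num_labels, hidden)  # learnable label embeddings
            self.temperature = temperature

        def forward(self, sent_emb, labels):
            # sent_emb: [batch, hidden] pooled sentence embeddings; labels: [batch]
            sims = F.cosine_similarity(sent_emb.unsqueeze(1),
                                       self.label_emb.weight.unsqueeze(0), dim=-1)
            return F.cross_entropy(sims / self.temperature, labels)

    loss_fn = LabelContrastiveLoss()
    print(loss_fn(torch.randn(4, 768), torch.tensor([0, 1, 1, 0])))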

Korean Text Summarization using MASS with Copying and Coverage Mechanism and Length Embedding

Youngjun Jung, Changki Lee, Wooyoung Go, Hanjun Yoon

http://doi.org/10.5626/JOK.2022.49.1.25

Text summarization is a technology that generates a summary containing the important and essential information of a given document; end-to-end abstractive summarization models based on the sequence-to-sequence framework are the main line of study. Recently, transfer learning methods that fine-tune a model pre-trained on large-scale monolingual data have been actively studied in natural language processing. In this paper, we applied the copying mechanism to the MASS model, performed pre-training for Korean language generation, and then applied the model to Korean text summarization. In addition, a coverage mechanism and length embedding were applied to further improve the summarization model. Experimental results show that the Korean text summarization model applying the copying and coverage mechanisms to MASS outperforms existing models, and that the length of the summary can be adjusted through the length embedding.
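
The copying and coverage mechanisms follow a pointer-generator-style recipe; the sketch below (PyTorch; the shapes and the mixing formula are assumptions about how they attach to the MASS decoder) shows the final output distribution and the coverage penalty.

    import torch

    def copy_distribution(vocab_dist, attn, src_ids, p_gen):
        # vocab_dist: [batch, vocab]; attn: [batch, src_len];
        # src_ids: [batch, src_len]; p_gen: [batch, 1] generation probability.
        copy_dist = torch.zeros_like(vocab_dist).scatter_add(1, src_ids, attn)
        return p_gen * vocab_dist + (1.0 - p_gen) * copy_dist

    def coverage_loss(attn, coverage):
        # Penalize re-attending to source tokens that are already covered.
        return torch.minimum(attn, coverage).sum(-1).mean()

    vocab_dist = torch.softmax(torch.randn(2, 100), dim=-1)
    attn = torch.softmax(torch.randn(2, 7), dim=-1)
    src_ids = torch.randint(0, 100, (2, 7))
    p_gen = torch.sigmoid(torch.randn(2, 1))
    coverage = torch.rand(2, 7)  # running sum of past attention distributions
    print(copy_distribution(vocab_dist, attn, src_ids, p_gen).shape)  # (2, 100)
    print(coverage_loss(attn, coverage))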

Combining Sentiment-Combined Model with Pre-Trained BERT Models for Sentiment Analysis

Sangah Lee, Hyopil Shin

http://doi.org/10.5626/JOK.2021.48.7.815

It is known that BERT can capture various kinds of linguistic knowledge from raw text via language modeling, without any additional hand-crafted features. However, some studies have shown that BERT-based models that additionally use specific linguistic knowledge achieve higher performance on natural language processing problems related to that knowledge. Based on this finding, we trained a sentiment-combined model by adding sentiment features to the BERT structure. We constructed sentiment feature embeddings from the sentiment polarity and intensity values annotated in a Korean sentiment lexicon and proposed two methods, external fusing and knowledge distillation, to combine the sentiment-combined model with a general-purpose pre-trained BERT model. The external fusing method achieved higher performance on Korean sentiment analysis tasks with movie review and hate speech datasets than baselines from other pre-trained models not fused with the sentiment-combined model. We also observed that adding sentiment features to the BERT structure improved the model’s language modeling and sentiment analysis performance. Furthermore, when implementing sentiment-combined models, training time and cost could be reduced by using a small-scale BERT model with fewer layers, dimensions, and steps.
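
A structural sketch of the two ingredients above (PyTorch; the embedding dimensions, polarity/intensity bucket counts, and concatenation-based fusing are assumptions, not the paper's exact design): lexicon sentiment features are added to the input embeddings, and "external fusing" combines the pooled outputs of the sentiment model and a general-purpose BERT before classification.

    import torch
    import torch.nn as nn

    class SentimentEmbeddings(nn.Module):
        def __init__(self, vocab=30000, hidden=256, polarities=3, intensities=5):
            super().__init__()
            self.tok = nn.Embedding(vocab, hidden)
            self.polarity = nn.Embedding(polarities, hidden)    # negative / neutral / positive from the lexicon
            self.intensity = nn.Embedding(intensities, hidden)  # bucketed intensity values

        def forward(self, ids, pol, inten):
            # Sentiment features are simply added to the token embeddings.
            return self.tok(ids) + self.polarity(pol) + self.intensity(inten)

    def external_fuse(sentiment_pooled, general_pooled, classifier):
        # Combine pooled vectors from the two pre-trained models, then classify.
        return classifier(torch.cat([sentiment_pooled, general_pooled], dim=-1))

    emb = SentimentEmbeddings()
    x = emb(torch.randint(0, 30000, (2, 8)),
            torch.randint(0, 3, (2, 8)),
            torch.randint(0, 5, (2, 8)))
    classifier = nn.Linear(512, 2)
    print(external_fuse(torch.randn(2, 256), torch.randn(2, 256), classifier).shape)  # (2, 2)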

Evaluating of Korean Machine Reading Comprehension Generalization Performance via Cross-, Blind and Open-Domain QA Dataset Assessment

Joon-Ho Lim, Hyun-ki Kim

http://doi.org/10.5626/JOK.2021.48.3.275

Machine reading comprehension (MRC) is the task of identifying the correct answer in a paragraph when a natural language question and the paragraph are provided. Recently, fine-tuning a pre-trained language model has yielded the best performance. In this study, we evaluated how well MRC models generalize to question-paragraph pairs that differ from their training sets. To this end, cross-dataset evaluation and blind evaluation were performed. The results show a correlation between generalization performance and dataset characteristics such as answer length and the lexical overlap ratio between question and paragraph. In the blind evaluation, evaluation datasets with long answers and low lexical overlap between questions and paragraphs resulted in less than 80% performance. Finally, the generalization performance of the MRC model in an open-domain QA environment was evaluated, and the performance of MRC over retrieved paragraphs was found to be degraded. These results suggest that the difficulty of MRC and the differences in generalization performance depend on the relationship between the question and the answer, indicating the need for analysis over diverse evaluation sets.
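
The two dataset statistics the abstract ties to generalization can be computed very simply; the sketch below (plain Python, whitespace tokenization assumed, with an invented example) measures answer length and the question-paragraph lexical overlap ratio.

    def answer_length(answer: str) -> int:
        return len(answer.split())

    def lexical_overlap(question: str, paragraph: str) -> float:
        q, p = set(question.lower().split()), set(paragraph.lower().split())
        return len(q & p) / len(q) if q else 0.0

    question = "when was the journal first published"
    paragraph = "the journal was first published decades ago in korea"
    print(answer_length("decades ago"), round(lexical_overlap(question, paragraph), 2))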

