Automatic Text Summarization Based on Selective OOV Copy Mechanism with BERT Embedding 


Vol. 47, No. 1, pp. 36-44, Jan. 2020
DOI: 10.5626/JOK.2020.47.1.36



  Abstract

Automatic text summarization is the process of shortening a text document via extraction or abstraction. Abstractive text summarization relies on pre-trained word embedding information. Low-frequency but salient words, such as technical terms, are seldom included in dictionaries; this is the so-called out-of-vocabulary (OOV) problem. OOV words degrade the performance of neural encoder-decoder models. To address OOV words in abstractive text summarization, we propose a copy mechanism that facilitates copying new words from the target document while generating summary sentences. Unlike previous studies, the proposed approach combines accurate pointing information, a selective copy mechanism, BERT embeddings, random masking of OOV words, and reconstruction of sentences from morphemes. Additionally, a neural network gate model that estimates the generation probability and a loss function that optimizes the entire abstraction model were applied. Experimental results demonstrate that the ROUGE-1 score (based on word recall) and the ROUGE-L score (based on the longest common subsequence) of the proposed encoder-decoder model improved, reaching 54.97 and 39.23, respectively.
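
As an illustration of how a generation-probability gate can let a decoder emit OOV words, the following minimal sketch mixes a vocabulary distribution with a copy distribution over source tokens. It assumes the common pointer-generator formulation, p(w) = p_gen * P_vocab(w) + (1 - p_gen) * (sum of attention on source positions holding w); the abstract does not state the exact formula used in the paper, and the function name, the toy vocabulary, and all numbers below are hypothetical.

```python
from collections import defaultdict


def mix_copy_and_generate(p_gen, vocab_dist, attention, source_tokens):
    """Blend a generation distribution with a copy distribution.

    p_gen         : gate output in [0, 1]; probability of generating
                    from the fixed vocabulary rather than copying.
    vocab_dist    : dict mapping in-vocabulary tokens to probabilities.
    attention     : attention weights over source positions.
    source_tokens : source tokens, aligned with `attention`.

    NOTE: this is an assumed pointer-generator-style mixture, not
    necessarily the formulation used in the paper.
    """
    if not 0.0 <= p_gen <= 1.0:
        raise ValueError("p_gen must lie in [0, 1]")
    if len(attention) != len(source_tokens):
        raise ValueError("attention and source_tokens must align")

    final = defaultdict(float)
    # Generation path: scale the vocabulary distribution by the gate.
    for token, prob in vocab_dist.items():
        final[token] += p_gen * prob
    # Copy path: attention mass flows to the source tokens themselves,
    # so a token missing from the vocabulary can still get probability.
    for token, weight in zip(source_tokens, attention):
        final[token] += (1.0 - p_gen) * weight
    return dict(final)


# Hypothetical toy data: "transformer" is OOV for this vocabulary.
vocab_dist = {"the": 0.5, "model": 0.3, "<unk>": 0.2}
source_tokens = ["the", "transformer", "model"]
attention = [0.1, 0.7, 0.2]

final_dist = mix_copy_and_generate(0.4, vocab_dist, attention, source_tokens)
best_token = max(final_dist, key=final_dist.get)
```

In this toy setting the OOV token receives probability only through the copy path, which is why a low gate value combined with strong attention on that source position makes it the most likely output.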




  Cite this article

[IEEE Style]

T. Lee and S. Kang, "Automatic Text Summarization Based on Selective OOV Copy Mechanism with BERT Embedding," Journal of KIISE, JOK, vol. 47, no. 1, pp. 36-44, 2020. DOI: 10.5626/JOK.2020.47.1.36.


[ACM Style]

Tae-Seok Lee and Seung-Shik Kang. 2020. Automatic Text Summarization Based on Selective OOV Copy Mechanism with BERT Embedding. Journal of KIISE, JOK, 47, 1, (2020), 36-44. DOI: 10.5626/JOK.2020.47.1.36.


[KCI Style]

이태석, 강승식, "BERT 임베딩과 선택적 OOV 복사 방법을 사용한 문서요약," 한국정보과학회 논문지, 제47권, 제1호, 36~44쪽, 2020. DOI: 10.5626/JOK.2020.47.1.36.









Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal
