Lexical Substitution Using a Replaced Token Detection Model
http://doi.org/10.5626/JOK.2023.50.4.321
Substitutes are words that can replace a target word in a sentence without changing the sentence's meaning. The task of finding them, known as lexical substitution, can be applied to various natural language processing tasks, such as data augmentation. Traditional methods for lexical substitution may generate unnatural substitutes. To solve this problem, we propose a new method of lexical substitution. Our method samples sentences containing the target word from a corpus, feeds these sentences to a substitute generator based on pretrained BERT, and excludes unacceptable candidates with a replaced token detection model. In an evaluation using the open corpus provided by the National Institute of Korean Language and the Natmal synonym dictionary, our method extracts more accurate substitutes than traditional methods. We also find that the replaced token detection model proposed for lexical substitution performs better in our experiment than a model trained on the CoLA dataset, which could otherwise be considered for excluding unacceptable candidates.
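The three stages described in the abstract (sampling, generation, filtering) can be sketched roughly as follows. This is an illustrative outline only, not the authors' implementation: the function names, the whitespace tokenization, and the vote-counting aggregation across sentences are all assumptions, and `toy_generate` and `toy_is_replaced` are hypothetical placeholders standing in for the BERT-based substitute generator and the replaced token detection model.

```python
from collections import Counter
from typing import Callable, List, Sequence


def sample_sentences(corpus: Sequence[str], target: str, limit: int = 100) -> List[str]:
    """Stage 1: collect up to `limit` corpus sentences that contain the target word."""
    return [s for s in corpus if target in s.split()][:limit]


def find_substitutes(
    corpus: Sequence[str],
    target: str,
    generate: Callable[[List[str], int], List[str]],
    is_replaced: Callable[[List[str], int], bool],
    limit: int = 100,
    top_k: int = 5,
) -> List[str]:
    """Propose substitutes per sentence, filter them, and rank by vote count.

    Counting votes across sentences is an assumption of this sketch; the
    abstract does not state how candidates are aggregated.
    """
    votes: Counter = Counter()
    for sentence in sample_sentences(corpus, target, limit):
        tokens = sentence.split()
        idx = tokens.index(target)
        # Stage 2: the generator proposes candidates for position idx.
        for candidate in generate(tokens, idx):
            if candidate == target:
                continue
            swapped = tokens[:idx] + [candidate] + tokens[idx + 1:]
            # Stage 3: drop candidates that the detector flags as replaced.
            if not is_replaced(swapped, idx):
                votes[candidate] += 1
    return [word for word, _ in votes.most_common(top_k)]


# Hypothetical stand-ins for the two models, used only to exercise the pipeline.
def toy_generate(tokens: List[str], idx: int) -> List[str]:
    return ["large", "huge", "green", "big"]


def toy_is_replaced(tokens: List[str], idx: int) -> bool:
    return tokens[idx] == "green"


toy_corpus = ["the big dog barked", "a big house stood there", "small cats sleep"]
substitutes = find_substitutes(toy_corpus, "big", toy_generate, toy_is_replaced)
```

With the toy stand-ins, "green" is rejected by the detector and the target word itself is skipped, so only "large" and "huge" remain. In a real setting both callables would wrap trained models.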

Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal