TY  - JOUR
T1  - A Study on Improving the Accuracy of Korean Speech Recognition Texts Using KcBERT
AU  - Min, Donguk
AU  - Nam, Seungsoo
AU  - Choi, Daeseon
JO  - Journal of KIISE, JOK
PY  - 2024
DA  - 2024/1/14
DO  - 10.5626/JOK.2024.51.12.1115
KW  - automatic speech recognition
KW  - Korean speech processing
KW  - speech-to-text
KW  - BERT
AB  - In the field of speech recognition, models such as Whisper, Wav2Vec2.0, and Google STT are widely utilized. However, Korean speech recognition faces challenges because complex phonological rules and diverse pronunciation variations hinder performance improvements. To address these issues, this study proposed a method that combines the Whisper model with a post-processing approach based on KcBERT. By applying KcBERT's bidirectional contextual learning to the text generated by the Whisper model, the proposed method enhances contextual coherence and refines the output for greater naturalness. Experimental results showed that post-processing reduced the Character Error Rate (CER) from 5.12% to 1.88% in clean environments and from 22.65% to 10.17% in noisy environments. The Word Error Rate (WER) also improved significantly, decreasing from 13.29% to 2.71% in clean settings and from 38.98% to 11.15% in noisy settings. BERTScore likewise showed overall improvement. These results demonstrate that the proposed approach is effective in handling complex phonological rules and maintaining text coherence in Korean speech recognition.
ER  - 