Search : [ keyword: speech overlap ] (1)

Creating a Noisy Environment Speech Mixture Dataset for Korean Speech Separation

Jaehoo Jang, Kun Park, Jeongpil Lee, Myoung-Wan Koo

http://doi.org/10.5626/JOK.2024.51.6.513

Speech separation models are typically trained on datasets that contain mixtures of overlapping speech and noise. Although established international datasets exist for advancing speech separation techniques, Korea currently has no comparable precedent for constructing datasets of overlapping speech and noise. This paper therefore presents a dataset generator designed for single-channel speech separation models tailored to the Korean language, and introduces the Korean Speech mixture with Noise dataset, which was constructed using this generator. In our experiments, we train and evaluate a Conv-TasNet speech separation model on the newly created dataset. Additionally, we verify the dataset's efficacy by using a pre-trained speech recognition model to compare the Character Error Rate (CER) of the separated speech with that of the original speech.
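
The two operations the abstract relies on, overlapping speech with noise and scoring transcripts by CER, can be illustrated with a minimal sketch. This is not the authors' generator or evaluation code: the helper names `rms`, `mix_at_snr`, and `cer` are hypothetical, the inputs are assumed to be equal-length lists of samples, the noise level is assumed to be set by a target signal-to-noise ratio in dB, and CER is assumed to be the character-level edit distance divided by the reference length.

```python
import math


def rms(signal):
    """Root-mean-square amplitude of a non-empty sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(speech_a, speech_b, noise, snr_db):
    """Overlap two utterances and add noise at a target SNR (hypothetical helper).

    Assumes all inputs have the same length and that the noise is not silent.
    """
    overlapped = [a + b for a, b in zip(speech_a, speech_b)]
    # Choose the gain so that 20*log10(rms(overlapped) / rms(gain*noise)) == snr_db.
    gain = rms(overlapped) / (rms(noise) * 10 ** (snr_db / 20))
    return [s + gain * n for s, n in zip(overlapped, noise)]


def cer(reference, hypothesis):
    """Character Error Rate: Levenshtein distance / reference length.

    Assumes a non-empty reference; each Hangul syllable counts as one character.
    """
    prev = list(range(len(hypothesis) + 1))
    for i, r in enumerate(reference, start=1):
        curr = [i]
        for j, h in enumerate(hypothesis, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(reference)
```

For instance, `cer("안녕하세요", "안녕하세")` counts one deleted character out of five in the reference. A real evaluation may normalize whitespace or punctuation before scoring, which this sketch does not do.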


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal
