Search : [ author: 김윤중 ] (1)

Analysis of Speech Emotion Database and Development of Speech Emotion Recognition System using Attention Mechanism Integrating Frame- and Utterance-level Features

Dokyung Kim, Yoonjoong Kim

http://doi.org/10.5626/JOK.2020.47.5.479

In this study, we propose a model consisting of a BLSTM (Bidirectional Long Short-Term Memory) layer, an attention mechanism layer, and a deep neural network to integrate frame- and utterance-level features from speech signals, and we analyze the reliability of the labels in the speech emotion database IEMOCAP (Interactive Emotional Dyadic Motion Capture). Based on the evaluation script of the labels provided in the IEMOCAP database, we constructed three data sets: a default data set, a data set with a balanced distribution of emotion classes, and a data set with improved reliability based on three or more judgments. These were used to evaluate the performance of the proposed model with a speaker-independent cross-validation approach. Experiments on the improved and balanced data set achieved a maximum score of 67.23% (WA, Weighted Accuracy) and 56.70% (UA, Unweighted Accuracy), an improvement of 6.47% (WA) and 4.41% (UA) over the baseline data set.
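The abstract reports WA and UA without defining them. The sketch below assumes the convention common in speech emotion recognition work, where WA is the overall fraction of correctly classified utterances and UA is the mean of the per-class recalls; the paper itself may define them differently. The function names and the toy labels are hypothetical and are not taken from the paper or from IEMOCAP.

```python
from collections import defaultdict


def weighted_accuracy(y_true, y_pred):
    """Overall accuracy: correct predictions divided by all utterances (assumed WA)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally (assumed UA)."""
    total = defaultdict(int)  # utterances per true class
    hit = defaultdict(int)    # correctly predicted utterances per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        hit[t] += int(t == p)
    recalls = [hit[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy, made-up labels with an imbalanced class distribution.
y_true = ["neutral", "neutral", "neutral", "neutral", "happy", "happy", "angry", "sad"]
y_pred = ["neutral", "neutral", "neutral", "happy", "happy", "sad", "angry", "neutral"]

wa = weighted_accuracy(y_true, y_pred)
ua = unweighted_accuracy(y_true, y_pred)
print(f"WA = {wa:.4f}, UA = {ua:.4f}")
```

Under these definitions the two scores diverge when classes are imbalanced, because a frequent class dominates WA while each class contributes equally to UA. This is one reason a class-balanced data set can shift the two numbers differently.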


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr