Digital Library [ Search Result ]
ConTL: Improving the Performance of EEG-based Emotion Recognition via the Incorporation of CNN, Transformer and LSTM
Hyunwook Kang, Byung Hyung Kim
http://doi.org/10.5626/JOK.2024.51.5.454
This paper proposes a hybrid network called ConTL, composed of a convolutional neural network (CNN), a Transformer, and long short-term memory (LSTM), for EEG-based emotion recognition. First, the CNN learns local features from the input EEG signals. Then, the Transformer learns global temporal dependencies from the output features. To further learn sequential dependencies in the time domain, the output features of the Transformer are fed to a bi-directional LSTM. To verify the effectiveness of the proposed model, we compared its classification accuracies with those of five state-of-the-art models. There was a 0.73% improvement on SEED-IV compared to CCNN, and improvements of 0.97% and 0.63% were observed over DGCNN for the valence and arousal dimensions of DEAP, respectively.
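The CNN → Transformer → BiLSTM data flow described above can be sketched in miniature. This is not the authors' implementation; the channel count, window length, and feature dimensions below are illustrative assumptions, and each stage is reduced to its simplest form (one convolution, one attention head, one LSTM pass per direction):

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed shapes for illustration: 62-channel EEG, 200 time samples.
C, T, D = 62, 200, 32
x = rng.standard_normal((C, T))

# 1) CNN stage: one 1D convolution over time extracts local features.
def conv1d(x, w, b):
    C_out, C_in, K = w.shape
    T_out = x.shape[1] - K + 1
    out = np.empty((C_out, T_out))
    for t in range(T_out):
        out[:, t] = np.tensordot(w, x[:, t:t + K], axes=([1, 2], [0, 1])) + b
    return np.maximum(out, 0.0)                      # ReLU

w = rng.standard_normal((D, C, 5)) * 0.01
feat = conv1d(x, w, np.zeros(D))                     # (D, 196)

# 2) Transformer stage: single-head self-attention over the time axis
#    captures global temporal dependencies.
def self_attention(h):
    scores = h.T @ h / np.sqrt(h.shape[0])           # (T', T')
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return h @ weights.T                             # (D, T')

attn = self_attention(feat)

# 3) BiLSTM stage: run a minimal LSTM forward and backward over time,
#    then concatenate the two final states.
def lstm(h, H=16):
    D_in, T_ = h.shape
    Wx = rng.standard_normal((4 * H, D_in)) * 0.01
    Wh = rng.standard_normal((4 * H, H)) * 0.01
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    c, s = np.zeros(H), np.zeros(H)
    for t in range(T_):
        g = Wx @ h[:, t] + Wh @ s
        i, f, o, u = sig(g[:H]), sig(g[H:2*H]), sig(g[2*H:3*H]), np.tanh(g[3*H:])
        c = f * c + i * u
        s = o * np.tanh(c)
    return s

out = np.concatenate([lstm(attn), lstm(attn[:, ::-1])])  # bidirectional summary
print(out.shape)
```

A classification head over `out` would then produce the emotion logits; the point of the sketch is only the shape flow between the three stages.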
RNN model for Emotion Recognition in Dialogue by incorporating the Attention on the Other’s State
http://doi.org/10.5626/JOK.2021.48.7.802
Emotion recognition has recently received much attention in artificial intelligence. In this paper, we present an RNN model that analyzes and identifies a speaker's emotions as expressed through utterances in conversation. Two kinds of speaker context are considered: self-dependency and inter-speaker dependency. In particular, we focus on inter-speaker dependency, since the state of the other speaker can affect the emotions of the current speaker. We propose a DialogueRNN-based model that adds a new GRU cell to capture inter-speaker dependency. Our model outperforms DialogueRNN and its three variants on multiple emotion classification datasets.
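The role of the added GRU cell can be illustrated with a minimal sketch. This is not the paper's model: dimensions, weights, and the two-speaker dialogue below are toy assumptions. The point is that each utterance updates not only the current speaker's state (self-dependency) but, through a separate GRU cell, the listener's state as well (inter-speaker dependency):

```python
import numpy as np

rng = np.random.default_rng(1)
D, H = 8, 8   # utterance feature size and speaker-state size (illustrative)

def gru_cell(x, h, Wz, Wr, Wc):
    """Minimal GRU step on the concatenation [x; h]."""
    xh = np.concatenate([x, h])
    z = 1.0 / (1.0 + np.exp(-(Wz @ xh)))            # update gate
    r = 1.0 / (1.0 + np.exp(-(Wr @ xh)))            # reset gate
    c = np.tanh(Wc @ np.concatenate([x, r * h]))    # candidate state
    return (1 - z) * h + z * c

def make_gru():
    return tuple(rng.standard_normal((H, D + H)) * 0.1 for _ in range(3))

self_gru, other_gru = make_gru(), make_gru()  # other_gru is the added cell

states = {"A": np.zeros(H), "B": np.zeros(H)}
utterances = [("A", rng.standard_normal(D)),
              ("B", rng.standard_normal(D)),
              ("A", rng.standard_normal(D))]

for speaker, u in utterances:
    other = "B" if speaker == "A" else "A"
    # Self-dependency: the utterance updates the current speaker's state.
    states[speaker] = gru_cell(u, states[speaker], *self_gru)
    # Inter-speaker dependency: the added GRU lets the same utterance
    # update the listener's state, so it can shape later emotions.
    states[other] = gru_cell(u, states[other], *other_gru)

print(states["A"].shape, states["B"].shape)
```

An emotion classifier would then read each speaker's state at utterance time; here only the state-update bookkeeping is shown.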
Facial Emotion Recognition Data Augmentation using Generative Adversarial Network
http://doi.org/10.5626/JOK.2021.48.4.398
The facial emotion recognition field of computer vision has recently demonstrated meaningful results with various neural networks. However, the major facial emotion recognition datasets suffer from "class imbalance," a factor that degrades the accuracy of deep learning models, and numerous studies have attempted to solve this problem. In this paper, we propose "RDGAN," a facial emotion recognition data augmentation model that uses a GAN to address the class imbalance of FER2013 and RAF_single, two widely used facial emotion recognition datasets. Unlike prior studies, RDGAN generates images suitable for each class by adding an expression discriminator on top of an image-to-image translation model. Datasets augmented with RDGAN showed average performance improvements of 4.805%p on FER2013 and 0.857%p on RAF_single compared to the datasets without data augmentation.
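The abstract's key idea, adding an expression discriminator to a translation GAN, amounts to giving the generator a second loss term. The sketch below is an assumption-laden illustration of that loss composition, not RDGAN itself: the discriminator outputs, expression logits, target classes, and the weight `lam` are all made-up toy values.

```python
import numpy as np

def bce(p, y):
    """Binary cross-entropy for the real/fake adversarial term."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p)).mean()

def ce(logits, label):
    """Softmax cross-entropy for one image from the expression discriminator."""
    z = logits - logits.max()
    return -(z[label] - np.log(np.exp(z).sum()))

d_fake = np.array([0.4, 0.6])        # toy discriminator outputs on generated images
expr_logits = np.array([[2.0, 0.1, -1.0],
                        [0.3, 1.5, 0.2]])
target_class = np.array([0, 1])      # minority classes we want to synthesize

adv_loss = bce(d_fake, np.ones_like(d_fake))   # generator wants "looks real"
expr_loss = np.mean([ce(l, c) for l, c in zip(expr_logits, target_class)])
lam = 1.0                                      # assumed weighting, not from the paper
g_loss = adv_loss + lam * expr_loss
print(g_loss)
```

Minimizing the second term pushes generated faces toward the requested minority-class expression, which is what lets the augmented dataset rebalance the classes.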
CNN-based Speech Emotion Recognition Model Applying Transfer Learning and Attention Mechanism
Jung Hyun Lee, Ui Nyoung Yoon, Geun-Sik Jo
http://doi.org/10.5626/JOK.2020.47.7.665
Existing speech-based emotion recognition studies can be divided into those that use a single voice feature and those that use a variety of voice features. A single voice feature struggles to reflect the complex characteristics of speech, such as loudness, overtone structure, and vocal range. Among studies using various voice features, machine learning-based approaches form the majority, but their emotion recognition accuracy is relatively lower than that of deep learning-based studies. To resolve this problem, we propose a speech emotion recognition model based on a convolutional neural network (CNN) that uses the Mel-spectrogram and Mel-frequency cepstral coefficients (MFCC) as voice features. The proposed model applies transfer learning and attention to improve training speed and accuracy, and achieved 77.65% emotion recognition accuracy, showing higher performance than the comparison works.
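The combination of two feature values with attention can be sketched as attention-weighted pooling over frame features. This is a guess at the general pattern rather than the paper's network: the mel-band count, MFCC dimension, frame count, class count, and the linear stand-in for the CNN backbone are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed feature shapes: 128 mel bands and 40 MFCCs over 100 frames.
mel = rng.standard_normal((128, 100))
mfcc = rng.standard_normal((40, 100))

# Stack both voice feature values as input channels.
x = np.concatenate([mel, mfcc], axis=0)          # (168, 100)

# Stand-in for the CNN backbone: a per-frame linear projection + ReLU.
W = rng.standard_normal((64, 168)) * 0.05
h = np.maximum(W @ x, 0.0)                       # (64, 100) frame features

# Attention: score each frame, softmax, and pool into one utterance vector.
v = rng.standard_normal(64) * 0.05
scores = v @ h                                   # (100,)
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()                             # attention weights, sum to 1
utterance_vec = h @ alpha                        # (64,) weighted summary

logits = rng.standard_normal((4, 64)) @ utterance_vec  # 4 emotion classes (assumed)
print(int(np.argmax(logits)))
```

Attention lets the model weight emotionally salient frames instead of averaging the whole utterance uniformly; transfer learning in the paper would additionally initialize the backbone from a pretrained network.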
Analysis of Speech Emotion Database and Development of Speech Emotion Recognition System using Attention Mechanism Integrating Frame- and Utterance-level Features
http://doi.org/10.5626/JOK.2020.47.5.479
In this study, we propose a model consisting of a BLSTM (Bidirectional Long Short-Term Memory) layer, an attention mechanism layer, and a deep neural network to integrate frame- and utterance-level features from speech signals, and we analyze the reliability of the labels in the speech emotion database IEMOCAP (Interactive Emotional Dyadic Motion Capture). Based on the label evaluation scripts provided with the IEMOCAP database, we constructed a default dataset, a dataset with a balanced distribution of emotion classes, and a dataset with improved reliability based on three or more agreeing judgments, and used them to evaluate the proposed model with a speaker-independent cross-validation approach. Experiments on the improved and balanced dataset achieved maximum scores of 67.23% weighted accuracy (WA) and 56.70% unweighted accuracy (UA), representing an improvement of 6.47% (WA) and 4.41% (UA) over the baseline dataset.
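The dataset-construction idea, keeping only utterances whose majority label received three or more judgments and then balancing the classes, can be sketched directly. The votes below are toy data, and the exact filtering and balancing rules in the paper may differ; this shows only the general mechanism.

```python
from collections import Counter

# Toy per-utterance evaluator votes (not real IEMOCAP annotations).
votes = {
    "utt1": ["hap", "hap", "hap", "neu"],
    "utt2": ["ang", "neu", "sad", "hap"],   # no agreement -> dropped
    "utt3": ["sad", "sad", "sad"],
    "utt4": ["hap", "hap", "hap"],
    "utt5": ["neu", "neu", "neu", "neu"],
}

# Reliability filter: keep utterances whose top label got >= 3 judgments.
reliable = {}
for utt, vs in votes.items():
    label, count = Counter(vs).most_common(1)[0]
    if count >= 3:
        reliable[utt] = label

# Class balancing: downsample each emotion to the smallest class size.
by_class = {}
for utt, label in reliable.items():
    by_class.setdefault(label, []).append(utt)
n = min(len(utts) for utts in by_class.values())
balanced = {label: utts[:n] for label, utts in by_class.items()}

print(sorted(reliable), {k: len(v) for k, v in balanced.items()})
```

Filtering trades dataset size for label quality, which is consistent with the reported accuracy gains on the improved datasets.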
A Knowledge Graph Embedding-based Ensemble Model for Link Prediction
http://doi.org/10.5626/JOK.2020.47.5.473
Knowledge bases often suffer from limited applicability due to missing information about their entities and relations. Link prediction has been investigated to complete this missing information and make a knowledge base more useful. Existing studies on link prediction often rely on knowledge graph embedding and exhibit trade-offs in their performance. In this paper, we propose an ensemble model over knowledge graph embeddings to improve the quality of link prediction. The proposed model combines multiple knowledge graph embeddings with distinct characteristics. In this way, the ensemble model can consider various aspects of the entities within a knowledge base and reduce the variation in accuracy across hyper-parameters. Our experiments show that the proposed model outperforms individual knowledge graph embedding methods by 13.5% on the WN18 and FB15K datasets.
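One common way to ensemble embeddings with different score scales is to normalize each model's candidate-tail scores and average them. The sketch below is an assumption: the paper does not specify TransE and DistMult or this particular combination rule; toy random embeddings stand in for trained ones.

```python
import numpy as np

rng = np.random.default_rng(3)
E, R, D = 5, 2, 4   # toy entity count, relation count, embedding dimension

ent_t, rel_t = rng.standard_normal((E, D)), rng.standard_normal((R, D))  # "TransE"
ent_d, rel_d = rng.standard_normal((E, D)), rng.standard_normal((R, D))  # "DistMult"

def transe_scores(h, r):
    """Higher is better: negative translation distance to every candidate tail."""
    return -np.linalg.norm(ent_t[h] + rel_t[r] - ent_t, axis=1)

def distmult_scores(h, r):
    """Trilinear product of head, relation, and every candidate tail."""
    return ent_d @ (ent_d[h] * rel_d[r])

def rank01(s):
    """Min-max normalize so the two models' score scales are comparable."""
    return (s - s.min()) / (s.max() - s.min() + 1e-12)

h, r = 0, 1
combined = 0.5 * rank01(transe_scores(h, r)) + 0.5 * rank01(distmult_scores(h, r))
predicted_tail = int(np.argmax(combined))
print(predicted_tail)
```

Because each embedding captures different relational patterns (e.g. translational vs. multiplicative), averaging normalized scores lets one model compensate where the other is weak, which matches the abstract's claim of reduced hyper-parameter sensitivity.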

Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal
Editorial Office
- Tel. +82-2-588-9240
- Fax. +82-2-521-1352
- E-mail. chwoo@kiise.or.kr