Evaluation of Influence of Arousal-Valence Primitives on Speech Emotion Recognition
Imen Trabelsi1, Dorra Ben Ayed2, and Noureddine Ellouze2
1Sciences and Technologies of Image and Telecommunications, Sfax University, Tunisia
2Ecole Nationale d’Ingénieurs de Tunis, Université Tunis-Manar, Tunisia
Abstract: Speech emotion recognition is a challenging research problem of significant scientific interest, and it has attracted considerable research and development in recent years. In this article, we present a study that aims to improve the accuracy of speech emotion recognition using a hierarchical method based on Gaussian Mixture Models and Support Vector Machines for dimensional and continuous prediction of emotions in the valence (positive vs. negative emotion) and arousal (degree of emotional intensity) space. According to these dimensions, emotions are first categorized into N broad groups. These N groups are then further classified into other groups using a spectral representation. We verify and compare the proposed multi-level models in order to study the differential effects of emotional valence and arousal on the recognition of a basic emotion. Experiments are performed on the Berlin Emotional database and the Surrey Audio-Visual Expressed Emotion corpus, which contain different emotions expressed in German and English, respectively.
Keywords: Speech emotion recognition, arousal, valence, hierarchical classification,
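The two-level idea summarized in the abstract (first assign an utterance to a broad arousal or valence group, then pick a basic emotion within that group) can be illustrated with a minimal sketch. This is not the authors' implementation: a nearest-centroid rule is swapped in for the Gaussian Mixture Model and Support Vector Machine stages so that the example runs with the Python standard library only. The emotion-to-group mapping `AROUSAL_GROUP`, the class `HierarchicalClassifier`, and the two-dimensional toy feature vectors are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict
from math import dist

# Hypothetical mapping from basic emotions to broad arousal groups (assumption).
AROUSAL_GROUP = {"anger": "high", "joy": "high", "sadness": "low", "boredom": "low"}


def centroids(samples):
    """Return the mean feature vector per label for (vector, label) pairs."""
    buckets = defaultdict(list)
    for vec, label in samples:
        buckets[label].append(vec)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in buckets.items()
    }


def nearest(model, vec):
    """Return the label whose centroid is closest to vec (Euclidean distance)."""
    return min(model, key=lambda label: dist(model[label], vec))


class HierarchicalClassifier:
    """Two-stage classifier: broad group first, then emotion within the group."""

    def fit(self, samples):
        # Stage 1: one centroid per broad group (stand-in for the first-level model).
        self.stage1 = centroids([(v, AROUSAL_GROUP[e]) for v, e in samples])
        # Stage 2: one set of emotion centroids per group (stand-in for the
        # second-level models trained only on that group's emotions).
        self.stage2 = {}
        for group in self.stage1:
            members = [(v, e) for v, e in samples if AROUSAL_GROUP[e] == group]
            self.stage2[group] = centroids(members)
        return self

    def predict(self, vec):
        group = nearest(self.stage1, vec)
        return group, nearest(self.stage2[group], vec)


# Toy 2-D feature vectors (made up for illustration, not real acoustic features).
TRAIN = [
    ([0.9, 0.8], "anger"), ([0.8, 0.9], "anger"),
    ([0.9, 0.2], "joy"), ([0.8, 0.1], "joy"),
    ([0.1, 0.8], "sadness"), ([0.2, 0.9], "sadness"),
    ([0.1, 0.2], "boredom"), ([0.2, 0.1], "boredom"),
]

clf = HierarchicalClassifier().fit(TRAIN)
print(clf.predict([0.85, 0.75]))  # ('high', 'anger')
print(clf.predict([0.15, 0.10]))  # ('low', 'boredom')
```

In this sketch an error at the first stage cannot be corrected at the second, which is the usual trade-off of a hierarchical scheme: the second-level models face an easier problem, but only if the broad group is predicted correctly.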