Automated Detection of Human Emotions from Speech Using a Multi-Layer ANN Framework
Abstract
Speech Emotion Recognition (SER) is an active research area aimed at enabling machines to detect and interpret human emotions, thereby improving human-computer interaction. This study presents an automated, hybrid framework that combines machine learning with Natural Language Processing (NLP) to detect human emotions from speech signals using a multi-layer Artificial Neural Network (ANN). The proposed system recognizes emotional states such as happiness, sadness, anger, fear, and neutrality by analyzing vocal features including pitch, tone, energy, and spectral characteristics. A feature extraction module computes Mel-Frequency Cepstral Coefficients (MFCCs) and other prosodic features, which feed a multi-layer ANN trained on a benchmark emotional speech dataset. The model achieves higher accuracy and greater robustness in real-time emotion recognition tasks than traditional machine learning techniques. This work contributes to emotionally intelligent human-computer interaction, with applications in customer service, healthcare, virtual assistants, and e-learning environments.
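To make the described pipeline concrete, the sketch below pairs MFCC extraction with a small multi-layer ANN classifier. It is a minimal illustration, assuming librosa for feature extraction and Keras for the network; the layer sizes, file handling, and label set are assumptions for demonstration, not details taken from the paper.

```python
# Minimal sketch of the described pipeline: MFCC features feeding a
# multi-layer ANN. Assumes librosa and TensorFlow/Keras; file paths,
# label encoding, and layer sizes are illustrative, not from the paper.
import numpy as np
import librosa
import tensorflow as tf

EMOTIONS = ["happy", "sad", "angry", "fearful", "neutral"]

def extract_features(path, n_mfcc=13):
    """Load a speech clip and summarize its MFCCs as a fixed-size vector."""
    y, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    # Mean and standard deviation over time frames -> 2 * n_mfcc features.
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

def build_ann(input_dim, n_classes=len(EMOTIONS)):
    """Multi-layer feed-forward ANN with two hidden layers."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(input_dim,)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Hypothetical usage: `paths` and `labels` would come from a benchmark
# emotional speech corpus like the one the paper trains on.
# X = np.stack([extract_features(p) for p in paths])
# y = np.array([EMOTIONS.index(lbl) for lbl in labels])
# model = build_ann(X.shape[1])
# model.fit(X, y, epochs=50, batch_size=32, validation_split=0.2)
```

Summarizing each MFCC trajectory by its mean and standard deviation is one common way to obtain a fixed-length input for a feed-forward ANN; the paper's exact feature set and network depth may differ.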