Recognition of Spoken Bengali Numerals Using MLP, SVM,
RF Based Models with PCA Based Feature Summarization
Avisek Gupta and Kamal Sarkar
Department of Computer Science and Engineering, Jadavpur
University, India
Abstract: This paper presents a method for automatically recognizing Bengali
numerals spoken in noise-free and noisy environments by multiple speakers with
different dialects. Mel Frequency Cepstral Coefficients (MFCC) are used for
feature extraction, and Principal Component Analysis (PCA) is used as a feature
summarizer to form the feature vector from the MFCC data for each digit
utterance. Finally, we use Support Vector Machines, Multi-Layer Perceptrons,
and Random Forests to recognize the Bengali digits and compare their
performance. In our approach, we treat each digit utterance as a single
indivisible entity, and we attempt to recognize it using features of the digit
utterance as a whole. This approach can therefore be easily applied to spoken
digit recognition tasks for other languages as well.
Keywords: Speech recognition, isolated digits, principal component analysis,
support vector machines, multi-layered perceptrons, random forests.
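
The pipeline outlined in the abstract (MFCC features, PCA-based summarization into one vector per utterance, then SVM, MLP, and RF classifiers) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how PCA output is turned into a feature vector, so the sketch assumes one plausible reading: fit PCA on the frames of a single utterance and concatenate the leading principal axes. The helpers `fake_mfcc` and `summarize_utterance`, the synthetic data, and all hyperparameters are hypothetical; real MFCC matrices would come from an audio front end.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
N_COEFFS = 13  # assumed number of MFCC coefficients per frame


def fake_mfcc(digit, n_frames):
    """Hypothetical stand-in for MFCC extraction: (n_frames, N_COEFFS) noise
    whose variance is largest along a digit-dependent coefficient."""
    frames = rng.normal(size=(n_frames, N_COEFFS))
    frames[:, digit] *= 5.0
    return frames


def summarize_utterance(mfcc, n_components=2):
    """Assumed summarization: fit PCA on one utterance's frames and return
    its leading principal axes as a single fixed-length vector."""
    axes = PCA(n_components=n_components).fit(mfcc).components_
    # Principal axes are defined only up to sign; make the largest-magnitude
    # entry of each axis positive so vectors are comparable across utterances.
    idx = np.argmax(np.abs(axes), axis=1)
    signs = np.sign(axes[np.arange(len(axes)), idx])
    return (axes * signs[:, None]).ravel()


# Build a toy data set: 10 digits, 12 utterances each, varying durations.
X, y = [], []
for digit in range(10):
    for _ in range(12):
        n_frames = int(rng.integers(30, 60))
        X.append(summarize_utterance(fake_mfcc(digit, n_frames)))
        y.append(digit)
X, y = np.array(X), np.array(y)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The three classifier families named in the abstract, with arbitrary settings.
models = {
    "SVM": SVC(kernel="rbf"),
    "MLP": MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
    "RF": RandomForestClassifier(n_estimators=50, random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: held-out accuracy = {scores[name]:.2f}")
```

The point of the sketch is that every utterance, whatever its duration, maps to a vector of the same length, which is what lets the utterance be treated as a single indivisible entity by standard classifiers.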