Recognition of Spoken Arabic Digits Using Neural Predictive Hidden Markov Models
Rafik Djemili, Mouldi Bedda, and Hocine Bourouba
Automatic and Signals Laboratory of Annaba, Badji Mokhtar University, Algeria
Abstract: In this study, we propose an algorithm for the recognition of isolated Arabic digits. The algorithm extracts acoustic features from the speech signal and uses them as input to multi-layer perceptron (MLP) neural networks. Each digit in the vocabulary (0 to 9) is associated with its own network. The networks act as predictors of the speech samples over a certain duration of time and are trained with the back-propagation algorithm. A hidden Markov model (HMM) is used to extract temporal features (states) from the speech signal. The input vector to the networks consists of twelve mel-frequency cepstral coefficients, the log of the energy, and five elements representing the state. Our results show a reduction in word error rate compared with an HMM word recognition system.
Keywords: Speech recognition, hidden Markov models, artificial neural networks, hybrid HMM/MLP.
Received September 15, 2003; accepted January 19, 2004
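The input vector and the predictor-per-digit scheme summarized in the abstract can be illustrated with a minimal sketch. Several details are assumptions not stated in the abstract: the five state elements are taken to be a one-hot code, each network is taken to predict the next frame's 13 acoustic features from the current 18-element input, and the decision rule is taken to be the smallest accumulated squared prediction error. The names `build_input`, `prediction_error`, and `recognize` are hypothetical, and the predictors are stand-in callables rather than trained MLPs.

```python
import numpy as np

N_MFCC = 12    # twelve mel-frequency cepstral coefficients (from the abstract)
N_STATES = 5   # five state elements (from the abstract); one-hot coding is an assumption


def build_input(mfcc, log_energy, state):
    """Concatenate 12 MFCCs, log energy, and a one-hot state code into 18 values."""
    mfcc = np.asarray(mfcc, dtype=float)
    if mfcc.shape != (N_MFCC,):
        raise ValueError("expected exactly 12 MFCCs")
    if not 0 <= state < N_STATES:
        raise ValueError("state index out of range")
    state_code = np.zeros(N_STATES)
    state_code[state] = 1.0
    return np.concatenate([mfcc, [float(log_energy)], state_code])


def prediction_error(predictor, frames, states):
    """Accumulated squared error of next-frame prediction (assumed setup).

    frames: array of shape (T, 13) holding 12 MFCCs plus log energy per frame.
    states: length-T sequence of state indices, e.g. from an HMM alignment.
    predictor: callable mapping an 18-element input to a 13-element prediction.
    """
    frames = np.asarray(frames, dtype=float)
    total = 0.0
    for t in range(len(frames) - 1):
        x = build_input(frames[t, :N_MFCC], frames[t, N_MFCC], states[t])
        total += float(np.sum((predictor(x) - frames[t + 1]) ** 2))
    return total


def recognize(predictors, frames, alignments):
    """Return the index of the digit whose predictor has the lowest error.

    predictors: one callable per digit; alignments: one state sequence per digit.
    """
    errors = [prediction_error(p, frames, s)
              for p, s in zip(predictors, alignments)]
    return int(np.argmin(errors))
```

In this sketch the state sequence for each digit would come from aligning the utterance against that digit's HMM; here it is simply passed in as a list of integers.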