Speech to Text Engine for Jawi Language

Zaini Arifah Othman1, Nor Aniza Abdullah2, Zaidi Razak3, and Mohd Yakub Mohd Yusoff4
1,2,3Faculty of Computer Science and Information Technology, University of Malaya, Malaysia
4Academy of Islamic Studies, University of Malaya, Malaysia

 
Abstract: This paper focuses on the development of a speech-to-text engine that translates Malay speech into a special script, Jawi. Jawi is a distinctive script derived from Arabic but read in the Malay language. Little research has been published on speech technology for Jawi, so this work should be useful to researchers who wish to apply it in related ICT applications. The paper discusses the use of the Zero Crossing Rate (ZCR) as a robust algorithm for accurate automatic detection of syllable boundaries in the speech signal. A combination of Linear Predictive Coding (LPC) and an Artificial Neural Network (ANN), trained with the backpropagation method, is used to extract features from and classify the speech signals. The paper also discusses the use of Jawi Unicode in the final character tagging process to represent each Jawi character present in the spoken word. As no standard list of Jawi Unicode has been published, the Jawi Unicode table produced by previous research is further investigated and enhanced to achieve better accuracy in Jawi character-phoneme representation. This list is based on a combination of Traditional Arabic and other scripts. A prototype educational learning tool was also developed to enable school children to recognize and read Jawi text, check their pronunciation, and learn from their mistakes independently.


Keywords: Speech-to-text, Jawi Unicode, linear predictive coding, artificial neural network, ZCR.

  Received May 24, 2012; accepted April 11, 2013
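
As a rough illustration of the ZCR feature mentioned in the abstract, the following Python sketch computes a short-time zero crossing rate over fixed-length frames. It is not the authors' implementation: the function names, the frame length, the hop size, and the synthetic test signal are all assumptions made for this example, and the abstract does not specify how ZCR values are turned into syllable boundaries.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs in the frame whose signs differ.

    Zero is treated as non-negative (an assumption of this sketch).
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def short_time_zcr(signal, frame_len, hop):
    """ZCR of each full frame, stepping through the signal by `hop` samples."""
    return [
        zero_crossing_rate(signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


# Hypothetical test signal: a slow sine (few sign changes, loosely
# voiced-like) followed by a rapidly alternating segment (many sign
# changes, loosely unvoiced-like).
slow = [math.sin(2 * math.pi * 5 * n / 200) for n in range(200)]
fast = [0.1 * (-1) ** n for n in range(200)]
rates = short_time_zcr(slow + fast, frame_len=100, hop=100)

for index, rate in enumerate(rates):
    print(f"frame {index}: ZCR = {rate:.3f}")
```

In this sketch the frames covering the slow sine yield a low rate while the frames covering the alternating segment yield a rate of 1.0, so a jump in the per-frame ZCR marks the transition between the two segments. A boundary detector could compare such rates against a threshold, though the choice of threshold and any combination with other features would have to follow the full paper.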
  
