Tunisian Arabic Chat Alphabet Transliteration Using Probabilistic Finite State Transducers

Nadia Karmani, Hsan Soussou, and Adel Alimi
Research Groups on Intelligent Machines, University of Sfax, Tunisia

Abstract: The Internet plays a growing role in Tunisians' lives, especially since the 2011 revolution. Tunisian Internet users increasingly use social networks, blogs, and similar platforms, where they favor the Tunisian Arabic chat alphabet, a Latin-script form of written Tunisian Arabic. However, few tools have been developed for processing Tunisian Arabic in this context. In this paper, we propose a machine for transliterating the Tunisian Arabic chat alphabet into Tunisian Arabic, based on weighted finite state transducers and using a Tunisian Arabic lexicon, aebWordNet (aeb is the ISO 639-3 code for Tunisian Arabic), and a Tunisian Arabic morphological analyzer. Weighted finite state transducers allow us to model Tunisian Internet users' transcription behavior when writing chat alphabet texts; this writing has no standard format but follows a regular relation. The machine also uses aebWordNet and the morphological analyzer to validate the generated transliterations. Our approach achieves good results compared with existing Arabic chat alphabet-to-Arabic transliteration tools such as EiKtub.
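
As an illustration of the idea summarized in the abstract, the following minimal Python sketch mimics a single-state weighted transducer: each chat alphabet symbol emits one or more Arabic outputs with a probability, path weights are multiplied, and candidates are kept only if a lexicon contains them. This is not the authors' implementation. The `RULES` table, its weights, the `LEXICON` set (a stand-in for aebWordNet), and the function name `transliterate` are all assumptions made for the example, and the morphological analyzer step is omitted.

```python
# Hypothetical emission table: chat symbol -> [(Arabic output, probability)].
# Symbols and weights are illustrative only, not taken from the paper.
RULES = {
    "3": [("ع", 1.0)],
    "7": [("ح", 1.0)],
    "9": [("ق", 1.0)],
    "ch": [("ش", 1.0)],
    "s": [("س", 0.7), ("ص", 0.3)],
    "t": [("ت", 0.8), ("ط", 0.2)],
    "b": [("ب", 1.0)],
    "k": [("ك", 1.0)],
    "m": [("م", 1.0)],
    "a": [("", 0.5), ("ا", 0.5)],  # short vowels are often left unwritten
}

# Toy lexicon standing in for aebWordNet (assumption).
LEXICON = {"شمس", "كتب"}


def transliterate(word, rules=RULES, lexicon=LEXICON):
    """Return lexicon-validated candidates sorted by descending probability."""
    paths = {"": 1.0}  # partial Arabic output -> accumulated probability
    i = 0
    while i < len(word):
        # Greedy longest match so digraphs such as "ch" win over single letters.
        for size in (2, 1):
            symbol = word[i:i + size]
            if symbol in rules:
                break
        else:
            return []  # no rule covers this input symbol
        extended = {}
        for prefix, p in paths.items():
            for out, q in rules[symbol]:
                key = prefix + out
                # Sum the weights of paths that yield the same output string.
                extended[key] = extended.get(key, 0.0) + p * q
        paths = extended
        i += len(symbol)
    valid = [(w, p) for w, p in paths.items() if w in lexicon]
    return sorted(valid, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    print(transliterate("chams"))
    print(transliterate("ktb"))
```

In this sketch the lexicon filter plays the validation role described in the abstract: an input such as "chams" produces several weighted candidates, and only those found in the lexicon survive.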

Keywords: Tunisian Arabic chat alphabet, Tunisian Arabic, transliteration, aebWordNet, Tunisian Arabic morphological analyzer, weighted finite state transducer.

Received August 6, 2015; accepted April 17, 2016