Tunisian Arabic Chat Alphabet Transliteration Using Probabilistic Finite State Transducers
Abstract: The Internet plays an increasingly large role in Tunisians' lives, especially since the 2011 revolution. Tunisian Internet users are making growing use of social networks, blogs, and similar platforms. There, they favor the Tunisian Arabic chat alphabet, a Latin-script way of writing Tunisian Arabic. However, few tools have been developed for processing Tunisian Arabic in this form. In this paper, we propose a system that transliterates the Tunisian Arabic chat alphabet into Tunisian Arabic. It is based on weighted finite state transducers and relies on a Tunisian Arabic lexicon, aebWordNet (aeb being the ISO 639-3 code for Tunisian Arabic), and a Tunisian Arabic morphological analyzer. Weighted finite state transducers let us model the transcription behavior of Tunisian Internet users when they write in the chat alphabet, which has no standard orthography but nevertheless follows a regular relation. The system then uses aebWordNet and the morphological analyzer to validate the generated transliterations. Our approach achieves good results compared with existing Arabic chat alphabet-to-Arabic transliteration tools such as EiKtub.
Keywords: Tunisian Arabic chat alphabet, Tunisian Arabic, transliteration, aebWordNet, Tunisian Arabic morphological analyzer, weighted finite state transducer.
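To make the pipeline described in the abstract concrete, the following is a minimal Python sketch of the general idea: a weighted mapping proposes candidate Arabic-script strings for a chat-alphabet word, and a lexicon lookup validates them. It is an illustration under stated assumptions, not the system presented in this paper. The arc table `ARCS`, its probabilities, the two-word `TOY_LEXICON`, and the function names are all hypothetical; only the use of the digits 3 and 7 for ع and ح reflects a common chat-alphabet convention. The sketch uses a single-state transducer over single characters and omits the morphological analyzer entirely.

```python
# Hypothetical arcs: chat-alphabet symbol -> [(Arabic output, probability)].
# The probabilities are invented for illustration; an empty output models
# a short vowel that is not written in Arabic script.
ARCS = {
    "3": [("ع", 1.0)],
    "7": [("ح", 1.0)],
    "b": [("ب", 1.0)],
    "l": [("ل", 1.0)],
    "m": [("م", 1.0)],
    "s": [("س", 0.7), ("ص", 0.3)],
    "a": [("", 0.5), ("ا", 0.3), ("ة", 0.2)],
    "e": [("", 0.5), ("ا", 0.5)],
    "i": [("ي", 0.7), ("", 0.3)],
}

# Toy stand-in for a lexicon such as aebWordNet (assumption: two entries only).
TOY_LEXICON = {"حليب", "عسلامة"}


def candidates(word, beam=50):
    """Return (arabic, probability) pairs for `word`, most probable first.

    Paths that yield the same output string have their probabilities summed.
    Only the `beam` best partial hypotheses are kept after each symbol.
    """
    hyps = {"": 1.0}
    for ch in word:
        nxt = {}
        for prefix, p in hyps.items():
            for out, q in ARCS.get(ch, []):  # unknown symbol: no arcs
                s = prefix + out
                nxt[s] = nxt.get(s, 0.0) + p * q
        ranked = sorted(nxt.items(), key=lambda kv: -kv[1])
        hyps = dict(ranked[:beam])
    return sorted(hyps.items(), key=lambda kv: -kv[1])


def transliterate(word, lexicon=TOY_LEXICON):
    """Return the most probable candidate found in `lexicon`, or None."""
    for arabic, _ in candidates(word):
        if arabic in lexicon:
            return arabic
    return None


if __name__ == "__main__":
    for w in ("7lib", "3aslema"):
        print(w, transliterate(w))
```

In this sketch the lexicon acts as a filter over ranked candidates, so a less probable spelling can still be selected when the most probable one is not a known word. A fuller implementation would presumably use context-dependent arcs and multi-character symbols, which this example does not attempt.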