Tracking Morphophonemic Transformation in Arabic Word Generation and Root Extraction
Sane Yagi¹ and Jim Yaghi²
¹Department of Linguistics & Phonetics, University of Jordan, Amman, Jordan
²Computer Science Department, Macquarie University, Sydney, Australia
Abstract: Performing root-based searching, concordancing, and grammar checking in Arabic requires an efficient method for matching stems with roots and vice versa. Such mapping is complicated by the hundreds of surface forms that a single root can take: its radicals often undergo replacement, fusion, inversion, and/or deletion, which makes it a challenge to keep track of the original radicals. This paper proposes an algorithm, based on methods used by native speakers, that tracks root radicals through the generation process and through the subsequent reverse process of root extraction. Verb roots are classified by the types of their radicals and by the stems they generate. Roots are molded with morphosemantic and morphosyntactic patterns to generate stems that are modified for tense, voice, and mode and affixed for subject number, gender, and person. The surface forms are then derived by applying the applicable morphophonemic transformations with finite state machines. The paper defines what is meant by 'stem', describes a stem generation engine that the authors developed, and outlines how a database of generated stems is compiled for all Arabic verbs.
Keywords: Arabic, morphology, generation, extraction, root, finite state.