A Data Base for Arabic Handwritten Text Recognition Research
Somaya Al-Ma’adeed, Dave Elliman, and Colin Higgins
School of Computer Science and Information Technology, University of Nottingham, UK
Abstract: In this paper we present a new database for off-line Arabic handwriting recognition, together with several preprocessing procedures. We designed, collected, and stored a database of Arabic handwriting (AHDB). The result is a database of handwritten Arabic text that is unique both in its size and in the number of different writers involved. We further designed an innovative, simple, yet powerful in-place tagging procedure for the database, which enables us to extract the bitmaps of words at will. We also built a preprocessing class that contains some useful preprocessing operations. Finally, the most popular words in Arabic writing were found for the first time using a specially designed program.
Keywords: Arabic, handwriting, recognition, database, preprocessing, cursive script.
Received April 15, 2003; accepted July 18, 2003