Dynamic Random Forest for the Recognition of Arabic Handwritten Mathematical Symbols with A Novel Set of Features
Ibtissem Ali and Mohamed Mahjoub
Laboratory of Advanced Technology and Intelligent Systems, University of Sousse, Tunisia
Abstract: Mathematical notation has a number of characteristics that distinguish it from conventional text and make it a challenging target for recognition. These include principally its two-dimensional structure and the diversity of the symbols used, especially in the Arabic context. Recognition of mathematical formulas requires solving three sub-problems: segmentation, symbol recognition, and finally symbol arrangement analysis. In this paper we focus on the Arabic mathematical symbol recognition step. This is a challenging task because of the large symbol set used in Arabic mathematics, which contains many similar-looking symbols, and because of the great variability found in human handwriting. The strength of the selected features and the effectiveness of the classifier are the two key factors determining the performance of a handwritten symbol recognition system. At the descriptor extraction level, we propose a novel Shape Context (SH) descriptor and explore its combination with a modified Chain Code Histogram (CCH) and a Histogram of Oriented Gradients (HOG). For classification we use a Dynamic Random Forest (DRF) model, which has the advantage of efficiently modelling the interaction among trees to determine the right prediction. Experiments carried out on the Handwritten Arabic Mathematical Dataset (HAMF) show that the DRF achieves a significant improvement in accuracy over the standard static RF and Support Vector Machines (SVM).
Keywords: Arabic handwritten mathematical symbols recognition, SH, HOG, CCH, dynamic RF.
Received February 20, 2018, accepted April 17, 2018