A Deep Learning based Arabic Script Recognition System: Benchmark on KHAT

Riaz Ahmad1, Saeeda Naz2, Muhammad Afzal3, Sheikh Rashid4, Marcus Liwicki5, and Andreas Dengel6

1Shaheed Banazir Bhutto University, Sheringal, Pakistan

2Computer Science Department, GGPGC No.1 Abbottabad, Pakistan 

3Mindgarage, University of Kaiserslautern, Germany 

 4Al Khwarizmi Institute of Computer Science, UET Lahore, Pakistan

5Department of Computer Science, Luleå University of Technology, Luleå

6German Research Center for Artificial Intelligence (DFKI) in Kaiserslautern, Germany

Abstract: This paper presents a deep learning benchmark on KFUPM Handwritten Arabic TexT (KHATT), a challenging dataset of handwritten Arabic text-lines. The paper makes three main contributions: (1) pre-processing, (2) a deep learning based recognition approach, and (3) data augmentation. The pre-processing step prunes extra white space and de-skews skewed text-lines. The recognizer is based on Multi-Dimensional Long Short-Term Memory (MDLSTM) networks with Connectionist Temporal Classification (CTC). The MDLSTM has the advantage of scanning Arabic text-lines in all directions (horizontal and vertical), which lets it cover dots, diacritics, strokes, and fine inflections. Combining data augmentation with the deep learning approach raises the Character Recognition (CR) rate to 80.02%, up from a 75.08% baseline.
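
As an illustration of the white-space pruning step named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a grayscale text-line image held in a NumPy array with dark ink on a light background; the function name `prune_white_margins` and the `ink_threshold` parameter are hypothetical and chosen here only for illustration.

```python
import numpy as np

def prune_white_margins(line_img, ink_threshold=128):
    """Crop a grayscale text-line image to the bounding box of its ink.

    Assumption (not from the paper): pixels darker than `ink_threshold`
    count as ink, everything else as background.
    """
    img = np.asarray(line_img)
    ink = img < ink_threshold                 # True where a pixel is dark
    rows = np.flatnonzero(ink.any(axis=1))    # rows containing any ink
    cols = np.flatnonzero(ink.any(axis=0))    # columns containing any ink
    if rows.size == 0:
        return img                            # blank image: nothing to crop
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

# Usage: a white 6x10 canvas with a dark 2x4 block standing in for a text-line.
canvas = np.full((6, 10), 255, dtype=np.uint8)
canvas[2:4, 3:7] = 0
cropped = prune_white_margins(canvas)
blank = prune_white_margins(np.full((3, 5), 255, dtype=np.uint8))
```

Cropping to the ink bounding box removes only the outer margins; de-skewing, the other pre-processing step the abstract mentions, would need a separate rotation estimate and is not covered by this sketch.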

Keywords: Handwritten Arabic text recognition, deep learning, data augmentation.

Received July 11, 2017; accepted April 25, 2018
https://doi.org/10.34028/iajit/17/3/3
 