A Fast High Precision Skew Angle Estimation of
Digitized Documents
Merouane Chettat, Djamel Gaceb, and Soumia Belhadi
Laboratory of Computer Science, Modeling, Optimization and Electronic Systems,
Faculty of Science, University M’hamed Bougara Boumerdes, Algeria
Abstract: In this paper, we address the problem of automatic skew angle
estimation for scanned documents. Document skew occurs frequently, due to
incorrect positioning of the document or a handling error during scanning, and
it degrades the subsequent steps of automatic text analysis and recognition.
It is therefore essential, before proceeding to these steps, to check the
document to be processed for skew and to correct it. The
difficulty of this verification stems from the presence of graphic zones,
sometimes dominant, which considerably affect the accuracy of text skew angle
estimation. We also note the importance of preprocessing for improving the
accuracy and reducing the computational cost of skew estimation approaches.
Both elements were taken into account in the design and development of a
new approach to skew angle estimation and correction. Our approach is based on
local binarization, followed by horizontal smoothing with the Run Length
Smoothing Algorithm (RLSA), detection of horizontal contours, and the
Hierarchical Hough Transform (HHT). The algorithms involved were chosen to
ensure skew estimation that is accurate, fast, and robust, especially to
graphic dominance, and suitable for real-time application. Experimental tests
show the effectiveness of our approach on a representative database from the
Document Image Skew Estimation Contest (DISEC) of the International Conference
on Document Analysis and Recognition (ICDAR).
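To make the processing chain named in the abstract more concrete, the following is a minimal NumPy sketch of horizontal RLSA, horizontal contour detection, and a coarse-to-fine Hough-style angle search. It is an illustration under stated assumptions, not the authors' implementation: the input is assumed to be already binarized (local binarization is omitted), the contour step keeps only bottom edges, and the three-level angle refinement is a simplified stand-in for the Hierarchical Hough Transform. All function names and parameters (`rlsa_horizontal`, `bottom_edges`, `estimate_skew`, `max_gap`, `max_angle`) are hypothetical.

```python
import numpy as np

def rlsa_horizontal(binary, max_gap):
    """Fill background runs of length <= max_gap between foreground pixels in each row."""
    out = np.asarray(binary, dtype=bool).copy()
    for row in out:  # each row is a view, so the fill modifies `out` in place
        cols = np.flatnonzero(row)
        for a, b in zip(cols[:-1], cols[1:]):
            if 1 < b - a <= max_gap + 1:
                row[a:b] = True
    return out

def bottom_edges(binary):
    """Keep foreground pixels whose lower neighbour is background (horizontal contours)."""
    binary = np.asarray(binary, dtype=bool)
    below = np.zeros_like(binary)
    below[:-1] = binary[1:]
    return binary & ~below

def estimate_skew(binary, max_gap=5, max_angle=15.0):
    """Return an estimated skew angle in degrees.

    Assumption: image coordinates with y pointing down, so a positive
    angle means text lines descend from left to right.
    """
    edges = bottom_edges(rlsa_horizontal(binary, max_gap))
    ys, xs = np.nonzero(edges)
    diag = int(np.ceil(np.hypot(*edges.shape)))  # offset keeps bin indices non-negative
    center, half_width, step = 0.0, max_angle, 1.0
    for _ in range(3):  # angular steps of 1.0, 0.1, and 0.01 degrees
        angles = np.arange(center - half_width, center + half_width + step / 2, step)
        scores = []
        for angle in np.radians(angles):
            # Points on a line with this skew share the same rho value.
            rho = ys * np.cos(angle) - xs * np.sin(angle)
            counts = np.bincount(np.floor(rho + diag).astype(int))
            scores.append(np.sum(counts.astype(np.int64) ** 2))  # rewards sharp peaks
        center = float(angles[int(np.argmax(scores))])
        half_width, step = step, step / 10
    return center
```

In this sketch, each refinement level searches only one coarse step on either side of the previous best angle, which keeps the number of accumulator evaluations small compared with a single fine-grained sweep over the full angle range.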
Keywords: Skew angle estimation, document images, Hough transform,
binarization, edge detection, RLSA.
Received September 13, 2017; accepted December 24, 2018