A New Mixed Binarization Method Used in A Real Time Application of Automatic Business Document and Postal Mail Sorting
Djamel Gaceb, Véronique Eglin and Frank Lebourgeois
LIRIS laboratory, National Institute of Applied Sciences (INSA of Lyon), France
Abstract: Binarization is applied in the first stage of the segmentation process and has a very strong impact on the performance of systems for the automatic sorting of company documents and mail. We begin this paper with a complete study of the existing binarization mechanisms, which were developed to meet the needs of specific applications. These conventional approaches have weaknesses that are crucial to overcome, and they remain unsuitable for our real-time application. Separating the thresholding stage from the text zone location stage considerably increases the computation time and leads to an over-segmentation of noise and paper texture in the empty zones of the image. Indeed, none of the traditional methods (whether global or local) efficiently meets all the required conditions. We optimize this stage by applying a local threshold only near the text zones, which are located by the cumulated gradients method combined with multi-resolution analysis and mathematical morphology. We demonstrate the consistent performance of the proposed method on several types of business documents and mail with wide-ranging content and image quality.
Keywords: Binarization, text zone location, real-time processing, automatic sorting of company documents and mail.
Received November 9, 2010; accepted May 24, 2011