Proceedings Paper

Character extraction from documents using wavelet maxima
Author(s): Wen-Liang Hwang; Fu Chang

Paper Abstract

The extraction of character images is an important front-end step for optical character recognition (OCR) and other applications. This step matters because OCR systems usually extract salient features and operate on them, and noise not only destroys character features but also introduces spurious ones. We propose a new algorithm that removes unwanted background noise from a textual image. The algorithm is based on the observation that, across the scales of the wavelet transform, the magnitude of the intensity variation at character boundaries differs from that of noise. Therefore, most of the edges corresponding to character boundaries can be extracted at each scale by thresholding. The interior region of each character is then determined by a voting procedure that uses the arguments of the remaining edges. The recovered character interiors are solid and contain no holes. Because of the smoothing applied in computing the wavelet transform, the recovered characters tend to be fattened. To obtain a high-quality restoration of the character image, the precise locations of the characters in the original image are therefore estimated using a Bayesian criterion. The paper gives a detailed description of the algorithm together with a careful analysis of its free parameters. The method is simple, and experimental results are presented that suggest its effectiveness.
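
The thresholding step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it works on a single 1D scanline rather than a 2D image, it assumes a derivative-of-Gaussian filter scaled by sigma as a stand-in for the wavelet, and the names `gaussian_derivative_kernel`, `filter_same`, `modulus_maxima`, and `edges_by_scale`, the scales, and the threshold value are all hypothetical choices made for illustration. The voting and Bayesian steps are not covered.

```python
import math


def gaussian_derivative_kernel(sigma):
    """Sampled first derivative of a Gaussian, multiplied by sigma.

    The factor sigma is an assumed normalization: it keeps the response
    to a step edge roughly constant across scales, while the response
    to an isolated speckle shrinks as sigma grows.
    """
    radius = int(math.ceil(3 * sigma))
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    kernel = []
    for k in range(-radius, radius + 1):
        gauss = norm * math.exp(-(k * k) / (2.0 * sigma * sigma))
        kernel.append(sigma * (-k / (sigma * sigma)) * gauss)
    return kernel


def filter_same(signal, kernel):
    """Convolve signal with kernel; borders are handled by replication."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = min(max(i - (j - radius), 0), n - 1)
            acc += weight * signal[idx]
        out.append(acc)
    return out


def modulus_maxima(response, threshold):
    """Indices where |response| is a local maximum and at least threshold."""
    mags = [abs(v) for v in response]
    keep = []
    for i in range(1, len(mags) - 1):
        is_peak = mags[i] > mags[i - 1] and mags[i] >= mags[i + 1]
        if is_peak and mags[i] >= threshold:
            keep.append(i)
    return keep


def edges_by_scale(signal, scales, threshold):
    """Map each scale to the thresholded modulus maxima found at that scale."""
    result = {}
    for sigma in scales:
        response = filter_same(signal, gaussian_derivative_kernel(sigma))
        result[sigma] = modulus_maxima(response, threshold)
    return result


if __name__ == "__main__":
    # Synthetic scanline: one character stroke plus one noise speckle.
    scanline = [0.0] * 60
    for i in range(20, 30):
        scanline[i] = 1.0
    scanline[45] = 0.6
    for sigma, edges in edges_by_scale(scanline, [1.0, 2.0, 4.0], 0.2).items():
        print(sigma, edges)
```

In this toy setting the two stroke boundaries keep a modulus well above the assumed threshold at every scale, whereas the speckle response falls below it, which mirrors the observation the abstract relies on. A real document image would need a 2D transform and a threshold chosen per scale.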

Paper Details

Date Published: 23 October 1996
PDF: 13 pages
Proc. SPIE 2825, Wavelet Applications in Signal and Image Processing IV, (23 October 1996); doi: 10.1117/12.255222
Author Affiliations:
Wen-Liang Hwang, Institute of Information Science (Taiwan)
Fu Chang, Institute of Information Science (Taiwan)

Published in SPIE Proceedings Vol. 2825:
Wavelet Applications in Signal and Image Processing IV
Michael A. Unser; Akram Aldroubi; Andrew F. Laine, Editor(s)
