Proceedings Paper

Retrieval of historical documents by word spotting
Author(s): Nikoleta Doulgeri; Ergina Kavallieratou

Paper Abstract

Implementing word spotting is difficult, and even more so for historical documents, since it requires character recognition and indexing of the document images. A general technique for word spotting is presented that is independent of OCR: the user's text queries are automatically rendered as word images and compared with the word images extracted from the document images. The proposed system does not require training. The only required preprocessing task is determining the alphabet. Global shape features are used to describe the words. They are kept very general in order to capture the form of the word, and are normalized to cope with the usual variations in resolution, word width, and font. A novel technique that makes use of interpolation is presented. In our experiments, we analyze the system's dependence on its parameters and show that its performance is comparable to that of trainable systems.
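
The abstract does not specify which global shape features are used or how interpolation enters the method, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a column-ink profile as the global feature, linear interpolation to resample that profile to a fixed length (so that word width and resolution do not matter), and Euclidean distance for ranking. The names `profile_descriptor`, `distance`, and `spot` are hypothetical and do not come from the paper.

```python
import numpy as np

def profile_descriptor(word_img, n_samples=32):
    """Fixed-length column-ink profile of a binary word image (1 = ink).

    Assumption: this feature is illustrative, not the paper's feature set.
    """
    img = np.asarray(word_img, dtype=float)
    # Fraction of ink pixels in each column; dividing by the image height
    # makes the values independent of vertical resolution.
    profile = img.sum(axis=0) / img.shape[0]
    # Resample the profile to n_samples points by linear interpolation so
    # that words of different pixel widths give descriptors of equal length.
    src = np.linspace(0.0, 1.0, num=profile.size)
    dst = np.linspace(0.0, 1.0, num=n_samples)
    return np.interp(dst, src, profile)

def distance(a, b):
    """Euclidean distance between two descriptors (an assumed metric)."""
    return float(np.linalg.norm(a - b))

def spot(query_img, candidate_imgs, n_samples=32):
    """Return candidate indices ordered from best to worst match."""
    q = profile_descriptor(query_img, n_samples)
    dists = [distance(q, profile_descriptor(c, n_samples))
             for c in candidate_imgs]
    return sorted(range(len(candidate_imgs)), key=lambda i: dists[i])
```

In this sketch the query image would be the synthetic rendering of the user's text query, and the candidates would be the word images segmented from the document pages. No training step is involved, only a descriptor comparison.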

Paper Details

Date Published: 19 January 2009
PDF: 10 pages
Proc. SPIE 7247, Document Recognition and Retrieval XVI, 724706 (19 January 2009); doi: 10.1117/12.805602
Nikoleta Doulgeri, Univ. of the Aegean (Greece)
Ergina Kavallieratou, Univ. of the Aegean (Greece)

Published in SPIE Proceedings Vol. 7247:
Document Recognition and Retrieval XVI
Kathrin Berkner; Laurence Likforman-Sulem, Editor(s)
