Proceedings Paper

Simultaneous detection of vertical and horizontal text lines based on perceptual organization
Author(s): Claudie Faure; Nicole Vincent

Paper Abstract

A page of a document is a set of small components which a human reader groups into higher-level components, such as lines and text blocks. Document image analysis aims to detect these components in document images. We propose encoding local information by considering the properties that determine perceptual grouping. Each connected component is labelled according to the location of its nearest neighbouring connected component. These labelled components are the input to a rule-based incremental process. Vertical and horizontal text lines are detected without any prior assumption about their direction. Touching characters that belong to different lines are detected early and excluded from the grouping process to avoid line merging. The tolerance for grouping components increases in the course of the process until the final decision. After each step of the grouping process, conflict resolution rules are activated. This work was motivated by the automatic detection of Figure&Caption pairs in the documents of the historical collection of the BIUM digital library (Bibliotheque InterUniversitaire Medicale). The images used in this study belong to this collection.
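
The labelling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how components are represented, which distance is used, or what the labels are. The sketch assumes that each connected component is given as a bounding box, that "nearest" means the smallest gap between bounding boxes, and that the label is one of four hypothetical directions (left, right, above, below). All function names are made up for illustration.

```python
from math import hypot

# Assumed representation: a component is its bounding box (x0, y0, x1, y1),
# with y growing downward as in image coordinates.


def box_gap(a, b):
    """Gap between two boxes (0.0 if they overlap). Assumed distance measure."""
    dx = max(b[0] - a[2], a[0] - b[2], 0.0)
    dy = max(b[1] - a[3], a[1] - b[3], 0.0)
    return hypot(dx, dy)


def direction(a, b):
    """Hypothetical label: where box b lies relative to box a."""
    dx = (b[0] + b[2]) / 2 - (a[0] + a[2]) / 2
    dy = (b[1] + b[3]) / 2 - (a[1] + a[3]) / 2
    if abs(dx) >= abs(dy):
        return "right" if dx >= 0 else "left"
    return "below" if dy >= 0 else "above"


def label_components(boxes):
    """Return, for each box, (index of nearest neighbour, direction label)."""
    labels = []
    for i, a in enumerate(boxes):
        candidates = [(box_gap(a, b), j) for j, b in enumerate(boxes) if j != i]
        if not candidates:
            labels.append((None, None))  # a lone component has no neighbour
            continue
        _, j = min(candidates)  # ties are broken by the lower index
        labels.append((j, direction(a, boxes[j])))
    return labels


# Three components side by side, and two stacked components further away.
example_boxes = [
    (0, 0, 10, 10), (12, 0, 22, 10), (24, 0, 34, 10),
    (100, 0, 110, 10), (100, 13, 110, 23),
]
example_labels = label_components(example_boxes)
```

In this toy layout the side-by-side components receive left/right labels and the stacked components receive above/below labels, which suggests how such labels could feed a grouping process that makes no prior assumption about line direction. The grouping rules, the increasing tolerance, and the conflict resolution described in the abstract are not sketched here.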

Paper Details

Date Published: 19 January 2009
PDF: 8 pages
Proc. SPIE 7247, Document Recognition and Retrieval XVI, 72470M (19 January 2009); doi: 10.1117/12.805504
Claudie Faure, CNRS-LTCI, TELECOM ParisTech (France)
Nicole Vincent, Univ. Paris Descartes (France)

Published in SPIE Proceedings Vol. 7247:
Document Recognition and Retrieval XVI
Kathrin Berkner; Laurence Likforman-Sulem, Editor(s)

© SPIE.