Proceedings Paper

Simultaneous segmentation and recognition of Arabic printed text using linguistic concepts of vocabulary
Author(s): Mohamed Ben Halima; Adel M. Alimi

Paper Abstract

In this paper, we propose a new approach to the analysis and recognition of printed Arabic text. The approach is based on linguistic concepts of Arabic vocabulary. It categorizes the words of a text as decomposable (derived from a root) or indecomposable (not derived from a root) and puts forth morpho-syntactic characterization hypotheses for each word. For decomposable words, we attempt to recognize the basic morphemes of the word (antefix, prefix, infix, suffix, postfix, and root), in contrast to existing approaches, which usually recognize the word as a single entity using a holistic method.

Paper Details

Date Published: 19 January 2009
PDF: 10 pages
Proc. SPIE 7247, Document Recognition and Retrieval XVI, 72470T (19 January 2009); doi: 10.1117/12.805617
Mohamed Ben Halima, The High School of National Engineering of Sfax (Tunisia)
Adel M. Alimi, The High School of National Engineering of Sfax (Tunisia)

Published in SPIE Proceedings Vol. 7247:
Document Recognition and Retrieval XVI
Kathrin Berkner; Laurence Likforman-Sulem, Editor(s)

© SPIE. Terms of Use