Text Extraction from Document Images using Edge Information

Please use this identifier to cite or link to this item: http://hdl.handle.net/2080/1236

Title:	Text Extraction from Document Images using Edge Information
Authors:	Grover, S; Arora, K; Mitra, S K
Issue Date:	2009
Publisher:	IEEE
Citation:	IEEE India Council Conference, INDICON 2009; Ahmedabad; 18 December 2009 through 20 December 2009; Category number CFP09598; Code 79708; Article number 5409409
Abstract:	Detecting text embedded in complex colored document images is a challenging problem. Text extraction has many potential uses, for example in image search and document archiving. In this paper, we propose a simple edge-based feature for this task. It aims to detect the textual regions of a document and separate them from the graphics portion. The algorithm relies on the sharp edges of characters, which are absent from pictorial regions. We find these edges and use them to distinguish text from images. This edge information can also be used for other image interpretation tasks.
URI:	http://dx.doi.org/10.1109/INDCON.2009.5409409 http://hdl.handle.net/2080/1236
Appears in Collections:	Conference Papers

Files in This Item:

File	Description	Size	Format
grover.pdf		632.29 kB	Adobe PDF
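
The edge-based idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the record does not describe in detail. It assumes a grayscale image given as a list of rows of integers in 0..255, and it labels a block as text when a large share of its pixels have a strong intensity jump to a neighbor. The function names, the block size, and both thresholds are hypothetical choices made only for this illustration.

```python
def edge_density_map(img, block=8, edge_thresh=60):
    """Return, per block, the fraction of pixels with a strong local gradient.

    img is a list of rows of grayscale values (0..255). block and
    edge_thresh are illustrative assumptions, not values from the paper.
    """
    h, w = len(img), len(img[0])
    densities = []
    for by in range(0, h, block):
        row = []
        for bx in range(0, w, block):
            strong = total = 0
            for y in range(by, min(by + block, h)):
                for x in range(bx, min(bx + block, w)):
                    # Forward differences, clamped at the image border.
                    gx = abs(img[y][min(x + 1, w - 1)] - img[y][x])
                    gy = abs(img[min(y + 1, h - 1)][x] - img[y][x])
                    total += 1
                    if max(gx, gy) >= edge_thresh:
                        strong += 1
            row.append(strong / total)
        densities.append(row)
    return densities


def classify_blocks(img, block=8, edge_thresh=60, density_thresh=0.2):
    """Mark a block True (text-like) when its edge density is high enough.

    density_thresh is a hypothetical cut-off chosen for this sketch.
    """
    return [[d >= density_thresh for d in row]
            for row in edge_density_map(img, block, edge_thresh)]
```

On a synthetic image whose left half has alternating black and white columns (sharp, stroke-like edges) and whose right half is a smooth gradient, the sketch would mark only the left block as text. Real documents would likely need a proper edge detector and tuned thresholds.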