research-article

Text extraction from graphical document images using sparse representation

Authors:
Thai V. Hoang, Hanoi University of Technology, Hanoi, Vietnam
Salvatore Tabbone, Hanoi University of Technology, Hanoi, Vietnam

DAS '10: Proceedings of the 9th IAPR International Workshop on Document Analysis Systems, June 2010, Pages 143–150, https://doi.org/10.1145/1815330.1815349

Published: 09 June 2010

ABSTRACT

This paper presents a novel method for extracting text from graphical document images. Such images, which contain both text and graphics components, are treated as two-dimensional signals in which text and graphics have different morphological characteristics. The proposed algorithm relies on a sparse representation framework with two appropriately chosen discriminative overcomplete dictionaries, each of which gives a sparse representation of one type of signal and a non-sparse representation of the other. Text and graphics components are separated by promoting sparse representation of the input image in these two dictionaries. In post-processing, heuristic rules group the text components into text strings. The proposed method overcomes the problem of text touching graphics. Preliminary experiments show promising results on different types of documents.
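
The separation idea in the abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: the two dictionaries here are an identity (spike) basis and an orthonormal DCT basis, chosen only because each is sparse for one component and non-sparse for the other, whereas the author tags point to curvelet and redundant wavelet transforms. The alternating soft-thresholding loop with a decreasing threshold, and all names (`dct_matrix`, `soft`, `separate`), are assumptions made for illustration.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II analysis matrix; each row is one cosine atom."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    C[0, :] /= np.sqrt(2.0)
    return C

def soft(x, lam):
    """Soft thresholding, the proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def separate(y, n_iter=100):
    """Split y into a part sparse in the spike basis and a part sparse in the DCT basis.

    Each pass fits one component to the residual left by the other,
    keeping only coefficients above a threshold that decreases to zero.
    """
    n = y.size
    C = dct_matrix(n)
    spikes = np.zeros(n)
    smooth = np.zeros(n)
    lam_max = max(np.abs(y).max(), np.abs(C @ y).max())
    for lam in np.linspace(lam_max, 0.0, n_iter):
        # Update the DCT-sparse part from the residual y - spikes.
        smooth = C.T @ soft(C @ (y - spikes), lam)
        # Update the spike-sparse part from the residual y - smooth.
        spikes = soft(y - smooth, lam)
    return spikes, smooth

# Toy mixture: two cosine atoms (stand-in for one component) plus three spikes.
n = 256
coeffs = np.zeros(n)
coeffs[5], coeffs[20] = 10.0, -8.0
true_smooth = dct_matrix(n).T @ coeffs
true_spikes = np.zeros(n)
true_spikes[[40, 130, 200]] = [5.0, -4.0, 6.0]
y = true_smooth + true_spikes

est_spikes, est_smooth = separate(y)
print("spike positions found:", np.flatnonzero(np.abs(est_spikes) > 1.0))
```

Because the final threshold is zero, the two estimated parts sum back to the input exactly; the quality of the split depends on how incoherent the two dictionaries are with respect to each other, which is why the choice of dictionaries matters in the method described above.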

Index Terms

Text extraction from graphical document images using sparse representation
1. Applied computing
  1. Document management and text processing
    1. Document capture
      1. Document analysis
      2. Graphics recognition and interpretation
2. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Image segmentation
        Video segmentation

Published in
DAS '10: Proceedings of the 9th IAPR International Workshop on Document Analysis Systems
June 2010
490 pages
ISBN:9781605587738
DOI:10.1145/1815330
General Chairs:
David Doermann (University of Maryland, College Park), Venu Govindaraju (University at Buffalo, SUNY), Daniel Lopresti (Lehigh University), Prem Natarajan (Raytheon BBN Technologies)
Copyright © 2010 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
curvelet transform
morphological component analysis
redundant wavelet transform
sparse representation
text component grouping
text/graphics separation
Article Metrics
- Total Citations: 24
- Total Downloads: 416
- Downloads (Last 12 months): 4
- Downloads (Last 6 weeks): 0