DOI: 10.1145/1815330.1815349
research-article

Text extraction from graphical document images using sparse representation

Published: 09 June 2010

ABSTRACT

This paper presents a novel method for extracting text from graphical document images. A graphical document image, which contains both text and graphics components, is treated as a two-dimensional signal in which text and graphics have different morphological characteristics. The proposed algorithm relies on a sparse representation framework with two appropriately chosen discriminative overcomplete dictionaries: each dictionary gives a sparse representation of one type of signal and a non-sparse representation of the other. Text and graphics components are separated by promoting sparse representations of the input image in these two dictionaries. In post-processing, heuristic rules group the text components into text strings. The proposed method overcomes the problem of text touching graphics. Preliminary experiments show promising results on different types of documents.
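
The abstract does not name the two dictionaries or the optimisation scheme, so the following is only a toy sketch of the general idea, in the spirit of morphological component analysis. It works on a 1-D signal rather than an image and makes several assumptions that are not taken from the paper: an orthonormal DCT stands in for the dictionary that sparsely represents the smooth component, the identity (single-sample spikes) stands in for the other dictionary, and the linearly decreasing hard-threshold schedule, `n_iter`, and `lam_min` are illustrative choices. All function names are hypothetical.

```python
import numpy as np


def dct_matrix(n):
    """Return the n x n orthonormal DCT-II matrix (rows are atoms)."""
    k = np.arange(n)[:, None]   # frequency index
    t = np.arange(n)[None, :]   # sample index
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    d[0, :] /= np.sqrt(2.0)     # rescale the constant atom to unit norm
    return d


def hard_threshold(c, lam):
    """Zero every coefficient whose magnitude does not exceed lam."""
    return np.where(np.abs(c) > lam, c, 0.0)


def separate(y, n_iter=50, lam_min=1.0):
    """Split y into a DCT-sparse part and a spike-sparse part.

    Assumed scheme: alternate between the two parts, re-estimating each
    from the residual left by the other, while the threshold decreases
    linearly from the largest coefficient down to lam_min.
    """
    D = dct_matrix(y.size)
    smooth = np.zeros_like(y)
    spikes = np.zeros_like(y)
    lam_max = max(np.abs(D @ y).max(), np.abs(y).max())
    for lam in np.linspace(lam_max, lam_min, n_iter):
        # Keep only the large DCT coefficients of what spikes do not explain.
        smooth = D.T @ hard_threshold(D @ (y - spikes), lam)
        # Keep only the large samples of what the smooth part does not explain.
        spikes = hard_threshold(y - smooth, lam)
    return smooth, spikes


def demo():
    """Build a synthetic mixture and separate it (toy data, not from the paper)."""
    n = 256
    D = dct_matrix(n)
    coeffs = np.zeros(n)
    coeffs[[5, 12]] = [8.0, 5.0]            # two active DCT atoms
    smooth_true = D.T @ coeffs
    spikes_true = np.zeros(n)
    spikes_true[[20, 90, 200]] = [3.0, -4.0, 3.5]
    smooth, spikes = separate(smooth_true + spikes_true)
    return smooth_true, spikes_true, smooth, spikes
```

Calling `demo()` returns the true and the recovered components. On this toy input the two dictionaries are nearly incoherent, so each component is sparse in one dictionary and spread out in the other, which is the property the abstract relies on. Real document images would need 2-D dictionaries suited to text and to graphics, and a threshold floor matched to the noise level.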

Published in

DAS '10: Proceedings of the 9th IAPR International Workshop on Document Analysis Systems
June 2010
490 pages
ISBN: 9781605587738
DOI: 10.1145/1815330

Copyright © 2010 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Publisher

Association for Computing Machinery
New York, NY, United States
