DOI:10.1145/1390334.1390496
poster

A word shape coding method for camera-based document images

Published: 20 July 2008

Abstract

This paper reports a word shape coding method to facilitate retrieval of camera-based document images without OCR. Due to perspective distortion, many reported word shape coding methods fail on camera-based images. In this paper, the problem is addressed by approximating the perspective transformation with an affine transformation and employing an affine invariant, namely the length ratio, to represent the connected components. Components in a document image are classified into a few clusters, each of which is assigned a representative symbol. Retrieval is based on "words" composed of these symbols. The experimental results show that the proposed method achieves an average retrieval precision of 93.43% and recall of 94.22%.
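
The abstract does not give the exact definition of the length ratio used in the paper. As a minimal sketch of why a length ratio can serve as an affine invariant, the Python example below takes three collinear points and checks that the ratio of the two segment lengths is unchanged by an affine map. The function names, the sample points, and the matrix are illustrative assumptions, not the authors' implementation.

```python
import math

def affine(p, m, t):
    """Apply the affine map x -> m*x + t to a 2-D point p."""
    x, y = p
    return (m[0][0] * x + m[0][1] * y + t[0],
            m[1][0] * x + m[1][1] * y + t[1])

def length(p, q):
    """Euclidean distance between points p and q."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def length_ratio(a, b, c):
    """Ratio |ab| / |bc| for three collinear points a, b, c."""
    return length(a, b) / length(b, c)

# Hypothetical collinear points, e.g. sampled along one stroke direction.
a, b, c = (0.0, 0.0), (2.0, 1.0), (8.0, 4.0)

# An arbitrary non-singular affine map (shear, scale, rotation, translation).
m = ((1.3, 0.4), (-0.2, 0.9))
t = (5.0, -3.0)

before = length_ratio(a, b, c)
after = length_ratio(*(affine(p, m, t) for p in (a, b, c)))

# Individual lengths change under the map, but their ratio does not.
print(math.isclose(before, after))
```

The invariance holds only for segments on the same line or on parallel lines. Under a true perspective transformation the ratio is preserved only approximately, which is the approximation the abstract relies on.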



    Published In

    SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
    July 2008
    934 pages
    ISBN:9781605581644
    DOI:10.1145/1390334

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. document image retrieval
    2. perspective distortion

    Qualifiers

    • Poster

    Conference

    SIGIR '08

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%
