ABSTRACT
In this paper, we present (a) a method for identifying documents captured with low-resolution devices such as webcams, digital cameras, or mobile phones and (b) a technique for extracting their textual content without performing OCR. The first method associates a hierarchically structured visual signature with the low-resolution document image and matches it against the visual signatures of the original high-resolution document images, which are stored in PDF form in a repository. The matching algorithm follows the signature hierarchy, which speeds up the search by guiding it towards the most promising regions of the solution space. In a second step, the content of the original PDF document is extracted, structured, and matched with its corresponding high-resolution visual signature. Finally, the matched content is attached to the low-resolution document image's visual signature, which greatly enriches the captured document's content and indexing. We present both the identification and extraction methods and evaluate them on various documents, resolutions, and lighting conditions, using different capture devices.
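The idea of following the signature hierarchy during matching can be illustrated with a minimal sketch. It is not the authors' implementation: the `Signature` class, the feature vectors, the `level_distance` function, the `keep` parameter, and the file names are all hypothetical and chosen only to show how a coarse-to-fine search might prune the repository level by level.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass
class Signature:
    # Hypothetical layout: levels[0] is the coarsest description of the page,
    # levels[-1] the finest. Each level is a plain feature vector.
    levels: List[Sequence[float]]


def level_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def identify(query: Signature,
             repository: Dict[str, Signature],
             keep: int = 2) -> Optional[str]:
    """Coarse-to-fine search: at each level, keep only the best candidates."""
    candidates = list(repository)
    for depth in range(len(query.levels)):
        # Rank the surviving candidates by their distance at this level.
        ranked = sorted(
            candidates,
            key=lambda doc_id: level_distance(
                query.levels[depth], repository[doc_id].levels[depth]
            ),
        )
        # Prune: finer levels are only compared for the closest candidates.
        candidates = ranked[:keep]
        if len(candidates) <= 1:
            break
    return candidates[0] if candidates else None


# Assumed example data: signatures of three stored PDF documents and one
# signature computed from a low-resolution capture.
repository = {
    "slides.pdf": Signature([[0.2, 0.8], [0.1, 0.1, 0.4, 0.4]]),
    "paper.pdf": Signature([[0.7, 0.3], [0.3, 0.4, 0.2, 0.1]]),
    "report.pdf": Signature([[0.6, 0.4], [0.1, 0.2, 0.3, 0.4]]),
}
query = Signature([[0.68, 0.32], [0.28, 0.42, 0.2, 0.1]])

best_match = identify(query, repository)
print(best_match)
```

In this sketch the coarse level already discards `slides.pdf`, so the finer and more expensive comparison runs on two candidates only. A real system would need features and thresholds that are robust to resolution and lighting changes, which this toy example does not address.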