ABSTRACT
In this paper, we present a robust, font-independent Gurmukhi OCR system that also performs reasonably well on old documents. The OCR is based on four classifiers operating in serial and parallel modes. The results of the classifiers operating in parallel are combined using a corpus-based weighted voting method. Combining multiple classifiers so that their individual weaknesses are compensated while their individual strengths are preserved yields significantly better performance than a single classifier can achieve. The problem of broken characters, which frequently appear in old documents, is also tackled using a structural-feature-based algorithm.
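The abstract does not spell out how the voting weights are computed, so the following is only a minimal sketch of generic weighted voting, not the authors' method. It assumes each classifier has a single reliability weight (for example, its accuracy measured on a labelled corpus); the function name `weighted_vote`, the classifier names, and the weight values are all hypothetical.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions: dict mapping classifier name -> predicted label
    weights: dict mapping classifier name -> non-negative weight
             (hypothetical, e.g. accuracy on a labelled corpus)
    """
    if not predictions:
        raise ValueError("at least one prediction is required")

    scores = defaultdict(float)
    for name, label in predictions.items():
        # Classifiers with no known weight contribute nothing.
        scores[label] += weights.get(name, 0.0)

    # Highest score wins; ties go to the label that sorts first,
    # so the result is deterministic.
    return min(scores, key=lambda label: (-scores[label], label))


# Illustrative values only: one reliable classifier outvotes two weak ones.
predictions = {"clf_a": "ਕ", "clf_b": "ਖ", "clf_c": "ਖ"}
weights = {"clf_a": 0.9, "clf_b": 0.4, "clf_c": 0.4}
result = weighted_vote(predictions, weights)
```

With these illustrative weights the single high-weight classifier prevails over the two-vote majority, which is the way weighted voting differs from plain majority voting.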
Index Terms
- Optical character recognition of Gurmukhi script using multiple classifiers