Improving the OCR of Low Contrast, Small Fonts, Dark Background Forms Using Correlated Zoom and Resolution Technique (CZRT)

Abstract

Many formal institutions, companies, hospitals, and laboratories sometimes need to exchange hand-signed reports through modern communication channels such as fax and e-mail. The quality of both the scanned documents and the original paper often makes it difficult to convert such images to text. In addition, font type and size, contrast, and background darkness adversely affect the accuracy of the resulting text. This work therefore investigates the relationship between scanned-document zoom and scanning resolution in dots per inch (DPI) for a special type of scanned form, to enable the design of an algorithm that accounts for such cases. It is found that much higher zoom and resolution levels are needed to achieve acceptable recognition for dark, low-contrast, small-font forms. It is also found that the optimum zoom level is set by the number of recognized words, as words are more difficult to learn and analyze.
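
The kind of zoom and resolution study described in the abstract can be illustrated with a small parameter sweep. The sketch below is not the CZRT algorithm itself, whose details are not given here. It only assumes that some OCR routine can be called for a given zoom percentage and DPI, and it ranks the combinations by how many expected words were recognized. The names `Recognizer`, `count_correct_words`, and `sweep_zoom_and_dpi` are hypothetical and introduced for illustration only.

```python
from itertools import product
from typing import Callable, Iterable, List, Sequence, Tuple

# Hypothetical OCR hook (an assumption, not from the paper):
# takes (zoom_percent, dpi) and returns the list of recognized words.
Recognizer = Callable[[int, int], List[str]]


def count_correct_words(recognized: Iterable[str], expected: Iterable[str]) -> int:
    """Count the expected words that appear in the OCR output (case-insensitive)."""
    recognized_set = {word.lower() for word in recognized}
    return sum(1 for word in expected if word.lower() in recognized_set)


def sweep_zoom_and_dpi(
    recognize: Recognizer,
    expected: Sequence[str],
    zoom_levels: Sequence[int],
    dpi_levels: Sequence[int],
) -> List[Tuple[Tuple[int, int], int]]:
    """Score every (zoom, dpi) pair and return the results, best first.

    Ties are broken in favour of the lower DPI, then the lower zoom,
    on the assumption that a cheaper scan is preferable.
    """
    results = []
    for zoom, dpi in product(zoom_levels, dpi_levels):
        score = count_correct_words(recognize(zoom, dpi), expected)
        results.append(((zoom, dpi), score))
    results.sort(key=lambda item: (-item[1], item[0][1], item[0][0]))
    return results
```

In practice `recognize` would wrap a real OCR engine applied to the rescaled scan. Ranking by recognized-word count mirrors the abstract's observation that the number of recognized words sets the optimum zoom level, while the tie-breaking rule is purely a design choice of this sketch.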

Share and Cite:

Iskandarani, M. (2015) Improving the OCR of Low Contrast, Small Fonts, Dark Background Forms Using Correlated Zoom and Resolution Technique (CZRT). Journal of Data Analysis and Information Processing, 3, 34-42. doi: 10.4236/jdaip.2015.33005.

Conflicts of Interest

The authors declare no conflicts of interest.

Copyright © 2023 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.