Efficient Text Extraction Algorithm Using Color Clustering for Language Translation in Mobile Phone

Abstract

Many text extraction methods have been proposed, but none is suitable for a real system running on a device with low computational resources, either because their accuracy is insufficient or because they are too slow. To address this, we propose a text extraction algorithm for language translation of scene text images on mobile phones that is both fast and accurate. The algorithm uses very efficient computations to calculate the Principal Color Components of a previously quantized image, decides which of them are the main foreground and background colors, and then extracts the text in the image. We compared our algorithm with other algorithms using commercial OCR, achieving accuracy rates more than 12% higher while running twice as fast. Our method is also more robust against common degradations such as uneven illumination and blurring. The result is an attractive system for accurately separating foreground and background in scene text images on devices with low computational resources.
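
The pipeline outlined in the abstract (quantize the image, find its dominant colors, pick foreground and background, extract the text) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not define how the Principal Color Components are computed, so the sketch simply assumes they can be approximated by the two most frequent quantized colors, and it further assumes that the most frequent color is the background. The function names and the `bits` parameter are hypothetical.

```python
from collections import Counter

def quantize(pixel, bits=2):
    """Keep only the top `bits` bits of each 8-bit RGB channel."""
    shift = 8 - bits
    return tuple(c >> shift for c in pixel)

def bin_center(qcolor, bits=2):
    """Map a quantized color back to the center of its bin in 0..255."""
    shift = 8 - bits
    half = (1 << shift) >> 1
    return tuple((c << shift) + half for c in qcolor)

def sq_dist(a, b):
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def extract_text_mask(image, bits=2):
    """Return a binary mask (1 = assumed text pixel) for an RGB image.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    Assumption: the most frequent quantized color is the background and
    the second most frequent is the text (foreground) color.
    """
    # Histogram of quantized colors: a cheap, single-pass computation.
    counts = Counter(quantize(p, bits) for row in image for p in row)
    dominant = [bin_center(q, bits) for q, _ in counts.most_common(2)]
    if len(dominant) < 2:
        # Only one color present: nothing to separate.
        return [[0] * len(row) for row in image]
    background, foreground = dominant
    # Assign every pixel to whichever dominant color is nearer.
    return [
        [1 if sq_dist(p, foreground) < sq_dist(p, background) else 0 for p in row]
        for row in image
    ]

# Tiny example: two dark "text" pixels on a light background.
W, D = (250, 250, 250), (10, 10, 10)
example = [
    [W, W, W, W],
    [W, D, D, W],
    [W, W, W, W],
]
mask = extract_text_mask(example)
```

Coarse quantization keeps the color histogram small (64 bins at 2 bits per channel), which is the kind of saving that matters on a low-resource device. A real scene image would need a more careful choice of dominant colors than a raw frequency count.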

Cite this article

A. Canedo-Rodríguez, J. Hyoun Kim, S. Kim, J. Kelly, J. Hee Kim, S. Yi, S. Kiran Veeramachaneni and Y. Blanco-Fernández, "Efficient Text Extraction Algorithm Using Color Clustering for Language Translation in Mobile Phone," Journal of Signal and Information Processing, Vol. 3 No. 2, 2012, pp. 228-237. doi: 10.4236/jsip.2012.32031.

Conflicts of Interest

The authors declare no conflicts of interest.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.