Vision-Based Hand Gesture Spotting and Recognition Using CRF and SVM

Abstract

In this paper, a novel gesture spotting and recognition technique is proposed to handle hand gestures in continuous hand motion, based on Conditional Random Fields in conjunction with a Support Vector Machine. Firstly, the YCbCr color space and a 3D depth map are used to detect and segment the hand. The depth map is used to neutralize complex background scenes. Secondly, 3D spatio-temporal features of the hand volume, namely dynamic affine invariants such as elliptic Fourier and Zernike moments, are extracted, in addition to three orientation motion features. Finally, the hand gesture is spotted and recognized using the discriminative Conditional Random Fields model. In addition, a Support Vector Machine verifies the hand shape at the start and end points of each meaningful gesture, which makes the task robust to changes in view. Experiments demonstrate that the proposed method can successfully spot and recognize hand gestures from continuous hand motion data with a recognition rate of 92.50%.
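
The first stage described above, color-based hand detection gated by depth, can be illustrated with a minimal sketch. It is not the authors' implementation: the conversion uses the common full-range BT.601 (JFIF) formula, and the Cb/Cr skin ranges, the depth window, and all function names are assumptions chosen for illustration only.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # 8-bit (R, G, B)

# Assumed values, not taken from the paper: commonly quoted Cb/Cr skin
# ranges and a hypothetical depth window (in millimetres) for the hand.
CB_RANGE = (77.0, 127.0)
CR_RANGE = (133.0, 173.0)
DEPTH_RANGE_MM = (300.0, 900.0)


def rgb_to_ycbcr(r: int, g: int, b: int) -> Tuple[float, float, float]:
    """Full-range BT.601 (JFIF) RGB -> YCbCr conversion for 8-bit values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def is_hand_pixel(rgb: Pixel, depth_mm: float) -> bool:
    """True if the pixel is skin-colored AND lies inside the depth window."""
    _, cb, cr = rgb_to_ycbcr(*rgb)
    skin = (CB_RANGE[0] <= cb <= CB_RANGE[1]
            and CR_RANGE[0] <= cr <= CR_RANGE[1])
    near = DEPTH_RANGE_MM[0] <= depth_mm <= DEPTH_RANGE_MM[1]
    return skin and near


def hand_mask(image: List[List[Pixel]],
              depth: List[List[float]]) -> List[List[bool]]:
    """Binary hand mask for an image and its pixel-aligned depth map."""
    return [
        [is_hand_pixel(p, d) for p, d in zip(row_rgb, row_depth)]
        for row_rgb, row_depth in zip(image, depth)
    ]


if __name__ == "__main__":
    skin_tone = (220, 170, 140)
    image = [[skin_tone, skin_tone, (0, 0, 255)]]
    depth = [[600.0, 2000.0, 600.0]]  # the middle pixel is far background
    print(hand_mask(image, depth))
```

The depth gate is what lets a skin-colored background region be rejected: in the usage example the second pixel has the same color as the first but falls outside the assumed depth window, so only the first pixel is kept.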

Cite as:

Ghaleb, F., Youness, E., Elmezain, M. and Dewdar, F. (2015) Vision-Based Hand Gesture Spotting and Recognition Using CRF and SVM. Journal of Software Engineering and Applications, 8, 313-323. doi: 10.4236/jsea.2015.87032.

Conflicts of Interest

The authors declare no conflicts of interest.

Copyright © 2023 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.