Artificial Intelligence in Recognition of Dynamic Gestures
Abstract
The development of effective methods for recognizing human hand gestures is an urgent task from both a scientific and an applied standpoint. Gesture recognition methods underpin contactless interfaces for controlling technical systems; their most significant applications are automatic sign language translation and contactless control of equipment. Almost all existing implementations recognize gestures reliably only when the hand appears against a uniform background, a condition that rarely holds in practice. Against a complex background, the probability of correct recognition drops sharply when the hand in the image overlaps the face or other areas of exposed skin. Hardware requirements impose additional restrictions on the algorithms: the target devices are compact systems-on-chip (SoC) with low power consumption, size, and price, and therefore with limited computing power. Given these constraints, increasing the efficiency of gesture recognition algorithms and methods remains a relevant problem.
Two approaches to recognizing dynamic hand gestures are considered: analytical and neural-network-based. It is shown that artificial intelligence techniques can increase the reliability of gesture recognition under complex video surveillance conditions. However, neural network algorithms cannot achieve high performance on single-board computers that lack an NPU or a powerful GPU module.
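To illustrate the failure mode described above (not the article's actual implementation), analytical pipelines commonly begin with rule-based skin-color segmentation. A minimal NumPy sketch of a classic RGB skin threshold shows why a hand crossing the face is ambiguous: both regions satisfy the same color mask, so color alone cannot separate them. The threshold values are illustrative assumptions, not parameters taken from the article.

```python
import numpy as np

def skin_mask(rgb: np.ndarray) -> np.ndarray:
    """Rule-based RGB skin threshold (classic heuristic, illustrative only).

    rgb: H x W x 3 uint8 image. Returns a boolean mask of skin-like pixels.
    """
    r = rgb[..., 0].astype(np.int32)
    g = rgb[..., 1].astype(np.int32)
    b = rgb[..., 2].astype(np.int32)
    # Since the rule also requires r > g and r > b, max(r, g, b) == r,
    # so the max-min spread test reduces to r - min(g, b) > 15.
    return (
        (r > 95) & (g > 40) & (b > 20)
        & (r - np.minimum(g, b) > 15)
        & (np.abs(r - g) > 15)
        & (r > g) & (r > b)
    )

# A "hand" pixel and a "face" pixel with similar skin tones both pass the
# mask, so when the hand overlaps the face the two regions merge into one
# connected skin blob and segmentation-based recognition degrades.
hand = np.array([[[220, 160, 130]]], dtype=np.uint8)
face = np.array([[[210, 150, 120]]], dtype=np.uint8)
print(skin_mask(hand)[0, 0], skin_mask(face)[0, 0])  # → True True
```

Neural-network detectors avoid this particular ambiguity by learning shape and texture cues rather than color alone, which is consistent with the reliability gain reported above, at the cost of the computational load that low-power SoC platforms struggle to meet.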

This work is licensed under a Creative Commons Attribution 4.0 International License.
