Artificial Intelligence in Dynamic Gesture Recognition Tasks

Yevgeniy Rashitovich Muratov; Mikhail Borisovich Nikiforov; Artem Mikhailovich Skachkov

doi:10.25559/SITITO.16.202004.883-892

Yevgeniy Rashitovich Muratov, Ryazan State Radio Engineering University named after V.F. Utkin, http://orcid.org/0000-0002-1664-3954
Mikhail Borisovich Nikiforov, Ryazan State Radio Engineering University named after V.F. Utkin, http://orcid.org/0000-0002-4796-0776
Artem Mikhailovich Skachkov, Ryazan State Radio Engineering University named after V.F. Utkin, http://orcid.org/0000-0001-7902-7668


Abstract

Developing effective methods for recognizing human hand gestures is a relevant problem from both a scientific and an applied standpoint. Gesture recognition methods underlie contactless interfaces for controlling technical systems. The most significant application areas of gesture recognition systems are automatic sign language interpretation and contactless control of technical systems. Almost all existing implementations of recognition methods work reliably when the hand is in front of a uniform background. In practice, however, such a use case is unlikely. The probability of correct recognition drops sharply against a complex background, when the hand in the image overlaps the face or other body parts with exposed skin. Hardware implementation requirements impose additional constraints on the algorithms. The target devices must be compact systems on a chip (SoC) with low power consumption, small size, and low cost and, consequently, low computing power. Given the above, improving the efficiency of gesture recognition algorithms and methods should be recognized as a relevant problem.
Two approaches to recognizing dynamic hand gestures are considered: analytical and neural network based. It is shown that artificial intelligence techniques can increase the reliability of gesture recognition under difficult video surveillance conditions. However, neural network algorithms cannot deliver high performance on single-board computers that lack an NPU or a powerful GPU module.
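
The difficulty noted in the abstract, where a hand overlapping the face or other exposed skin becomes hard to isolate, can be illustrated with a toy example. The sketch below is not the authors' method. It assumes a crude RGB skin-colour rule and a synthetic frame, and all names (`is_skin`, `count_skin_regions`, `make_scene`) are hypothetical. It only shows that colour-based segmentation yields two separate regions when the hand is away from the face and a single merged region when they overlap.

```python
from collections import deque

SKIN = (200, 150, 120)       # assumed skin-like RGB colour for the toy frame
BACKGROUND = (30, 60, 90)    # assumed non-skin background colour


def is_skin(pixel):
    """Crude RGB skin rule (an assumed heuristic, not taken from the paper)."""
    r, g, b = pixel
    return r > 95 and g > 40 and b > 20 and r > g and r > b and r - min(g, b) > 15


def count_skin_regions(image):
    """Count 4-connected regions of skin-coloured pixels using BFS flood fill."""
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    regions = 0
    for y in range(height):
        for x in range(width):
            if seen[y][x] or not is_skin(image[y][x]):
                continue
            regions += 1
            seen[y][x] = True
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and not seen[ny][nx] and is_skin(image[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return regions


def make_scene(hand_left):
    """Build a synthetic 12x20 frame with a fixed 'face' block and a movable 'hand' block."""
    image = [[BACKGROUND] * 20 for _ in range(12)]
    for y in range(2, 8):            # face: rows 2..7, columns 2..7
        for x in range(2, 8):
            image[y][x] = SKIN
    for y in range(5, 10):           # hand: rows 5..9, four columns wide
        for x in range(hand_left, hand_left + 4):
            image[y][x] = SKIN
    return image


separate = count_skin_regions(make_scene(hand_left=12))    # hand away from the face
overlapping = count_skin_regions(make_scene(hand_left=6))  # hand crosses the face
print(separate, overlapping)
```

In the overlapping case the hand can no longer be separated from the face by colour alone, which is the kind of situation where additional cues (motion, shape, or a learned detector) become necessary.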

About the authors

Yevgeniy Rashitovich Muratov, Ryazan State Radio Engineering University named after V.F. Utkin

Associate Professor of the Department of Electronic Computers, Candidate of Technical Sciences, Associate Professor

Mikhail Borisovich Nikiforov, Ryazan State Radio Engineering University named after V.F. Utkin

Director of the "СпецЭВМ" Research and Education Center, Deputy Head of the Department of Electronic Computers, Candidate of Technical Sciences, Associate Professor, Corresponding Member of the Academy of Informatization of Education

Artem Mikhailovich Skachkov, Ryazan State Radio Engineering University named after V.F. Utkin

Master's student of the Department of Electronic Computers
