Remote Study as a Modern Format to Improve Our Lip-Reading Skills

Based on our Homovisemes Corpus

Abstract

This article continues our earlier work, in which we describe the barriers that a person with a hearing impairment faces when trying to perceive and understand spoken language visually by observing the speaker's articulation. Visual information is difficult to interpret correctly because one viseme corresponds to several phonemes. As a result, different words can share the same articulation pattern in spoken language. We call such words "homovisemes", a term introduced by the authors. Russian speech contains many of them. Such words are visually interchangeable and therefore indistinguishable from each other, not only out of context but sometimes even when context is available. The purpose of this research is to analyze whether it is possible and expedient to use our homovisemes corpus to address certain difficulties in learning to lip-read. Remote study is currently one of the promising areas in education. In the article, we investigate how a distance format can be used to analyze the structure of lip-reading on the basis of our corpus of Russian homovisemes. The corpus consists of material we prepared, grouped into separate chains according to the principle of identical articulatory shells of words; pseudo-words are excluded. It is intended to familiarize users with homovisemes in spoken Russian, which should help them expand their vocabulary and perceive spoken messages visually with greater ease. We present examples of possible variants of recognized words with similar visemes, drawn from the corpus. Our results allow us to draw conclusions about the relevance of the chosen topic and will help to predict the uncertainty involved in choosing the correct variant. Remote study is a modern study format that aims to develop the personal abilities of people with hearing impairments, to improve their ability to understand spoken utterances, and to support their social adaptation.
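The grouping principle described in the abstract (words collected into chains when their articulatory shells coincide) can be illustrated with a minimal sketch. The letter-to-viseme table below is a hypothetical, heavily simplified assumption made only for this illustration; it is not the authors' classification, and the function names `viseme_key` and `homoviseme_chains` are likewise invented here. The sample words are ordinary Russian words, not entries taken from the corpus.

```python
from collections import defaultdict

# Hypothetical, heavily simplified letter-to-viseme table (assumption for
# illustration only; not the classification used in the authors' corpus).
# Consonants that look alike on the lips share one viseme label.
VISEME_CLASS = {
    "п": "P", "б": "P", "м": "P",   # bilabials
    "ф": "F", "в": "F",             # labiodentals
    "т": "T", "д": "T", "н": "T",   # dental stops and nasal
}


def viseme_key(word):
    """Return the word's viseme sequence; unmapped letters stand for themselves."""
    return tuple(VISEME_CLASS.get(ch, ch) for ch in word.lower())


def homoviseme_chains(words):
    """Group words with identical viseme sequences; keep groups of two or more."""
    groups = defaultdict(list)
    for word in words:
        groups[viseme_key(word)].append(word)
    return [chain for chain in groups.values() if len(chain) > 1]


sample = ["папа", "баба", "мама", "том", "дом", "вата", "фата", "кот"]
chains = homoviseme_chains(sample)
for chain in chains:
    # The chain length is a rough indicator of how many candidates
    # a lip-reader has to choose between.
    print(len(chain), chain)
```

Under these assumptions, the length of a chain gives a simple measure of the uncertainty a reader faces when a word is seen without context: the longer the chain, the more candidates remain visually indistinguishable.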

Author Biographies

Maria Alexandrovna Myasoedova, V.A. Trapeznikov Institute of Control Sciences, Russian Academy of Sciences

Researcher

Zinaida Pavlovna Myasoedova, V.A. Trapeznikov Institute of Control Sciences, Russian Academy of Sciences

Researcher

Published
2022-07-20
How to Cite
MYASOEDOVA, Maria Alexandrovna; MYASOEDOVA, Zinaida Pavlovna. Remote Study as a Modern Format to Improve Our Lip-Reading Skills. Modern Information Technologies and IT-Education, [S.l.], v. 18, n. 2, p. 374-382, July 2022. ISSN 2411-1473. Available at: <http://sitito.cs.msu.ru/index.php/SITITO/article/view/867>. Date accessed: 09 Oct. 2025. doi: https://doi.org/10.25559/SITITO.18.202202.374-382.
Section
Research and development in the field of new IT and their applications