The Improved Algorithm for Calculation of the Contextual Words Meaning in the Text
Abstract
This paper considers modifications of the context-calculation algorithm published in [1] and proposes a new solution for calculating word and document context. To improve context determination, the distances between the words W1 and W2 are taken into account; this is especially important when the number of W2 words is greater than 1. The results of an investigation of the two formulas are presented. To compare their efficiency, calculations were made for 100 texts. Distributions of the average and the dispersion of C were built and compared with the model data from [1]. The weight function was optimized, and its versions were compared by the value of s/Caver. The dispersion of C was calculated for all versions of the weight function. It turned out to be rather large because of the great variation in text size and in the number of W2 and W3 words, as well as the wide distribution of words in the text. An example of the L distribution for W2 = "компьютер" ("computer") is given.
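The abstract does not reproduce the formulas, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that W1 is a target word, that W2 is a set of context words, that L is the distance in word positions between a W1 occurrence and a W2 occurrence, and that s is the standard deviation of C over the texts. The weight function 1/(1+L) and all function names are hypothetical, chosen for illustration.

```python
from statistics import mean, pstdev


def distances(tokens, w1, w2_words):
    """Distances L (in word positions) between every W1 and W2 occurrence."""
    pos1 = [i for i, t in enumerate(tokens) if t == w1]
    pos2 = [i for i, t in enumerate(tokens) if t in w2_words]
    return [abs(i - j) for i in pos1 for j in pos2]


def inverse_weight(d):
    """Assumed weight function: closer words contribute more."""
    return 1.0 / (1.0 + d)


def context_value(tokens, w1, w2_words, weight=inverse_weight):
    """Illustrative score C: sum of weights over all W1-W2 pairs.

    This is a stand-in, not the formula from the paper.
    """
    return sum(weight(d) for d in distances(tokens, w1, w2_words))


def relative_spread(values):
    """s / C_aver, with s taken here as the population standard deviation."""
    avg = mean(values)
    return pstdev(values) / avg if avg else float("inf")


# Two toy texts stand in for the 100 texts mentioned in the abstract.
texts = [
    "the computer runs the program".split(),
    "the program was written for a fast computer".split(),
]
scores = [context_value(t, "program", {"computer"}) for t in texts]
print(scores, relative_spread(scores))
```

Under these assumptions, different weight functions can be compared by passing them as `weight` and checking which gives the smallest `relative_spread` over the same set of texts.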
References
[2] Dorenskaya E.A., Semenov Y.A. About the Programming Techniques, Oriented to Minimize Errors. Sovremennye informacionnye tehnologii i IT-obrazovanie = Modern Information Technologies and IT-Education. 2017; 13(2):50-56. (In Russ., abstract in Eng.) DOI: 10.25559/SITITO.2017.2.226
[3] Dorenskaya E.A., Semenov Y.A. New Methods of Minimizing the Errors in the Software. In: CEUR Workshop Proceedings: Proceedings of the VIII International Conference "Distributed Computing and Grid-technologies in Science and Education" (GRID 2018), Dubna, Moscow region, Russia, September 10-14, 2018, vol. 2267. 2018, pp. 150-154. Available at: http://ceur-ws.org/Vol-2267/150-154-paper-27.pdf (accessed 15.08.2019). (In Eng.)
[4] Semenov Y.A., Ovsyannikov A.P., Ovsyannikova T.V. Development of the algorithm bank and basics of the language for problem description to minimize a number of program errors. Proceedings of NIISI RAS. 2016; 6(2):96-100. Available at: https://elibrary.ru/item.asp?id=29798446 (accessed 15.08.2019). (In Russ., abstract in Eng.)
[5] Semenov Y.A. IT-Economy in 2016 and in 10 Years. Economic Strategies. 2017; 19(1):126-135. Available at:
[6] Rishel T., Perkins L.A., Yenduri S., Zand F. Determining the context of text using augmented latent semantic indexing. Journal of the American Society for Information Science and Technology. 2007; 58(14):2197-2204. (In Eng.) DOI: 10.1002/asi.20687
[7] Chen J., Scholz U., Zhou R., Lange M. LAILAPS-QSM: A RESTful API and JAVA library for semantic query suggestions. PLoS Computational Biology. 2018; 14(3):e1006058. (In Eng.) DOI: 10.1371/journal.pcbi.1006058
[8] Yang L., Zhang J. Automatic transfer learning for short text mining. EURASIP Journal on Wireless Communications and Networking. 2017; 2017(1):42. (In Eng.) DOI: 10.1186/s13638-017-0815-5
[9] Yan E., Williams J., Chen Z. Understanding disciplinary vocabularies using a full-text enabled domain-independent term extraction approach. PLoS ONE. 2017; 12(11):e0187762. (In Eng.) DOI: 10.1371/journal.pone.0187762
[10] Arras L., Horn F., Montavon G., Müller K.-R., Samek W. "What is relevant in a text document?": An interpretable machine learning approach. PLoS ONE. 2017; 12(8):e0181142. (In Eng.) DOI: 10.1371/journal.pone.0181142
[11] Eidlin A.A., Eidlina M.A., Samsonovich A.V. Analyzing weak semantic map of word senses. Procedia Computer Science. 2018; 123:140-148. (In Eng.) DOI:
[12] Samsonovich A.V. Weak Semantic Map of the Russian Language: Preliminary Results. Procedia Computer Science. 2016; 88:538-543. (In Eng.) DOI: 10.1016/j.procs.2016.08.001
[13] Wei T., Lu Y., Chang H., Zhou Q., Bao X. A semantic approach for text clustering using WordNet and lexical chains. Expert Systems with Applications. 2015; 42(4):2264-2275. (In Eng.) DOI: 10.1016/j.eswa.2014.10.023
[14] Zhan J., Dahal B. Using deep learning for short text understanding. Journal of Big Data. 2017; 4(1):34. (In Eng.) DOI: 10.1186/s40537-017-0095-2
[15] Khenner E., Nasraoui O. A bilingual semantic network of computing concepts. Procedia Computer Science. 2016; 80:2392-2396. (In Eng.) DOI: 10.1016/j.procs.2016.05.460
[16] Yu B. Research on information retrieval model based on ontology. EURASIP Journal on Wireless Communications and Networking. 2019; 2019(1):30. (In Eng.) DOI: 10.1186/s13638-019-1354-z
[17] Yelkina E.E., Kononova O.V., Prokudin D.E. Typology of Contexts and Contextual Approach Principles in Multidisciplinary Scientific Research. Sovremennye informacionnye tehnologii i IT-obrazovanie = Modern Information Technologies and IT-Education. 2019; 15(1):141-153. (In Russ., abstract in Eng.) DOI: 10.25559/SITITO.15.201901.141-153
[18] Komrakov A.A. Using Ontologies to Describe the Structure of Arrays of Information Exchange. Sovremennye informacionnye tehnologii i IT-obrazovanie = Modern Information Technologies and IT-Education. 2019; 15(1):182-189. (In Russ., abstract in Eng.) DOI: 10.25559/SITITO.15.201901.182-189
[19] Barakhnin V.B., Kozhemyakina O.Yu., Rychkova E.V., Pastushkov I.S., Borzilova Y.S. The extraction of lexical and metrorhythmic features which are characteristic for the genre and the style and for their combinations within the process of automated processing of texts in Russian. Sovremennye informacionnye tehnologii i IT-obrazovanie = Modern Information Technologies and IT-Education. 2018; 14(4):888-895. (In Russ., abstract in Eng.) DOI: 10.25559/SITITO.14.201804.888-895
[20] Krassovitsky A.M., Ualiyeva I.M., Meirambekkyzy Z., Mussabayev R.R. Lexicon-based approach in generalization evaluation in Russian language media. Sovremennye informacionnye tehnologii i IT-obrazovanie = Modern Information Technologies and IT-Education. 2018; 14(3):567-572. (In Russ., abstract in Eng.) DOI: 10.25559/SITITO.14.201803.567-572
[21] Kogalovsky M.R., Parinov S.I. Semantic Annotation of Information Resources by Taxonomies in Scientific Digital Library. In: CEUR Workshop Proceedings: Selected Papers of the XIX International Conference on Data Analytics and Management in Data Intensive Domains (DAMDID/RCDL 2017). Moscow, Russia, October 9-13, 2017, vol. 2022. 2017, pp. 301-310. Available at: http://ceur-ws.org/Vol-2022/paper47.pdf (accessed 15.08.2019). (In Russ., abstract in Eng.)
[22] Tsukanova Z.V. Strukturnye i semanticheskie osobennosti zagolovkov sovremennyh nauchnyh statej (na materiale russkogo i anglijskogo jazykov) [Structural and semantic features of the headings of modern scientific articles (by the material of Russian and English languages)]. Modern scientific researches and innovations. 2018; (5):33. Available at:
[23] Chapaykina N.E. Semanticheskij analiz tekstov. Osnovnye polozhenija [Semantic analysis of texts. Fundamentals]. Young Scientist. 2012; (5):112-115. Available at: https://elibrary.ru/item.asp?id=20470090 (accessed 15.08.2019). (In Russ.)
[24] Batura T.V. Metody i sistemy semanticheskogo analiza tekstov [Methods and systems of semantic text analysis]. Software Journal: Theory and Applications. 2016; (4). (In Russ.) DOI: 10.15827/2311-6749.21.220
[25] Bessmertny I.A. Knowledge visualization based on semantic networks. Programming and Computer Software. 2010; 36(4):197-204. (In Eng.) DOI:
[26] Ayusheeva N.N., Dikikh A.Yu. Model of constructing a semantic network of scientific text. Modern High Technologies. 2018; (6):9-13. Available at: https://www.elibrary.ru/item.asp?id=35197327 (accessed 15.08.2019). (In Russ., abstract in Eng.)
[27] Ustalov D.A., Sozykin A.V. A Software System for Automatic Construction of a Semantic Word Network. Bulletin of the South Ural State University. Series: Computational Mathematics and Software Engineering. 2017; 6(2):69-83. (In Russ., abstract in Eng.) DOI: 10.14529/cmse170205

This work is licensed under a Creative Commons Attribution 4.0 International License.