Symbolic Modeling of Messages and Information Extraction
Abstract
The review presents the basics of an updated approach to symbolic modeling (s-modeling) of audio, video, graphic, and other messages and to solving information extraction problems. The importance of s-modeling of messages and of information extraction stems from their growing role in many kinds of activity. S-models of messages dominate intellectual activity not only because they are compact and expressive, but also because there are no restrictions on the types of media used to store them: the medium can be the memory of a person, a computer, a smartphone, a digital camera, etc. The costs of building, copying, transmitting, storing, and accumulating s-models of messages (articles in electronic encyclopedias, textbooks, and scientific journals; navigation maps; video messages; drawings of machines in computer-aided design systems; medical tomograms; recordings of musical compositions; etc.) are incomparably lower than the corresponding costs for non-symbolic models (models of ships, buildings, etc.). S-modeling of messages is treated as mapping them into a selected modeling environment under specified constraints that correspond to the conditions for solving information extraction problems. The adequacy of an s-model of a message is determined by how well it fits the tasks for which it was created and by the accuracy of the results obtained when solving those tasks. Updated definitions of the concepts s-symbol, s-code, s-signal, s-message, s-data, and s-information are proposed. Refined s-models of a problem, an algorithm, a program, a concept system, and a knowledge system are considered. The task of extracting information from a message is treated as the task of interpreting the message on the s-model of a system of concepts. A brief review of the works of A.N. Kolmogorov and C. Shannon on message transmission is given. These works discuss the "amount of information" and consider the tasks associated with that concept; the concept of "information" as the result of interpreting messages on models of concept systems is not considered there. An example is given of applying the proposed approach to the analysis of the problem of translation from one language to another.
References
2. Califf M.E., Mooney R.J. Bottom-Up Relational Learning of Pattern Matching Rules for Information Extraction. Journal of Machine Learning Research. 2003; 4:177-210. Available at: https://www.jmlr.org/papers/volume4/califf03a/califf03a.pdf (accessed 10.09.2021). (In Eng.)
3. Siefkes C., Siniakov P. An Overview and Classification of Adaptive Approaches to Information Extraction. In: Spaccapietra S. (ed.) Journal on Data Semantics IV. Lecture Notes in Computer Science. Vol. 3730. Springer, Berlin, Heidelberg; 2005. p. 172-212. (In Eng.) doi: https://doi.org/10.1007/11603412_6
4. Altınel B., Ganiz M.C. Semantic text classification: A survey of past and recent advances. Information Processing & Management. 2018; 54(6):1129-1153. (In Eng.) doi: https://doi.org/10.1016/j.ipm.2018.08.001
5. Vo D.-T., Al-Obeidat F., Bagheri E. Extracting temporal and causal relations based on event networks. Information Processing & Management. 2020; 57(6):102319. (In Eng.) doi: https://doi.org/10.1016/j.ipm.2020.102319
6. Zhang N., et al. Contrastive Information Extraction With Generative Transformer. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2021; 29:3077-3088. (In Eng.) doi: https://doi.org/10.1109/TASLP.2021.3110126
7. Xu R., et al. Joint Extraction of Retinal Vessels and Centerlines Based on Deep Semantics and Multi-Scaled Cross-Task Aggregation. IEEE Journal of Biomedical and Health Informatics. 2021; 25(7):2722-2732. (In Eng.) doi: https://doi.org/10.1109/JBHI.2020.3044957
8. Liu X., Cheng J., Zhang Q. Multi-Stream Semantics-Guided Dynamic Aggregation Graph Convolution Networks to Extract Overlapping Relations. IEEE Access. 2021; 9:41861-41875. (In Eng.) doi: https://doi.org/10.1109/ACCESS.2021.3062231
9. Abdollahi A., Pradhan B., Alamri A. VNet: An End-to-End Fully Convolutional Neural Network for Road Extraction From High-Resolution Remote Sensing Data. IEEE Access. 2020; 8:179424-179436. (In Eng.) doi: https://doi.org/10.1109/ACCESS.2020.3026658
10. Yu X., et al. LSTM-Based End-to-End Framework for Biomedical Event Extraction. IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2020; 17(6):2029-2039. (In Eng.) doi: https://doi.org/10.1109/TCBB.2019.2916346
11. Wang J., Song J., Chen M., Yang Z. Road network extraction: a neural-dynamic framework based on deep learning and a finite state machine. International Journal of Remote Sensing. 2015; 36(12):3144-3169. (In Eng.) doi: https://doi.org/10.1080/01431161.2015.1054049
12. Wang R., Zhang W., Shi W., Wang X., Cao W. GA-ORB: A New Efficient Feature Extraction Algorithm for Multispectral Images Based on Geometric Algebra. IEEE Access. 2019; 7:71235-71244. (In Eng.) doi: https://doi.org/10.1109/ACCESS.2019.2918813
13. Baviskar D., Ahirrao S., Potdar V., Kotecha K. Efficient Automated Processing of the Unstructured Documents Using Artificial Intelligence: A Systematic Literature Review and Future Directions. IEEE Access. 2021; 9:72894-72936. (In Eng.) doi: https://doi.org/10.1109/ACCESS.2021.3072900
14. Ilyin A.V., Ilyin V.D. Towards a Normalized Economic Mechanism Based on E-services. Agris On-line Papers in Economics and Informatics. 2014; (3):39-49. Available at: https://online.agris.cz/archive/2014/03/04 (accessed 10.09.2021). (In Eng.)
15. Newell A., Simon H. Computer science as empirical inquiry: symbols and search. Communications of the ACM. 1976; 19(3):113-126. (In Eng.) doi: https://doi.org/10.1145/360018.360022
16. Ilyin V.D. Symbolic Modeling (S-Modeling): an Introduction to Theory. In: Silhavy R. (ed.) Artificial Intelligence Trends in Systems. CSOC 2022. Lecture Notes in Networks and Systems. Vol. 502. Springer, Cham; 2022. (In Eng.) doi: https://doi.org/10.1007/978-3-031-09076-9_54
17. Cerf V., Kahn R. A Protocol for Packet Network Intercommunication. IEEE Transactions on Communications. 1974; 22(5):637-648. (In Eng.) doi: https://doi.org/10.1109/TCOM.1974.1092259
18. Ilyin A.V., Ilyin V.D. Interval Planning the Supplies of Scarce Product. Contemporary Engineering Sciences. 2015; 8(31):1495-1498. (In Eng.) doi: https://doi.org/10.12988/ces.2015.59263
19. Zhyrnov V., Solonskaya S. Metod preobrazovaniya simvol’nykh radarnykh otmetok malozametnykh podvizhnykh ob”yektov na osnove effekta Tal’bota [Method for transforming symbolic radar marks of low-noticeable moving objects based on the Talbot effect]. Radiotekhnika: All-Ukr. Sci. Interdep. Mag. No. 205. KNURE, Kharkiv; 2021. p. 129-137. (In Russ., abstract in Eng.) doi: https://doi.org/10.30837/rt.2021.2.205.14
20. Shvalov D.V., Kravchenko V.A., Shirapov D.Sh. Automated Logic-Mathematical Modeling of Railway Automation Devices Technical Condition. 2019 International Multi-Conference on Industrial Engineering and Modern Technologies (FarEastCon). IEEE Press, Vladivostok, Russia; 2019. p. 1-7. (In Eng.) doi: https://doi.org/10.1109/FarEastCon.2019.8934943
21. Kravchenko V.A., Shirapov D.Sh. Logic-Functional Modeling of Nonlinear Radio Engineering Systems. 2018 International Multi-Conference on Industrial Engineering and Modern Technologies (FarEastCon). IEEE Press, Vladivostok, Russia; 2018. p. 1-6. (In Eng.) doi: https://doi.org/10.1109/FarEastCon.2018.8602769
22. Ilyin V.D., Sokolov I.A. Informatsiya kak rezul'tat interpretatsii soobshcheniy na simvol'nykh modelyakh sistem ponyatiy [Information as a result of message interpretation based on s-model of systems of concepts]. Informatsionnye tekhnologii i vychislitel'nye sistemy = Journal of Information Technologies and Computing Systems. 2006; (4):74-82. Available at: https://www.elibrary.ru/item.asp?id=12830934 (accessed 10.09.2021). (In Russ., abstract in Eng.)
23. Ilyin A.V., Ilyin V.D. The Technology of Interactive Resource Allocation in Accordance with the Customizable System of Rules. Applied Mathematical Sciences. 2013; 7(143):7105-7111. (In Eng.) doi: http://dx.doi.org/10.12988/ams.2013.311649
24. Shannon C.E. A Mathematical Theory of Communication. The Bell System Technical Journal. 1948; 27(3):379-423. (In Eng.) doi: https://doi.org/10.1002/j.1538-7305.1948.tb01338.x
25. Kolmogorov A.N. Tri podkhoda k opredeleniyu ponyatiya "Kolichestvo informatsii" [Three approaches to the definition of the concept "quantity of information"]. Problemy peredachi informatsii = Problems of Information Transmission. 1965; 1(1):3-11. Available at: http://mi.mathnet.ru/rus/ppi/v1/i1/p3 (accessed 10.09.2021). (In Russ.)

This work is licensed under a Creative Commons Attribution 4.0 International License.