Algorithm for Resolving the Ambiguity of Author Names in IAS ISTINA

Abstract

An important task for systems that collect and process scientometric data is identifying the authors of publications from bibliographic data. The task is especially important for citation analysis systems, where information is collected automatically from various sources and must then be processed automatically. It is equally important for scientometric systems that analyze many types of scientific output (publications, dissertations, patents, lectures, etc.). The accuracy of author identification affects the quality of scientometric assessments of the scientific activity of the subjects being evaluated. In addition, accurate author identification is important for the security of systems that use modern logical access control models.
The article describes an algorithm developed to solve this problem, which is currently used in the scientometric system IAS ISTINA. The system has been used since 2012 to collect data on scientific activity at Lomonosov Moscow State University. It processes data on scientific publications, teaching, research, dissertations, membership in various councils, scientific awards, and many other indicators. IAS ISTINA is currently used in more than twenty organizations.
The main feature of the algorithm is its use of a co-authorship graph to identify authors. The graph is built from publications and other results of scientific activity. The article describes the structure of the bibliographic data analysis module implemented in IAS ISTINA and the algorithm for identifying authors from a publication's bibliographic data. Test results, presented at the end of the article, demonstrate the high accuracy of the algorithm.
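The general idea of using a co-authorship graph for author identification can be illustrated with a minimal sketch. This is not the algorithm used in IAS ISTINA, whose details are not given in this abstract; the function names, the author identifiers, and the scoring rule (count of shared co-authors, with ties left unresolved) are all assumptions made for illustration.

```python
from collections import defaultdict


def build_coauthor_graph(publications):
    """Map each author id to the set of author ids they have published with.

    `publications` is an iterable of author-id lists, one list per record.
    """
    graph = defaultdict(set)
    for author_ids in publications:
        for a in author_ids:
            graph[a].update(b for b in author_ids if b != a)
    return graph


def pick_candidate(candidates, known_coauthors, graph):
    """Pick the candidate who shares the most co-authors with the new record.

    Hypothetical scoring rule: returns None when no candidate shares any
    co-author with the record, or when the best score is tied.
    """
    scores = {c: len(graph.get(c, set()) & known_coauthors) for c in candidates}
    best = max(scores.values(), default=0)
    winners = [c for c, s in scores.items() if s == best]
    if best == 0 or len(winners) != 1:
        return None
    return winners[0]


# Two distinct people share the surname "Ivanov" (hypothetical ids).
publications = [
    ["ivanov_1", "petrov"],
    ["ivanov_1", "sidorov"],
    ["ivanov_2", "kuznetsov"],
]
graph = build_coauthor_graph(publications)

# A new record lists "Ivanov" together with the already identified "petrov".
match = pick_candidate(["ivanov_1", "ivanov_2"], {"petrov"}, graph)
```

Here `match` is `"ivanov_1"`, because only that candidate has previously published with `petrov`. Returning `None` for ties and for records with no overlap leaves ambiguous cases to another method or to manual review, which is a design choice of this sketch rather than something stated in the article.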

Author Biographies

Alexander Sergeevich Kozitsin, Lomonosov Moscow State University

Leading Researcher, Institute of Mechanics, Lomonosov Moscow State University, Ph.D. (Phys.-Math.)

Sergey Alexandrovich Afonin, Lomonosov Moscow State University

Leading Researcher, Institute of Mechanics, Lomonosov Moscow State University, Ph.D. (Phys.-Math.)

Published
2020-05-25
How to Cite
KOZITSIN, Alexander Sergeevich; AFONIN, Sergey Alexandrovich. Algorithm for Resolving the Ambiguity of Author Names in IAS ISTINA. Modern Information Technologies and IT-Education, [S.l.], v. 16, n. 1, p. 108-117, may 2020. ISSN 2411-1473. Available at: <http://sitito.cs.msu.ru/index.php/SITITO/article/view/600>. Date accessed: 02 june 2025. doi: https://doi.org/10.25559/SITITO.16.202001.108-117.
Section
Research and development in the field of new IT and their applications