An Algorithm for Designing and Building a Digital Academic Dictionary Corpus of the Russian Language

Abstract

This article examines the academic explanatory dictionaries of the Russian language (primarily BAS, SRNG, MAS, and SRYa 1895-1937) as the core of the Academic Dictionary Corpus (ASK) of the Russian language [11, pp. 213-214; 13, pp. 25-28; 22, pp. 226-257; 27, pp. 76-83; 30, pp. 98-102], together with the algorithm, concept, and principles for designing and building it. The ASK is constructed as a hysaurus (hypertext thesaurus) [20, pp. 280-281]: a nonlinear structure that takes relational, hierarchical, and network paradigmatic links into account. Through appropriate classification and systematization, digitization, and retrodigitization of the academic explanatory dictionaries, this makes it possible to structure and unify lexicographic materials, so that they can be brought into scholarly circulation quickly and research in modern lexicography can be streamlined.
The scholarly significance of the ASK stems from the need for a centralized description of the Russian lexicon. The many existing projects in this field each pursue their own tasks and research directions, which makes it impossible to see the full linguistic picture of the Russian Federation that an information-retrieval (IR) ASK can provide. To this end, the authors propose to model, design, and build a working, interactively extensible IR ASK of the Russian language that brings diverse lexicographic sources together in a single database.
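
The abstract does not specify a data model, so the following is only a minimal sketch of how a hypertext thesaurus with typed links might be held in memory, assuming a simple headword-keyed store. The class names `Entry` and `Hysaurus`, the link labels `"broader"` and `"synonym"`, and the sample words, relations, and definition texts are all hypothetical placeholders, not material taken from the article or from the dictionaries it names.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One headword, merged across several source dictionaries."""
    headword: str
    # Source label (e.g. "BAS", "MAS") -> definition text from that source.
    definitions: dict[str, str] = field(default_factory=dict)
    # Link type (e.g. "broader", "synonym") -> set of target headwords.
    links: dict[str, set[str]] = field(default_factory=lambda: defaultdict(set))


class Hysaurus:
    """Hypothetical in-memory store: a single base over many sources."""

    def __init__(self) -> None:
        self.entries: dict[str, Entry] = {}

    def add_definition(self, headword: str, source: str, text: str) -> None:
        # Definitions of the same headword from different dictionaries
        # accumulate in one shared entry.
        entry = self.entries.setdefault(headword, Entry(headword))
        entry.definitions[source] = text

    def link(self, source_word: str, kind: str, target_word: str,
             symmetric: bool = False) -> None:
        # Create both endpoints if needed, then record the typed link.
        for word in (source_word, target_word):
            self.entries.setdefault(word, Entry(word))
        self.entries[source_word].links[kind].add(target_word)
        if symmetric:
            self.entries[target_word].links[kind].add(source_word)

    def sources_for(self, headword: str) -> list[str]:
        entry = self.entries.get(headword)
        return sorted(entry.definitions) if entry else []

    def broader_chain(self, headword: str) -> list[str]:
        # Follow hierarchical "broader" links upward, skipping visited
        # nodes so that a cycle in the data cannot loop forever.
        chain: list[str] = []
        seen = {headword}
        current = headword
        while current in self.entries:
            parents = sorted(
                self.entries[current].links.get("broader", set()) - seen
            )
            if not parents:
                break
            current = parents[0]
            seen.add(current)
            chain.append(current)
        return chain


# Placeholder data: the definition texts are dummies, not dictionary quotes.
h = Hysaurus()
h.add_definition("дом", "BAS", "placeholder definition A")
h.add_definition("дом", "MAS", "placeholder definition B")
h.link("дом", "broader", "здание")                  # hierarchical link
h.link("здание", "broader", "сооружение")
h.link("дом", "synonym", "жилище", symmetric=True)  # relational link

print(h.sources_for("дом"))
print(h.broader_chain("дом"))
```

Keeping the link type as a plain string key is a deliberate simplification: it lets hierarchical, relational, and network links share one structure, at the cost of no validation of the labels used.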

About the authors

Sergey Vladimirovich Lesnikov, Institute for Linguistic Studies of the Russian Academy of Sciences

Leading specialist (programmer), Department of Lexicography of the Modern Russian Language, Big Academic Dictionary group; Candidate of Philological Sciences, Associate Professor

Alexander Vladimirovich Lesnikov, Lomonosov Moscow State University

Engineer of the Almanac «ГОВОР», Faculty of Mechanics and Mathematics

Gleb Sergeevich Lesnikov, Northern State Medical University

Moderator of the Almanac «ГОВОР»

Alena Mikhailovna Farina, Pitirim Sorokin Syktyvkar State University

Editor of the Almanac «ГОВОР»

References

[1] Andryushchenko V.M. Machine Fund of the Russian language: ideas and judgments. Concept and architecture of the Machine Fund of the Russian language. Nauka, Moscow, 1986; 32. (In Russ.)
[2] Andryushchenko V.M. The Concept and architecture of the machine Fund of the Russian language: Diss. Dr. Sci. (Philology). Nauka, Moscow, 1988. 360 pp. (In Russ.)
[3] Andryushchenko V.M. Kontseptsiya i arkhitektura Mashinnogo fonda russkogo yazyka [Concept and Architecture of the Computer Fund of Russian Language]. Moscow, Nauka Publ., 1989, 196 pp. (In Russ.)
[4] Andryushchenko V.M. Machine Fund of the Russian language: Integration approach. M.: VINITI, 1989. 80 pp. (In Russ.)
[5] Bulygina D.S., Lesnikov S.V. Algoritm avtomatizirovannogo konstruirovaniya gipertekstovogo tezaurusa (gizaurusa) russkogo yazyka na osnove otsifrovannykh slovarey i spravochnikov novykh slov i znacheniy dlya interaktivnogo leksikograficheskogo korpusa "Leksiko-semanticheskaya neologiya v russkom yazyke nachala XXI veka" [Algorithm for automated construction of a hypertext thesaurus (hysaurus) of the Russian language based on digitized dictionaries and reference books of new words and meanings for the interactive lexicographic corpus “Lexical and semantic neology in Russian language of the beginning of the 21st century”]. Sbornik statey 9 Mezhdunarodnogo nauchno-issledovatel’skogo konkursa “Dostizheniya vuzovskoy nauki 2019”: v 2 ch. [Proceedings of 9th International Scientific and Research Competition “Achievements of the University Science 2019”: in 2 pts]. Penza, International Center for Scientific Cooperation “Nauka i Prosveshcheniye” Publ. 2019; 1:19-125. (In Russ.)
[6] The second All-Union conference on the creation of the machine Fund of the Russian language: Reports. M.: B. I., 1987. 248 pp. (In Russ.)
[7] The second All-Union conference on the creation of the machine Fund of the Russian language: Abstracts of reports. M.: B. I., 1987. 182 pp. (In Russ.)
[8] Ershov A.P. Methodological prerequisites of productive dialogue with computers in natural language. Voprosy Filosofii. 1981; 8:109-119. (In Russ.)
[9] Zagorovskaya O.V., Lesnikov S.V. Vidy leksikograficheskoy informatsii v avtomaticheskom slovare russkikh govorov Komi ASSR i sopredel’nykh oblastey [Types of lexicographic information in the automatic dictionary of Russian patois of the Komi Autonomous Soviet Socialist Republic and adjacent regions]. Mashinnyy fond russkogo yazyka: Predproyektnyye issledovaniya [Computer Fund of Russian Language: Predesign Research]. Moscow, Russian Language Institute of the Academy of Sciences of the USSR Publ. 1988; 64-70. (In Russ.)
[10] Instructions for compiling the "Dictionary of modern Russian literary language" (in fifteen volumes). M.-L.: USSR Academy of Sciences, 1958. 87 pp. (In Russ.)
[11] Karaulov Y.N. Aktivnaya grammatika i assotsiativno-verbal’naya set’ [Active Grammar and Associative-Verbal Network]. Moscow, Russian Language Institute of RAS Publ., 1999. 180 pp. (In Russ.)
[12] Lesnikov S.V. Computer-based Russian word retrieval reference system. Problemy istorii, filologii, kul’tury = Journal of Historical, Philological and Cultural Studies. 2009; 2(24):622-630. Available at: https://elibrary.ru/item.asp?id=16863201 (accessed 12.05.2019). (In Russ., abstract in Eng.)
[13] Lesnikov S.V. Akademicheskiye tolkovyye slovari russkogo yazyka kak yadro akademicheskogo slovarnogo korpusa russkogo yazyka [Academic explanatory dictionaries of the Russian language as the core of the academic vocabulary of the Russian language]. Sbornik nauchnykh statey po itogam raboty Mezhdunarodnogo nauchnogo foruma “Nauka i innovatsii: sovremennyye kontseptsii” [Proceedings of the International Scientific Forum “Science and Innovations: Contemporary Concepts”]. Moscow, Infinity Publ. 2019; 1:38-47. (In Russ.)
[14] Lesnikov S.V. Akademicheskiy slovarnyy korpus (ASK) russkogo yazyka [Academic vocabulary corpus (AVC) of the Russian language]. Materialy 6 Mezhdunarodnogo kongressa issledovateley russkogo yazyka “Russkiy yazyk: istoricheskiye sud’by i sovremennost” [Proceedings of the 6th International Congress of Russian Language Researchers “Russian Language: Historical Fate and Modernity”]. Moscow, Lomonosov Moscow State University Publ. 2019; 213-214. (In Russ.)
[15] Lesnikov S.V. Akademicheskiy slovarnyy korpus (ASK) russkogo yazyka [Academic vocabulary corpus (AVC) of the Russian language]. Slovo i slovar’ = Vocabulum et vocabularium [Word and Dictionary = Vocabulum et vocabularium]. 2019; 16:111-114. (In Russ.)
[16] Lesnikov S.V. Analiz paradigmaticheskikh otnosheniy lingvisticheskoy terminosistemy [Analysis of the paradigmatic relations of the linguistic term system]. Pamyati Anatoliya Anatol’yevicha Polikarpova [In Memory of Anatoly Anatolyevich Polikarpov]. Moscow, Lomonosov Moscow State University Publ. 2015; 269-279. (In Russ.)
[17] Lesnikov S.V. Analiticheskiy referativno-annotirovannyy obzor otsifrovannykh slovarey i spravochnikov novykh slov i znacheniy dlya tsifrovogo leksikograficheskogo korpusa "Leksiko-semanticheskaya neologiya v russkom yazyke nachala XXI veka" [Analytical abstract annotated review of digitized dictionaries and reference books of new words and meanings for the digital lexicographic corpus “Lexical and semantic neology in Russian language in the beginning of the 21st century”]. Sbornik nauchnykh statey po itogam raboty Mezhdunarodnogo nauchnogo foruma “Nauka i innovatsii: sovremennyye kontseptsii” [Proceedings of the International Scientific Forum “Science and Innovations: Contemporary Concepts”]. Moscow, Infinity Publ. 2019; 3:34-42. (In Russ.)
[18] Lesnikov S.V. Arkhitektura i sut’ informatsionno-poiskovogo korpusa akademicheskikh slovarey russkogo yazyka [Architecture and essence of the information-retrieval corpus of academic dictionaries of the Russian language]. Nauchnyy obozrevatel’ = Scientific Reviewer. 2019; 3(99):25-28. (In Russ.)
[19] Lesnikov S.V. Basic blocks of an automated lexicographic system. Vestnik Chelyabinskogo gosudarstvennogo universiteta = Bulletin of Chelyabinsk State University. 2011; 33(248):200-202. Available at: https://elibrary.ru/item.asp?id=17799038 (accessed 12.05.2019). (In Russ.)
[20] Lesnikov S.V. The basic operators of the search queries thesaurus metalanguage of linguistics. V mire nauchnykh otkrytiy = Siberian Journal of Life Sciences and Agriculture. 2012; 7-2(31):39-53. Available at: https://elibrary.ru/item.asp?id=17920312 (accessed 12.05.2019). (In Russ., abstract in Eng.)
[21] Lesnikov S.V. Vladislav Mitrofanovich Andryushchenko – nauchnyy rukovoditel’ i konsul’tant, glavnyy konstruktor Mashinnogo fonda russkogo yazyka (MFRYA) [Vladislav Mitrofanovich Andryushchenko – scientific advisor and consultant, chief designer of the Computer Fund of Russian language (CFRL)]. Tezisy vserossiyskoy konferentsii “Ot yazykovykh mashinnykh fondov k lingvisticheskim korpusam: pamyati V.M. Andryushchenko” [Proceedings of the All-Russian Conference “From Linguistic Computer Funds to Linguistic Corpus: to Memory of V.M. Andryushchenko”]. Moscow, Lomonosov Moscow State University Publ., Russian Language Institute of RAS Publ. 2018; 58-60. (In Russ.)
[22] Lesnikov S.V. Hypertext information retrieval thesaurus (hysaurus) “Metalanguage of science” (structure; mathematical, linguistic and software; topics linguistics, mathematics, economics). Russian language: its historical destiny and present state. Moscow State University. 2014. 268-269. (In Russ.)
[23] Lesnikov S.V. Hypertext thesaurus of science metalanguage. Problemy istorii, filologii, kul’tury = Journal of Historical, Philological and Cultural Studies. 2011; 3(33):30-34. Available at: https://elibrary.ru/item.asp?id=17072389 (accessed 12.05.2019). (In Russ., abstract in Eng.)
[24] Lesnikov S.V. K voprosu o soderzhanii slovarnoy stat’i tezaurusa metayazyka lingvistiki [To the issue of dictionary entry content of the metalanguage thesaurus of linguistics]. Obydennoye metayazykovoye soznaniye: ontologicheskiye i gnoseologicheskiye aspekty [Ordinary Metalinguistic Consciousness: Ontological and Epistemological Aspects]. Kemerovo, Kemerovo State University Publ. 2012; 4:190-203. (In Russ.)
[25] Lesnikov S.V. Gipertekstovyy informatsionno-poiskovyy tezaurus (gizaurus) "Metayazyk nauki" (struktura; matematicheskoye, lingvisticheskoye i programmnoye obespecheniya; razdely lingvistika, matematika, ekonomika) [Hypertext information retrieval thesaurus (hysaurus) “Metalanguage of science” (structure; mathematical, linguistic and software provision; sections of linguistics, mathematics, economics)]. Materialy 5 Mezhdunarodnogo kongressa issledovateley russkogo yazyka “Russkiy yazyk: istoricheskiye sud’by i sovremennost” [Proceedings of the 5th International Congress of Russian Language Researchers “Russian Language: Historical Fate and Modernity”]. Moscow, Lomonosov Moscow State University Publ. 2014; 268-269. (In Russ.)
[26] Lesnikov S.V. Konstruirovaniye gipertekstovogo svoda leksiki narodnykh govorov russkogo yazyka [Constructing a hypertext corpus of folk patois vocabulary of the Russian language]. Materialy mezhdunarodnoy konferentsii “Aktual’nyye problemy russkoy dialektologii” [Proceedings of the International Conference “Current Issues of Russian Dialectology”]. Moscow, V.V. Vinogradov Russian Language Institute of RAS Publ. 2018; 148-149. (In Russ.)
[27] Lesnikov S.V. Konstruirovaniye informatsionno-poiskovogo svoda akademicheskikh slovarey russkogo yazyka (Svod ASRYA) [Constructing an information retrieval corpus of academic dictionaries of the Russian language (ADRL corpus)]. Leksicheskiy atlas russkikh narodnykh govorov (Materialy i issledovaniya) [Lexical Atlas of Russian National Patois (Materials and Studies)]. St. Petersburg, Institute of Linguistic Studies of the Russian Academy of Sciences Publ. 2018; 226-257. (In Russ.)
[28] Lesnikov S.V. Konstruirovaniye slovarya terminov metayazyka SMI s pomoshch’yu metodiki vychisleniya vesa bazisnykh terminov [Constructing a terms dictionary of mass media metalanguage using the method of calculating the weight of basic terms]. Materyyaly 4 Mizhnar. navuk.-prakt. kanf., prysvyech. 90-hoddzyu z dnya naradzhennya d-ra filal. navuk praf. A.I. Narkyevicha “Slova w kantekstsye chasu” [Proceedings of the 4th International Scientific and Practical Conference Dedicated to the 90th Anniversary of Doctor of Philology, Professor A.I. Narkevich “Word in a Context of Time”]. Minsk, Belarusian State University Publ., 2019, pp. 66-69. Available at: https://elibrary.ru/item.asp?id=37143924 (accessed 12.05.2019). (In Russ.)
[29] Lesnikov S.V. Konstruirovaniye slovnika slovarya terminov metayazyka lingvistiki s pomoshch’yu metodiki vychisleniya vesa bazisnykh terminov metayazyka lingvistiki [Constructing a glossary of terms vocabulary of linguistics metalanguage using the method of calculating the weight of the basic terms of linguistics metalanguage]. Sotsial’no-kognitivnoye funktsionirovaniye yazyka [Social and Cognitive Functioning of the Language]. Kemerovo, Kemerovo State University Publ. 2017; 155-170. (In Russ.)
[30] Lesnikov S.V. Modeling the metalanguage thesaurus of linguistics-based hypertext frames. Vestnik Vyatskogo gosudarstvennogo gumanitarnogo universiteta = Herald of Vyatka State University. 2011; 3(2):51-54. Available at: https://elibrary.ru/item.asp?id=17567363 (accessed 12.05.2019). (In Russ., abstract in Eng.)
[31] Lesnikov S.V. Key terms and Latin term elements of linguistics metalanguage. Nauchnyye vedomosti Belgorodskogo gosudarstvennogo universiteta. Seriya: Gumanitarnyye nauki = Belgorod State University Scientific Bulletin. Humanities. 2011; 12(107):37-45. Available at: https://elibrary.ru/item.asp?id=17298245 (accessed 12.05.2019). (In Russ., abstract in Eng.)
[32] Lesnikov S.V. Predposylki konstruirovaniya i bazovyye pervoistochniki akademicheskogo slovarnogo korpusa russkogo yazyka [Prerequisites for the construction and basic sources of the academic vocabulary of the Russian language]. Sbornik nauchnykh statey po itogam raboty Mezhdunarodnogo nauchnogo foruma “Nauka i innovatsii: sovremennyye kontseptsii” [Proceedings of the International Scientific Forum “Science and Innovations: Contemporary Concepts”]. Moscow, Infinity Publ. 2019; 2:76-83. (In Russ.)
[33] Lesnikov S.V. Slovar’ russkikh slovarey [Dictionary of the Russian Dictionaries]. Moscow, Azbukovnik Publ., 2002. 334 pp. (In Russ.)
[34] Lesnikov S.V. Typology of Russian dictionaries of linguistic terminology. Mir nauki, kul’tury, obrazovaniya = The World of Science, Culture and Education. 2011; 6-2(31):6-10. Available at: https://elibrary.ru/item.asp?id=18155364 (accessed 12.05.2019). (In Russ., abstract in Eng.)
[35] Lesnikov S.V. Formirovaniye terminologicheskogo fonda russkogo yazyka [Terms fund of the Russian language development]. Materialy mezhdunarodnoy nauchno-prakticheskoy konferentsii “Nauka segodnya: vyzovy i resheniya” [Proceedings of the International Scientific and Practical Conference “Science Today: Challenges and Solutions”]. Vologda, LLC “Marker” Publ. 2019; 98-102. (In Russ.)
[36] Lesnikov S.V. Fragment slovarya bazovykh terminov metayazyka lingvistiki [Fragment of the dictionary of basic terms of linguistics metalanguage]. Leksicheskiy atlas russkikh narodnykh govorov (Materialy i issledovaniya) [Lexical Atlas of Russian National Patois (Materials and Studies)]. St. Petersburg. 2017; 335-360. (In Russ.)
[37] Lesnikov S.V. Frame construction of thesaurus of linguistics metalanguage. Vestnik Pomorskogo universiteta. Seriya: Gumanitarnyye i sotsial’nyye nauki = Vestnik of Northern (Arctic) Federal University. Series "Humanitarian and Social Sciences". 2011; 4:84-88. Available at: https://elibrary.ru/item.asp?id=16996432 (accessed 12.05.2019). (In Russ., abstract in Eng.)
[38] Lesnikov S.V., Zagorovskaya O.V. Formal’naya grammatika slovarnoy stat’i avtomaticheskogo slovarya russkikh govorov Komi ASSR i sopredel’nykh oblastey (ASRGKA) [Formal grammar of dictionary entry of automatic dictionary of Russian patois of the Komi Autonomous Soviet Socialist Republic and adjacent regions (ADRPKA)]. Materialy 2 Vsesoyuznoy konferentsii po sozdaniyu Mashinnogo fonda russkogo yazyka [Proceedings of the 2nd All-Union Conference on Development of Computer Fund of Russian language]. Moscow, Russian Language Institute of the Academy of Sciences of the USSR Publ. 1988; 107-119. (In Russ.)
[39] Lesnikov S.V., Latkin S.A. Program module "Optimal arrangement" of object-oriented package of applications. Tezisy desyatoj Komi respublikanskoj molodezhnoj nauchnoj konferencii [Theses of the tenth Komi Republican youth scientific conference]. Syktyvkar: Komi branch of the USSR Academy of Sciences, Komi regional Komsomol Committee, Komi Regional Council NTO. 1987; 134-135. (In Russ.)
[40] Materials of III All-Union Conference on the establishment of the Machine Fund of Russian language. S.F. Gilyazova, Yu.N. Karaulov (eds). Moscow State University, 1990. 146 pp. (In Russ.)
[41] Andryushchenko V.M. Mashinnyy fond russkogo yazyka: idei i suzhdeniya [Computer Fund of Russian Language: Ideas and Judgments]. Kontseptsiya i arkhitektura Mashinnogo fonda russkogo yazyka [Concept and Architecture of the Computer Fund of Russian Language]. Moscow, Nauka Publ. 1986; 26-44. (In Russ.)
[42] Machine Fund of the Russian language: pre-project studies. M.: B. I., 1988. 294 pp. (In Russ.)
[43] Project dictionary of modern Russian literary language. M.-L.: USSR Academy of Sciences, 1938. 98 pp. (In Russ.)
[44] The third All-Union conference on the creation of the machine Fund of the Russian language: Abstracts of reports. Part 1. Moscow: B. I., 1989. 207 pp. Part 2. Moscow: B. I., 1989. 158 pp. (In Russ.)
[45] Shcherba L.V. Opyt obshchey teorii leksikografii [Experience of the general lexicography theory]. Izvestiya AN SSSR. Otdelenie literatury i yazyka [News of Academy of Sciences of the USSR. Language and Literature Department]. 1940, no. 3. (In Russ.) (Revised ed.: Shcherba L.V. Opyt obshchey teorii leksikografii [Experience of the general lexicography theory]. Yazykovaya sistema i rechevaya deyatel’nost’ [Linguistic System and Speech Activity]. Leningrad, 1974. (In Russ.))
[46] Bush V. As We May Think (Life Magazine 9-10-1945). 1945; 112-124. (In Eng.)
[47] Chomsky N. Three models for the description of language. IRE Trans. on Inform. Theory. IT-2. 1956; 113-124. (In Eng.)
[48] Chomsky N. On certain formal properties of grammars. Information and Control. 1959; 137-167. (In Eng.)
[49] Chomsky N. On the notion "rule of grammar". Structure of language and its mathematical aspects. Providence (Rhode Island). 1961; 6-24. (In Eng.)
[50] Chomsky N. Formal properties of grammars. In: Luce R.D., Bush R.R., Galanter E. (eds). Handbook of mathematical psychology. New York, NY: John Wiley & Sons. 1963; 323-418. (In Eng.)
[51] Chomsky N., Miller G.A. Finite state languages. Information and Control. 1958; 91-112. (In Eng.)
[52] Chomsky N., Schützenberger M. P. The algebraic theory of context-free languages. In: Braffort P., Hirschberg D. (eds). Computer programming and formal systems. Amsterdam. 1963; 118-161. (In Eng.)
[53] Conklin J. Hypertext: An Introduction and Survey. Computer. 1987; 20(9):17-41. (In Eng.) DOI: 10.1109/MC.1987.1663693
[54] Minsky M. Semantic Information Processing. MIT Press, Cambridge, MA, 1968. (In Eng.)
[55] Nelson T. Computer Lib / Dream machines. Sausalito, CA: Mindful Press, 1974. (In Eng.)
[56] Nelson T. Literary machines. Sausalito, CA: Mindful Press, 1993. (In Eng.)
Published
2019-07-25
How to cite
LESNIKOV, Sergey Vladimirovich et al. An Algorithm for Designing and Building a Digital Academic Dictionary Corpus of the Russian Language. International Scientific Journal «Современные информационные технологии и ИТ-образование», [S.l.], v. 15, n. 2, p. 362-374, July 2019. ISSN 2411-1473. Available at: <http://sitito.cs.msu.ru/index.php/SITITO/article/view/510>. Date accessed: 21 Nov. 2019. doi: https://doi.org/10.25559/SITITO.15.201902.362-374.