LANGUAGE IDENTIFICATION OF INFORMATION BLOCKS BASED ON LEXICO-GRAMMATIC MARKERS

Abstract

This article continues the author's series of publications on the language identification of texts. It considers the creation of a technological basis for systems that identify the language of unstructured information blocks using lexico-grammatical markers, namely verb forms, verbal formations, or functionally analogous constructions, and it describes a method and an algorithm for its software implementation. These developments will significantly reduce the resource intensity of such systems and improve their quality, which will yield a significant economic effect and make it possible to create fundamentally new technologies for determining the language of information in a multilingual environment. The study is therefore of interest to computational linguists and to developers of automatic text processing systems, such as global monitoring systems, multilingual knowledge bases, machine translation systems, information retrieval systems, document summarization systems, and literature catalogers.
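
The general idea of marker-based identification can be illustrated with a minimal sketch. It is not the author's algorithm, which the abstract does not specify. It assumes each language is represented by a small set of high-frequency verb forms and that a text block is assigned to the language whose markers occur most often. The `MARKERS` table, the function name `identify_language`, and the chosen word lists are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical marker sets: a few frequent verb forms per language.
# These lists are illustrative assumptions, not the identification
# sets used in the article.
MARKERS = {
    "en": {"is", "are", "was", "were", "has", "have", "does"},
    "de": {"ist", "sind", "war", "hat", "haben", "wird", "werden"},
    "fr": {"est", "sont", "était", "ont", "avons", "fait", "sera"},
    "ru": {"есть", "был", "была", "были", "будет", "может", "имеет"},
}


def identify_language(text):
    """Return the code of the language whose markers occur most often.

    Returns None when the block contains no known marker.
    """
    # Split into lowercase runs of letters (Unicode-aware, no digits).
    tokens = re.findall(r"[^\W\d_]+", text.lower())

    hits = Counter()
    for token in tokens:
        for lang, markers in MARKERS.items():
            if token in markers:
                hits[lang] += 1

    if not hits:
        return None
    # The language with the highest marker count wins.
    return hits.most_common(1)[0][0]


if __name__ == "__main__":
    print(identify_language("Das Haus ist alt und die Kinder sind müde"))
```

Because only a handful of marker words is checked per language, a sketch like this needs far less data than full dictionaries or character n-gram models, which is in line with the reduced resource intensity the abstract refers to. A real system would need much larger marker sets and a rule for handling forms shared between languages.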

Author Biography

Сергей Николаевич Калегин, Moscow research institute of television (CJSC MNITI)

postgraduate student (degree applicant), head of section

Published
2017-12-03
How to Cite
КАЛЕГИН, Сергей Николаевич. LANGUAGE IDENTIFICATION OF INFORMATION BLOCKS BASED ON LEXICO-GRAMMATIC MARKERS. Modern Information Technologies and IT-Education, [S.l.], v. 13, n. 4, p. 225-231, dec. 2017. ISSN 2411-1473. Available at: <http://sitito.cs.msu.ru/index.php/SITITO/article/view/326>. Date accessed: 11 oct. 2025. doi: https://doi.org/10.25559/SITITO.2017.4.492.
Section
Research and development in the field of new IT and their applications