ESTIMATION METHODOLOGY OF THE LANGUAGE IDENTIFICATION RESULTS

  • Сергей Николаевич Калегин, Moscow Research TV Institute Joint Stock Company; Institute of Control Sciences of the Russian Academy of Sciences, http://orcid.org/0000-0003-3540-3902

Abstract

The article presents the author's methodology for evaluating language identification results. The methodology was developed in the course of experimental research and shows both the effectiveness of the corresponding methods, technologies, algorithms and software and the shortcomings of existing approaches to this problem. It makes it possible to evaluate the effectiveness of language identification programs and systems at the design stage, which significantly reduces the resource costs of their development.

Author Biography

Сергей Николаевич Калегин, Moscow Research TV Institute Joint Stock Company; Institute of Control Sciences of the Russian Academy of Sciences

Head of section, postgraduate student; degree applicant

Published
2017-08-18
How to Cite
КАЛЕГИН, Сергей Николаевич. ESTIMATION METHODOLOGY OF THE LANGUAGE IDENTIFICATION RESULTS. Modern Information Technologies and IT-Education, [S.l.], v. 13, n. 2, p. 208-214, aug. 2017. ISSN 2411-1473. Available at: <http://sitito.cs.msu.ru/index.php/SITITO/article/view/242>. Date accessed: 21 oct. 2025. doi: https://doi.org/10.25559/SITITO.2017.2.242.
Section
Research and development in the field of new IT and their applications