ESTIMATION METHODOLOGY OF THE LANGUAGE IDENTIFICATION RESULTS
Abstract
The article presents the author's methodology for evaluating language identification results. Developed in the course of experimental research, it demonstrates the effectiveness of the relevant methods, technologies, algorithms and software, and exposes the shortcomings of existing approaches to this problem. The methodology makes it possible to evaluate the effectiveness of language identification programs and systems at the design stage, which significantly reduces the resource costs of their development.

This work is licensed under a Creative Commons Attribution 4.0 International License.