Approximate Search Algorithms in Knowledge Discovery

  • Slobodan Petrović Норвежский университет естественных и технических наук http://orcid.org/0000-0002-4435-2716
  • Julia Sidorova Технологический институт Блекинге

Аннотация

Knowledge discovery in big data is one of the most important applications of computing machinery today. Search is essential part of all such procedures. Search algorithms must be extremely efficient, but at the same time knowledge discovery procedures must not produce too many false positives or false negatives. False positives require post-processing, which reduces the overall efficiency of the knowledge discovery procedures, while false negatives reduce the sensitivity of such procedures. To reduce the false positive and false negative rate, in this paper, constrained approximate search algorithms are proposed to be applied. An overview of search theory, exact and approximate, is given first, exposing fundamentals of dynamic programming-based and bit-parallel-based approximate search algorithms without constraints. Then, introduction of constraints specific for various knowledge discovery procedures is explained, together with the subtleties of various applications, such as SPAM filtering and digital and network forensics (file carving, intrusion detection in hosts and networks). Advantages and disadvantages of applications of such constrained search algorithms in knowledge discovery procedures are also discussed. A potential application in bioinformatics is outlined.

Сведения об авторах

Slobodan Petrović, Норвежский университет естественных и технических наук

профессор кафедры информационной безопасности и коммуникационных технологий, факультет информационных технологий и электротехники

Julia Sidorova, Технологический институт Блекинге

доцент кафедры информатики

Опубликована
2020-05-25
Как цитировать
PETROVIĆ, Slobodan; SIDOROVA, Julia. Approximate Search Algorithms in Knowledge Discovery. Международный научный журнал «Современные информационные технологии и ИТ-образование», [S.l.], v. 16, n. 1, may 2020. ISSN 2411-1473. Доступно на: <http://sitito.cs.msu.ru/index.php/SITITO/article/view/624>. Дата доступа: 09 aug. 2020
Раздел
Теоретические вопросы информатики, прикладной математики, компьютерных наук