IMPLEMENTATION OF THE BRANCH AND BOUND METHOD FOR THE PROBLEM OF RECOVERING A DISTANCES MATRIX BETWEEN DNA STRINGS

Abstract

Bioinformatics has two important problems for the study of evolutionary processes: calculating the distance between DNA sequences and restoring a matrix of distances sequences of different genomes when not all initial elements are known. Due to the large dimension of DNA chains, heuristic algorithms are used to solve the first problem. The main lack of them is that when using different heuristic algorithms to calculate the distance between the same pair DNA chains, we produce different values.
To make a comparative analysis of these heuristic algorithms, a badness index was introduced for all the triangles formed in the DNA matrix, and ideally it should be zero. This indicator was used in the future and to solve the second problem. Based on the fact that the badness indicator should be equal to zero, the algorithm of restoring the matrix of distances between DNA chains is developed. This algorithm is optimized using the branch and bound method. This method selects a sequence of calculations of unknown elements, in which the value of the badness matrix will be the smallest.
A part of the paper is devoted to the detailed description of algorithms. The most important among the auxiliary algorithms is the algorithm for selecting the separating element, i.e. the unknown element, which we recover first. The main algorithm, i.e. the actual recovering the DNA matrix by the method of branch and bound, includes a description of the task, which consists of subtasks containing a set of the transformed matrix, a sequence of already restored elements of the original matrix and the set of those not yet filled elements of the matrix, which cannot be selected in the next step of the algorithm. The paper also includes the program implementation of the described algorithms.

Author Biographies

Mikhail Eduardovich Abramyan, Southern Federal University

Associate Professor, Institute of Mathematics, Mechanics, and Computer Science named after of I.I. Vorovich, Ph.D. (Phys.-Math.)

Boris Feliksovich Melnikov, Russian State Social University

Professor, Department of Information Systems, Networks and Safety, Faculty of Information Technology, Dr. Sci. (Phys.-Math.)

Marina Anatolievna Trenina, Togliatti State University

senior lecture, Department of Applied Mathematics and Computer Science

References

[1] Toppi J., Vico Fallani F.De, Petti M., Veссhiato G., Maglione A.G., Cincotti F., Salinari S., Mattia D., Babiloni F., Astolfi L. A new statistical approach for the extraction of adjacency matrix from effective connectivity networks. Proceedings of the 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Osaka, 2013, pp. 2932-2935. (In Eng.) DOI: 10.1109/EMBC.2013.6610154
[2] Zhou J., Zhong P., Zhang T. A Novel Method for Alignment-free DNA Sequence Similarity Analysis Based on the Characterization of Complex Networks. Evolutionary Bioinformatics. 2016; 12:229-235. (In Eng.) DOI: 10.4137/EBO.S40474
[3] Alexeev N., Aidagulov R., Alekseyev M.A. A Computational Method for the Rate Estimation of Evolution-ary Transpositions. In: Ortuño F., Rojas I. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2015. Lecture Notes in Computer Science. Springer, Cham. 2015; 9043:471-480. (In Eng.) DOI: 10.1007/978-3-319-16483-0_46
[4] Evdokimova N.I., Kuznetsov A.V. Local patterns in the copy-move detection problem solution. Computer Optics. 2017; 41(1):79-87. (In Russ.) DOI: 10.18287/2412-6179-2017-41-1-79-87
[5] Eckes B., Nischt R., Krieg T. Cell-matrix interactions in dermal repair and scarring. Fibrogenesis and Tissue Repair. 2010; 3(4). 11 pp. (In Eng.) DOI: 10.1186/1755-1536-3-4
[6] Korabelshchikova S.Y., Melnikov B.F., Pivneva S.V., Zyablitseva L.V. Linear codes and some their applications. Journal of Physics: Conference Series. 2018; 1096:012174. (In Eng.) DOI: 10.1088/1742-6596/1096/1/012174
[7] Korabelshchikova S.Y., Melnikov B.F., Pivneva S.V., Zyablitseva L.V. Linear error correcting codes and their application in DNA analysis. Proceedings of the 4th International Conference on Information Technology and Nanotechnology (ITNT-2018). Samara, 2018, pp. 2095-2100. Available at: https://elibrary.ru/item.asp?id=34894963& (accessed 15.12.2018). (In Eng.)
[8] Melnikov B., Radionov A., Gumayunov V. Some Special Heuristics for Discrete Optimization Problems. Proceedings of the Eighth International Conference on Enterprise Information Systems: Databases and In-formation Systems Integration (ICEIS- 2006), Paphos, Cyprus, May 23-27, 2006, pp. 360-364. (In Eng.)
[9] Melnikov B. Discrete optimization problems - some new heuristic approaches. Proceedings of the Eighth International Conference on High-Performance Computing in Asia-Pacific Region (HPCASIA'05), Beijing, 2005, pp. 8 pp.-82. (In Eng.) DOI: 10.1109/HPCASIA.2005.34
[10] Melnikov B.F., Melnikova E.A., Pivneva S.V., Dudnikov V.A., Davydova E.V. Geometric and game approaches for some discrete optimization problems. CEUR Workshop Proceedings. 2018; 2212:312-321. Available at: http://ceur-ws.org/Vol-2212/paper42.pdf (accessed 15.12.2018). (In Eng.)
[11] Forrest A.R.R. et al. A promoter-level mammalian expression atlas. Nature. 2014; 507:462-470. (In Eng.) DOI: 10.1038/nature13182
[12] Klyosov A.A., Faleeva T. Excavated DNA from Two Khazar Burials. Advances in Anthropology. 2017; 7:17-21. (In Eng.) DOI: 10.4236/aa.2017.71002
[13] Melnikov B.F., Pevneva S.V., Trifonov M.A. Multiheuristic approach to compare the quality of defined metrics on the set of dna sequences. Sovremennye informacionnye tehnologii i IT-obrazovanie = Modern Information Technologies and IT-Education. 2017; 13(2):89-96. (In Russ.) DOI: 10.25559/SITITO.2017.2.235
[14] Melnikov B.F., Pivneva S.V., Trifonov M.A. Comparative analysis of the algorithms of calculating distances of DNA sequences and some related problems. Proceedings of the Third International Conference on Information Technology and Nanotechnology (ITNT-2017). Samara, 2017, pp. 1640-1645. Available at: https://elibrary.ru/item.asp?id=29267020 (accessed 15.12.2018). (In Eng.)
[15] Melnikov B.F., Pivneva S.V., Trifonov M.A. Various algorithms, calculating distances of DNA sequences, and some computational recommendations for use such algorithms. CEUR Workshop Proceedings. 2017; 1902:43-47. Available at: http://ceur-ws.org/Vol-1902/paper8.pdf (accessed 15.12.2018). (In Eng.)
[16] Melnikov B.F., Melnikova E.A., Pivneva S.V., Trenina M.A. An approach to analysis of the similarity of DNA-sequences. CEUR Workshop Proceedings. 2018; 2212:63-72. Available at: http://ceur-ws.org/Vol-2212/paper9.pdf (accessed 15.12.2018). (In Eng.)
[17] Melnikov B.F., Trenina M.A. On a problem of the reconstruction of distance matrices between DNA sequences. International Journal of Open Information Technologies. 2018; 6(6):1-13. Available at: https://elibrary.ru/item.asp?id=35050442 (accessed 15.12.2018). (In Russ.)
[18] Melnikov B.F., Trenina M.A., Kochergin A.S. On one problem of reconstructing matrix distances between chains of DNA. IFAC-PapersOnLine. 2018; 51(32):378-383. (In Eng.) DOI: 10.1016/j.ifacol.2018.11.413
[19] Melnikov B.F., Trenina M.A. The application of the branch and bound method in the problem of reconstructing the matrix of distances between DNA strings. International Journal of Open Information Technologies. 2018; 6(8):1-13. Available at: https://elibrary.ru/item.asp?id=35392804 (accessed 15.12.2018). (In Russ.)
[20] Melnikov B.F., Trenina M.A. On possible methods for solving the problem of reconstructing the matrix of distances between DNA strings. CEUR Workshop Proceedings. 2018; 2258:11-20. Available at: http://ceur-ws.org/Vol-2258/paper2.pdf (accessed 15.12.2018). (In Eng.)
[21] Goodman S.E., Hedetniemi S.T. Introduction to the Design and Analysis of Algorithms. NY: McGraw-Hill, 1977. 371 pp. (In Eng.)
[22] Hromkovič J. Algorithmics for Hard Problems. Introduction to Combinatorial Optimization, Randomiza-tion, Approximation, and Heuristics. Springer-Verlag Berlin Heidelberg, 2004. 538 pp. (In Eng.) DOI: 10.1007/978-3-662-05269-3
[23] Needleman S., Wunsch Ch. A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology. 1970; 48(3):443-453. (In Eng.) DOI: 10.1016/0022-2836(70)90057-4
[24] Melnikov B.F., Trenina M.A., Kochergin A.S. An approach to improving algorithms for computing distances between DNA chains (by the example of the Needleman – Wunsch algorithm). University proceedings. Volga region. Physical and mathematical sciences. 2018; 1(45):46-59. (In Russ.) DOI: 10.21685/2072-3040-2018-1-4.
[25] Ayala F.J., Kiger J.A. Jr. Modern Genetics (Second edition). Menlo Park and London, 1984. 923 pp. (In Eng.)
Published
2019-04-19
How to Cite
ABRAMYAN, Mikhail Eduardovich; MELNIKOV, Boris Feliksovich; TRENINA, Marina Anatolievna. IMPLEMENTATION OF THE BRANCH AND BOUND METHOD FOR THE PROBLEM OF RECOVERING A DISTANCES MATRIX BETWEEN DNA STRINGS. Modern Information Technologies and IT-Education, [S.l.], v. 15, n. 1, p. 81-91, apr. 2019. ISSN 2411-1473. Available at: <http://sitito.cs.msu.ru/index.php/SITITO/article/view/475>. Date accessed: 01 nov. 2025. doi: https://doi.org/10.25559/SITITO.15.201901.81-91.