IMPLEMENTATION OF THE BRANCH AND BOUND METHOD FOR THE PROBLEM OF RECOVERING A DISTANCES MATRIX BETWEEN DNA STRINGS
Abstract
Bioinformatics has two important problems for the study of evolutionary processes: calculating the distance between DNA sequences and restoring a matrix of distances sequences of different genomes when not all initial elements are known. Due to the large dimension of DNA chains, heuristic algorithms are used to solve the first problem. The main lack of them is that when using different heuristic algorithms to calculate the distance between the same pair DNA chains, we produce different values.
To make a comparative analysis of these heuristic algorithms, a badness index was introduced for all the triangles formed in the DNA matrix, and ideally it should be zero. This indicator was used in the future and to solve the second problem. Based on the fact that the badness indicator should be equal to zero, the algorithm of restoring the matrix of distances between DNA chains is developed. This algorithm is optimized using the branch and bound method. This method selects a sequence of calculations of unknown elements, in which the value of the badness matrix will be the smallest.
A part of the paper is devoted to the detailed description of algorithms. The most important among the auxiliary algorithms is the algorithm for selecting the separating element, i.e. the unknown element, which we recover first. The main algorithm, i.e. the actual recovering the DNA matrix by the method of branch and bound, includes a description of the task, which consists of subtasks containing a set of the transformed matrix, a sequence of already restored elements of the original matrix and the set of those not yet filled elements of the matrix, which cannot be selected in the next step of the algorithm. The paper also includes the program implementation of the described algorithms.
References
[2] Zhou J., Zhong P., Zhang T. A Novel Method for Alignment-free DNA Sequence Similarity Analysis Based on the Characterization of Complex Networks. Evolutionary Bioinformatics. 2016; 12:229-235. (In Eng.) DOI: 10.4137/EBO.S40474
[3] Alexeev N., Aidagulov R., Alekseyev M.A. A Computational Method for the Rate Estimation of Evolution-ary Transpositions. In: Ortuño F., Rojas I. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2015. Lecture Notes in Computer Science. Springer, Cham. 2015; 9043:471-480. (In Eng.) DOI: 10.1007/978-3-319-16483-0_46
[4] Evdokimova N.I., Kuznetsov A.V. Local patterns in the copy-move detection problem solution. Computer Optics. 2017; 41(1):79-87. (In Russ.) DOI: 10.18287/2412-6179-2017-41-1-79-87
[5] Eckes B., Nischt R., Krieg T. Cell-matrix interactions in dermal repair and scarring. Fibrogenesis and Tissue Repair. 2010; 3(4). 11 pp. (In Eng.) DOI: 10.1186/1755-1536-3-4
[6] Korabelshchikova S.Y., Melnikov B.F., Pivneva S.V., Zyablitseva L.V. Linear codes and some their applications. Journal of Physics: Conference Series. 2018; 1096:012174. (In Eng.) DOI: 10.1088/1742-6596/1096/1/012174
[7] Korabelshchikova S.Y., Melnikov B.F., Pivneva S.V., Zyablitseva L.V. Linear error correcting codes and their application in DNA analysis. Proceedings of the 4th International Conference on Information Technology and Nanotechnology (ITNT-2018). Samara, 2018, pp. 2095-2100. Available at: https://elibrary.ru/item.asp?id=34894963& (accessed 15.12.2018). (In Eng.)
[8] Melnikov B., Radionov A., Gumayunov V. Some Special Heuristics for Discrete Optimization Problems. Proceedings of the Eighth International Conference on Enterprise Information Systems: Databases and In-formation Systems Integration (ICEIS- 2006), Paphos, Cyprus, May 23-27, 2006, pp. 360-364. (In Eng.)
[9] Melnikov B. Discrete optimization problems - some new heuristic approaches. Proceedings of the Eighth International Conference on High-Performance Computing in Asia-Pacific Region (HPCASIA'05), Beijing, 2005, pp. 8 pp.-82. (In Eng.) DOI: 10.1109/HPCASIA.2005.34
[10] Melnikov B.F., Melnikova E.A., Pivneva S.V., Dudnikov V.A., Davydova E.V. Geometric and game approaches for some discrete optimization problems. CEUR Workshop Proceedings. 2018; 2212:312-321. Available at: http://ceur-ws.org/Vol-2212/paper42.pdf (accessed 15.12.2018). (In Eng.)
[11] Forrest A.R.R. et al. A promoter-level mammalian expression atlas. Nature. 2014; 507:462-470. (In Eng.) DOI: 10.1038/nature13182
[12] Klyosov A.A., Faleeva T. Excavated DNA from Two Khazar Burials. Advances in Anthropology. 2017; 7:17-21. (In Eng.) DOI: 10.4236/aa.2017.71002
[13] Melnikov B.F., Pevneva S.V., Trifonov M.A. Multiheuristic approach to compare the quality of defined metrics on the set of dna sequences. Sovremennye informacionnye tehnologii i IT-obrazovanie = Modern Information Technologies and IT-Education. 2017; 13(2):89-96. (In Russ.) DOI: 10.25559/SITITO.2017.2.235
[14] Melnikov B.F., Pivneva S.V., Trifonov M.A. Comparative analysis of the algorithms of calculating distances of DNA sequences and some related problems. Proceedings of the Third International Conference on Information Technology and Nanotechnology (ITNT-2017). Samara, 2017, pp. 1640-1645. Available at: https://elibrary.ru/item.asp?id=29267020 (accessed 15.12.2018). (In Eng.)
[15] Melnikov B.F., Pivneva S.V., Trifonov M.A. Various algorithms, calculating distances of DNA sequences, and some computational recommendations for use such algorithms. CEUR Workshop Proceedings. 2017; 1902:43-47. Available at: http://ceur-ws.org/Vol-1902/paper8.pdf (accessed 15.12.2018). (In Eng.)
[16] Melnikov B.F., Melnikova E.A., Pivneva S.V., Trenina M.A. An approach to analysis of the similarity of DNA-sequences. CEUR Workshop Proceedings. 2018; 2212:63-72. Available at: http://ceur-ws.org/Vol-2212/paper9.pdf (accessed 15.12.2018). (In Eng.)
[17] Melnikov B.F., Trenina M.A. On a problem of the reconstruction of distance matrices between DNA sequences. International Journal of Open Information Technologies. 2018; 6(6):1-13. Available at: https://elibrary.ru/item.asp?id=35050442 (accessed 15.12.2018). (In Russ.)
[18] Melnikov B.F., Trenina M.A., Kochergin A.S. On one problem of reconstructing matrix distances between chains of DNA. IFAC-PapersOnLine. 2018; 51(32):378-383. (In Eng.) DOI: 10.1016/j.ifacol.2018.11.413
[19] Melnikov B.F., Trenina M.A. The application of the branch and bound method in the problem of reconstructing the matrix of distances between DNA strings. International Journal of Open Information Technologies. 2018; 6(8):1-13. Available at: https://elibrary.ru/item.asp?id=35392804 (accessed 15.12.2018). (In Russ.)
[20] Melnikov B.F., Trenina M.A. On possible methods for solving the problem of reconstructing the matrix of distances between DNA strings. CEUR Workshop Proceedings. 2018; 2258:11-20. Available at: http://ceur-ws.org/Vol-2258/paper2.pdf (accessed 15.12.2018). (In Eng.)
[21] Goodman S.E., Hedetniemi S.T. Introduction to the Design and Analysis of Algorithms. NY: McGraw-Hill, 1977. 371 pp. (In Eng.)
[22] Hromkovič J. Algorithmics for Hard Problems. Introduction to Combinatorial Optimization, Randomiza-tion, Approximation, and Heuristics. Springer-Verlag Berlin Heidelberg, 2004. 538 pp. (In Eng.) DOI: 10.1007/978-3-662-05269-3
[23] Needleman S., Wunsch Ch. A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology. 1970; 48(3):443-453. (In Eng.) DOI: 10.1016/0022-2836(70)90057-4
[24] Melnikov B.F., Trenina M.A., Kochergin A.S. An approach to improving algorithms for computing distances between DNA chains (by the example of the Needleman – Wunsch algorithm). University proceedings. Volga region. Physical and mathematical sciences. 2018; 1(45):46-59. (In Russ.) DOI: 10.21685/2072-3040-2018-1-4.
[25] Ayala F.J., Kiger J.A. Jr. Modern Genetics (Second edition). Menlo Park and London, 1984. 923 pp. (In Eng.)

This work is licensed under a Creative Commons Attribution 4.0 International License.
Publication policy of the journal is based on traditional ethical principles of the Russian scientific periodicals and is built in terms of ethical norms of editors and publishers work stated in Code of Conduct and Best Practice Guidelines for Journal Editors and Code of Conduct for Journal Publishers, developed by the Committee on Publication Ethics (COPE). In the course of publishing editorial board of the journal is led by international rules for copyright protection, statutory regulations of the Russian Federation as well as international standards of publishing.
Authors publishing articles in this journal agree to the following: They retain copyright and grant the journal right of first publication of the work, which is automatically licensed under the Creative Commons Attribution License (CC BY license). Users can use, reuse and build upon the material published in this journal provided that such uses are fully attributed.
