Using Metadata and Element Norms to Optimize Queries
Abstract
Information technologies are developing rapidly, and the requirements on the volume of information and the speed of its processing are growing at the same pace. Data and queries increasingly go beyond operations on the simplest types. The article discusses a method for making search more efficient in wide tables, that is, tables with a large number of columns; the same problem can also be considered in the context of joining tables. The method relies on a specific indexing of elements that lifts some restrictions on the table data and optimizes work with a large number of columns, which broadens the range of applications of the approach without losing the advantages of classical indexing. The proposed indexing is based on an introduced concept close to that of space-weight norms. It makes it possible to define an equivalence relation on the set of table elements while removing some restrictions on the use of indexing, for example the dependence on the order of characteristics or the restriction on the data types the table may contain. The article also discusses grouping and distributing data in order to parallelize DBMS query processing. Such parallel processing uses a simple principle of symmetric horizontal distribution, based on the quotient set of the set of table elements constructed from the introduced concept of element weight. This can distribute the table elements fairly evenly among the processors while ensuring that the characteristics of a sought element do not overlap between different processors, which avoids memory exchange between processors.
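The abstract does not give the definition of the element weight, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a hypothetical weight computed as a sum of per-value hashes (which makes it independent of column order and of value types), groups rows with equal weight into equivalence classes, and assigns whole classes to workers with a simple greedy heuristic. The names `element_weight`, `build_quotient_set`, `distribute`, and `search`, as well as the bucket count of 1024, are illustrative choices and do not come from the article.

```python
from collections import defaultdict
import hashlib


def element_weight(row):
    """Hypothetical weight: sum of per-value hashes, reduced to 1024 buckets.

    Summation is commutative, so the weight ignores column order;
    hashing repr(value) lets values of any type contribute.
    """
    total = 0
    for value in row:
        digest = hashlib.sha256(repr(value).encode("utf-8")).digest()
        total += int.from_bytes(digest[:4], "big")
    return total % 1024


def build_quotient_set(rows):
    """Group rows into equivalence classes keyed by their weight."""
    classes = defaultdict(list)
    for row in rows:
        classes[element_weight(row)].append(row)
    return classes


def distribute(classes, workers):
    """Assign whole classes to workers, largest class first, least-loaded worker next.

    A class is never split, so every weight lives on exactly one worker.
    """
    parts = [dict() for _ in range(workers)]
    loads = [0] * workers
    for weight, members in sorted(classes.items(), key=lambda kv: -len(kv[1])):
        target = loads.index(min(loads))
        parts[target][weight] = members
        loads[target] += len(members)
    return parts


def search(parts, query):
    """Look up a query row: only the one worker holding its weight class is scanned."""
    weight = element_weight(query)
    wanted = sorted(map(repr, query))
    for part in parts:
        if weight in part:
            return [row for row in part[weight] if sorted(map(repr, row)) == wanted]
    return []


rows = [(1, "a", 2.5), ("a", 2.5, 1), (3, "b", None), (7, 8, 9)]
parts = distribute(build_quotient_set(rows), workers=2)
matches = search(parts, (2.5, 1, "a"))
```

Because each weight class is stored on a single worker, a lookup touches one partition only, which mirrors the abstract's goal of avoiding data exchange between processors. The greedy assignment is a stand-in for whatever balancing procedure the article actually uses.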

This work is licensed under a Creative Commons Attribution 4.0 International License.