A DYNAMIC INDEXING SCHEME FOR MULTIDIMENSIONAL DATA
Abstract
We present a new dynamic index structure for multidimensional data. The considered index structure is based on an extended grid file concept. Strengths and weaknesses of the grid files were analyzed. Based on that analysis we proposed to strengthen the concept of grid files by considering their stripes as linear hash tables, introducing the concept of chunk and representing the grid file structure as a graph. As a result we significantly reduced the amount of disk operations. Efficient algorithms for storage and access of index directory are proposed, in order to minimize memory usage and lookup operations complexities. Estimations of complexities for these algorithms are presented. A comparison of our approach to support effective grid file structure with other known approaches is presented. This comparison shows effectiveness of suggested metadata storage environment. An estimation of directory size is presented. A prototype to support of our grid file concept has been created and experimentally compared with MongoDB (a renowned NoSQL database). Comparison results show effectiveness of our approach in the cases of given point lookup, lookup by wide ranges and closest objects lookup when considering more than one dimension, and also better memory usage.
References
[2] Chang F., Dean J., Ghemawat S., Hsieh W.C., Wallach D.A., Burrows M., Chandra T., Fikes A., Gruber R.E. Bigtable: a distributed storage system for structured data. Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI’06), Seattle, Washington, November 06 – 08, 2006. p. 205-218. Available at: http://tab.d-thinker.org/showthread.php?tid=4 (accessed 11.01.18).
[3] Ding B., Konig A.C. Fast Set Intersection in Memory. Proceedings of the 37th International Conference on Very Large Databases (VLDB 2011), Seattle, Washington, USA: Very Large Data Bases Endowment Inc., 2011; 4(1-12):255–266. Available at: https://www.microsoft.com/en-us/research/wp-content/uploads/2011/01/p255-DingKoenig.pdf (accessed 11.01.18).
[4] Garcia-Molina H., Ullman J., Widom J. Database Systems: The Complete Book. Prentice Hall, USA. 2009. Available at: http://infolab.stanford.edu/~ullman/dscb.html (accessed 11.01.18).
[5] Gevorgyan G. An Effective Dynamic Structure for Grid file Organization. Russian-Armenian (Slavonic) University Bulletin. 2016; (l):5-17. Available at: https://elibrary.ru/item.asp?id=30752424 (accessed 11.01.18).
[6] Gevorgyan G. An Approach to Data Integration: Model, Algorithms and Verification // PhD thesis: National Academy of Sciences of the Republic of Armenia, Yerevan, 2016.
[7] Gevorgyan, G., Manukyan, M. Effective Algorithms to Support Grid Files. Russian-Armenian (Slavonic) University Bulletin. 2015; (2):22–38. Available at: https://elibrary.ru/item.asp?id=30752417 (accessed 11.01.18).
[8] Luo C., Hou W.C., Wang C.F., Want H., Yu X. Grid File for Efficient Data Cube Storage. Computers and their Applications. 2006. p. 424–429.
[9] Manukyan M., Gevorgyan G. Canonical Data Model for Data Warehouse. Communications in Computer and Information Science. 2016; 637:72-79. DOI: https://doi.org/10.1007/978-3-319-44066-8_8
[10] Manukyan M. Canonical Model: Construction Principles. Proceedings of the 16th International Conference on Information Integration and Web-based Applications & Services (iiWAS '14), Maria Indrawan-Santiago, Matthias Steinbauer, Hong-Quang Nguyen, A. Min Tjoa, Ismail Khalil, and Gabriele Anderst-Kotsis (Eds.). ACM, New York, NY, USA, 2014. p. 320-329. DOI: http://dx.doi.org/10.1145/2684200.2684278
[11] Manukyan M. On an Approach to Data Integration: Concept, Formal Foundations and Data Model. CEUR Workshop Proceedings. 2017; 2022:206-213. Available at: http://ceur-ws.org/Vol-2022/paper34.pdf (accessed 11.01.18).
[12] Nievergelt J., Hinterberger H., Sevcik K. The Grid File: An Adaptable, Symmetric, Multikey File Structure. ACM Transactions on Database Systems. 1984; 9(1):38–71. DOI: http://dx.doi.org/10.1145/348.318586
[13] Otoo E.J. A Mapping Function for the Directory of a Multidimensional Extendible Hashing. Proceedings of the 10th International Conference on Very Large Data Bases (VLDB '84), Umeshwar Dayal, Gunter Schlageter, and Lim Huat Seng (Eds.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1984. p. 493-506.
[14] Otoo E.J. A multidimensional digital hashing scheme for files with composite keys. Proceedings of the 1985 ACM SIGMOD international conference on Management of data (SIGMOD '85). ACM, New York, NY, USA, 1985. p. 214-229. DOI: http://dx.doi.org/10.1145/318898.318918
[15] Papadopoulos A.N., Manolopoulos Y., Theodoridis Y., Tsoras V. Grid File (and family). Encyclopedia of Database Systems, Springer, Boston, MA, 2009. p. 1279–1282.
[16] Regnier M. Analysis of Grid File Algorithms. BIT - Computer Science Section. 1985; 25(2):335–358.
[17] Robinson I., Webber J., Eifrem E. Graph Databases. 2nd ed. O’Reilly, USA, 2015. 237 p.
[18] Sharma S., Tim U.S., Wong J., Gadia S., Sharma S. A Brief Review on Leading Big Data Models. Data Science Journal. 2014; 13:138-157. DOI: https://doi.org/10.2481/dsj.14-041
[19] Stonebraker M., Brown P., Poliakov A., Raman S. The Architecture of SciDB. In: Bayard Cushing J., French J., Bowers S. (eds) Scientific and Statistical Database Management. SSDBM 2011. Lecture Notes in Computer Science. Vol. 6809. Springer, Berlin, Heidelberg, 2011. DOI: https://doi.org/10.1007/978-3-642-22351-8_1
[20] Brown P.G. Overview of SciDB: large scale array storage, processing and analysis. Proceedings of the 2010 ACM SIGMOD International Conference on Management of data (SIGMOD '10). ACM, New York, NY, USA, 2010. p. 963-968. DOI: https://doi.org/10.1145/1807167.1807271

This work is licensed under a Creative Commons Attribution 4.0 International License.
Publication policy of the journal is based on traditional ethical principles of the Russian scientific periodicals and is built in terms of ethical norms of editors and publishers work stated in Code of Conduct and Best Practice Guidelines for Journal Editors and Code of Conduct for Journal Publishers, developed by the Committee on Publication Ethics (COPE). In the course of publishing editorial board of the journal is led by international rules for copyright protection, statutory regulations of the Russian Federation as well as international standards of publishing.
Authors publishing articles in this journal agree to the following: They retain copyright and grant the journal right of first publication of the work, which is automatically licensed under the Creative Commons Attribution License (CC BY license). Users can use, reuse and build upon the material published in this journal provided that such uses are fully attributed.