Smoothing the Boundaries between the Domains of the Surface Computational Mesh
Abstract
High-quality decomposition of computational meshes is essential for the efficient use of supercomputer resources in simulation problems. Modern computational problems operate on input data too large to process on a single computational node in a reasonable time. In particular, calculations on large computational meshes require the mesh to be decomposed into separate domains. Each domain is then processed on its own computational node, and, to keep the problem consistent, the nodes exchange the computational data that cross the domain boundaries. The quality of a mesh decomposition is characterized by parameters such as the uniformity of the distribution of cells over domains, the maximum length of the boundary between two domains, the total length of the boundaries between all pairs of domains, and many others. The most important indicator is the uniformity of the cell distribution, since the computation speed of the entire problem is determined primarily by the time needed to process the largest domain. However, the maximum length of the boundary between a pair of domains is also of high importance, since it determines the speed of data exchange between computational nodes. As the time step decreases and the number of computational nodes grows (a high degree of parallelization), the share of interprocess communication increases significantly, so reducing the length of the boundaries between domains noticeably reduces the total run time of the computational problem.
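The three quality parameters named above can be made concrete with a short sketch. It is an illustration only, not code from the article: the function name `decomposition_metrics`, the mesh representation (a list giving each cell's domain plus a list of `(cell_a, cell_b, length)` triples for adjacent cells), and the choice of "largest domain size divided by mean domain size" as the uniformity measure are all assumptions made here.

```python
from collections import Counter


def decomposition_metrics(domain_of, edges):
    """Compute simple quality metrics of a mesh decomposition.

    domain_of: list where domain_of[i] is the domain id of cell i (assumed layout).
    edges: iterable of (cell_a, cell_b, length) for each pair of adjacent cells,
           where length is the length of their shared edge (assumed layout).
    """
    # Uniformity: ratio of the largest domain to the mean domain size (1.0 is ideal).
    sizes = Counter(domain_of)
    mean_size = len(domain_of) / len(sizes)
    imbalance = max(sizes.values()) / mean_size

    # Boundary length for every unordered pair of distinct domains.
    boundary = Counter()
    for a, b, length in edges:
        da, db = domain_of[a], domain_of[b]
        if da != db:
            boundary[(min(da, db), max(da, db))] += length

    return {
        "imbalance": imbalance,
        "max_boundary": max(boundary.values(), default=0.0),
        "total_boundary": sum(boundary.values()),
    }
```

Under these assumptions, `imbalance` reflects the time spent on the largest domain, while `max_boundary` is a proxy for the slowest pairwise data exchange between computational nodes.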
This article discusses an algorithm for smoothing the boundaries between domains that reduces the boundary lengths without impairing the uniformity of the distribution of mesh cells between domains. Numerical experiments have shown that the proposed algorithm reduces the lengths of the boundaries between domains by about 10%, which accelerates data exchange during calculations.
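The abstract does not spell out the smoothing procedure, so the following is only a generic greedy sketch of the idea, not the authors' algorithm. It assumes the mesh is given as a per-cell domain list and `(cell_a, cell_b, length)` adjacency triples, and it adopts one possible reading of "not impairing uniformity": no domain may grow beyond the size of the initially largest domain. The names `smooth_boundaries`, `total_boundary_length`, and `max_passes` are hypothetical.

```python
from collections import Counter, defaultdict


def total_boundary_length(domain_of, edges):
    """Sum the lengths of all edges whose two cells lie in different domains."""
    return sum(length for a, b, length in edges if domain_of[a] != domain_of[b])


def smooth_boundaries(domain_of, edges, max_passes=10):
    """Greedy boundary smoothing (illustrative heuristic, not the article's method).

    A cell is moved to a neighboring domain when it shares strictly more edge
    length with that domain than with its own, which strictly shortens the
    total boundary. A move is rejected if the target domain would exceed the
    size of the initially largest domain (assumed uniformity constraint).
    """
    domain_of = list(domain_of)          # work on a copy
    sizes = Counter(domain_of)
    size_cap = max(sizes.values())       # largest domain must not grow

    neighbors = defaultdict(list)
    for a, b, length in edges:
        neighbors[a].append((b, length))
        neighbors[b].append((a, length))

    for _ in range(max_passes):
        moved = False
        for cell in range(len(domain_of)):
            src = domain_of[cell]
            # Edge length this cell shares with each adjacent domain.
            shared = Counter()
            for other, length in neighbors[cell]:
                shared[domain_of[other]] += length
            for dst, dst_length in shared.most_common():
                if dst == src:
                    continue
                shortens = dst_length > shared[src]
                keeps_balance = sizes[dst] + 1 <= size_cap and sizes[src] > 1
                if shortens and keeps_balance:
                    domain_of[cell] = dst
                    sizes[src] -= 1
                    sizes[dst] += 1
                    moved = True
                    break
        if not moved:
            break
    return domain_of
```

Because every accepted move strictly decreases the total boundary length, the loop cannot cycle; `max_passes` is only a safety bound. A production algorithm would likely also consider swapping cells between domains, which this sketch omits.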

This work is licensed under a Creative Commons Attribution 4.0 International License.