Efficient Reordering of Multiple Memory Operations in a Multithreaded Program
Abstract
This paper is devoted to various methods of constructing all kinds of linear orders for partially ordered sets. The complexity of the problem lies in the fact that even for a set of small size, constructing and checking for any feature of its possible linear orders can lead to the need of enumeration of huge number of options. A special case of this general problem is the construction of all possible reorderings for memory operations in multithreaded programs, taking into account the memory model. The goal of this work is to reduce the search space in this case. An important practical example is the consideration of memory models that allow buffering of downloads. In this case, it becomes possible to establish equivalence classes over permutations of operations. This in turn is of great importance for memory verification in multithreaded systems. Today, litmus tests are used for this type of verification, which have their limitations. In this paper, an algorithm is proposed that takes into account equivalence classes when reordering memory operations, which can significantly reduce both the number of necessary permutations and the time spent on their enumeration. The results of the study allow us to effectively generalize the concept of litmus tests to arbitrary partially ordered sets and significantly expand their applicability.
References
2. Banks J., et al. Using TPA to count linear extensions. Journal of Discrete Algorithms. 2018;51:1-11. https://doi.org/10.1016/j.jda.2018.04.001
3. Bennett A.J., Field T., Harrison P. Modelling and validation of shared memory coherency protocols. Performance Evaluation. 1996;27-28:541-563. https://doi.org/10.1016/S0166-5316(96)90045-0
4. Petrot F., Greiner A., Gomez P. On Cache Coherency and Memory Consistency Issues in NoC Based Shared Memory Multiprocessor SoC Architectures. In: 9th EUROMICRO Conference on Digital System Design (DSD'06). Cavtat, Croatia: IEEE Press; 2006. p. 53-60. https://doi.org/10.1109/DSD.2006.73
5. Borrmann L., Istavrinos P. Store coherency in a parallel distributed-memory machine. In: Bode A. (ed.) Distributed Memory Computing. EDMCC 1991. Lecture Notes in Computer Science. Vol. 487. Berlin, Heidelberg: Springer; 1991. p. 32-41. https://doi.org/10.1007/BFb0032920
6. Acquaviva J.-T., Jalby W. Experimental analysis of coherency behavior of shared memory scientific applications. In: Proceedings 8th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (Cat. No.PR00728). San Francisco, CA, USA: IEEE Press; 2000. p. 142-151. https://doi.org/10.1109/MASCOT.2000.876439
7. Szőllősi Á., et al. Litmus test of rich episodic representations: Context-induced false recognition. Cognition. 2023;230:105287. https://doi.org/10.1016/j.cognition.2022.105287
8. Pautet L., Robert T., Tardieu S. Litmus-RT plugins for global static scheduling of mixed criticality systems. Journal of Systems Architecture. 2021;118:102221. https://doi.org/10.1016/j.sysarc.2021.102221
9. Brightwell G., Winkler P. Counting linear extensions is #P-complete. In: Proceedings of the twenty-third annual ACM symposium on Theory of Computing (STOC '91). New York, NY, USA: Association for Computing Machinery; 1991. p. 175-181. https://doi.org/10.1145/103418.103441
10. Farley J.D. Linear extensions of ranked posets, enumerated by descents. A problem of Stanley from the 1981 Banff Conference on Ordered Sets. Advances in Applied Mathematics. 2005;34(2):295-312. https://doi.org/10.1016/j.aam.2004.05.007
11. De Loof K., De Baets B., De Meyer H. On the random generation and counting of weak order extensions of a poset with given class cardinalities. Information Sciences. 2007;177(1):220-230. https://doi.org/10.1016/j.ins.2006.04.003
12. Knuth D. Permutations, matrices, and generalized Young tableaux. Pacific journal of mathematics. 1970;34(3):709-727. https://doi.org/10.2140/pjm.1970.34.709
13. Haiman M.D. Dual equivalence with applications, including a conjecture of Proctor. Discrete Mathematics. 1992;99(1-3):79-113. https://doi.org/10.1016/0012-365x(92)90368-p
14. Robinson G. de B. On the representations of the symmetric group. American Journal of Mathematics. 1938;745-760. https://doi.org/10.2307/2371609
15. Dittmer S., Pak I. Counting Linear Extensions of Restricted Posets. The Electronic Journal of Combinatorics. 2020;27(4):4.48. https://doi.org/10.37236/8552
16. Fava D.S., Steffen M., Stolz V. Operational semantics of a weak memory model with channel synchronization. Journal of Logical and Algebraic Methods in Programming. 2019;103:1-30. https://doi.org/10.1016/j.jlamp.2018.10.004
17. Dan A., et al. Effective abstractions for verification under relaxed memory models. Computer Languages, Systems & Structures. 2017;47:62-76. https://doi.org/10.1016/j.cl.2016.02.003
18. Marvik O.A. A method for IC layout verification. In: Proceedings of the 21st Design Automation Conference (DAC '84). Albuquerque, NM, USA: IEEE Press; 1984. p. 708-709. https://doi.org/10.1109/dac.1984.1585890
19. Winter K., Smith G., Derrick J. Modelling concurrent objects running on the TSO and ARMv8 memory models. Science of Computer Programming. 2019;184:102308. https://doi.org/10.1016/j.scico.2019.102308
20. Cohen E., Schirmer B. From Total Store Order to Sequential Consistency: A Practical Reduction Theorem. In: Kaufmann M., Paulson L.C. (eds.) Interactive Theorem Proving. ITP 2010. Lecture Notes in Computer Science. Vol. 6172. Berlin, Heidelberg: Springer; 2010. p. 403-418. https://doi.org/10.1007/978-3-642-14052-5_28
21. Mador-Haim S., Alur R., Martin M.M.K. Litmus tests for comparing memory consistency models: how long do they need to be? In: Proceedings of the 48th Design Automation Conference (DAC '11). New York, NY, USA: Association for Computing Machinery; 2011. p. 504-509. https://doi.org/10.1145/2024724.2024842
22. Jiménez E., Fernández A., Cholvi V. A parametrized algorithm that implements sequential, causal, and cache memory consistencies. Journal of Systems and Software. 2008;81(1):120-131. https://doi.org/10.1016/j.jss.2007.03.012
23. Wrenger L., Töllner D., Lohmann D. Analyzing the memory ordering models of the Apple M1. Journal of Systems Architecture. 2024;149:103102. https://doi.org/10.1016/j.sysarc.2024.103102
24. Tabbakh A., Annavaram M. An efficient sequential consistency implementation with dynamic race detection for GPUs. Journal of Parallel and Distributed Computing. 2024;187:104836. https://doi.org/10.1016/j.jpdc.2023.104836
25. Gopalakrishnan K., Ravi D. Anvil: Best in Class Multiprocessor Coherency Verification Tool. In: 2017 18th International Workshop on Microprocessor and SOC Test and Verification (MTV). Austin, TX, USA: IEEE Press; 2017. p. 21-25. https://doi.org/10.1109/mtv.2017.18

This work is licensed under a Creative Commons Attribution 4.0 International License.
Publication policy of the journal is based on traditional ethical principles of the Russian scientific periodicals and is built in terms of ethical norms of editors and publishers work stated in Code of Conduct and Best Practice Guidelines for Journal Editors and Code of Conduct for Journal Publishers, developed by the Committee on Publication Ethics (COPE). In the course of publishing editorial board of the journal is led by international rules for copyright protection, statutory regulations of the Russian Federation as well as international standards of publishing.
Authors publishing articles in this journal agree to the following: They retain copyright and grant the journal right of first publication of the work, which is automatically licensed under the Creative Commons Attribution License (CC BY license). Users can use, reuse and build upon the material published in this journal provided that such uses are fully attributed.