Study of the Structure of a Heterogeneous Reconfigurable Computer for Efficient Operation of the Matrix Ant Colony Method
Abstract
The advancement of modern computing systems focuses on increasing the number of computational units and accelerators, including heterogeneous types such as matrix accelerators, MIMD, SIMD, and others. This trend necessitates the modernization of computational algorithms to align with the underlying architecture of the computer. This paper presents a matrix modification of the Ant Colony Optimization (ACO) method for addressing parametric problems. In this context, the ACO method employs a graph data structure where each vertex corresponds to a specific value of a parameter, and the ant agent is tasked with selecting one value for each parameter. This framework allows us to interpret the workings of ACO as matrix operations. The proposed matrix modification, which utilizes an optimized graph structure, achieves speedups ranging from 13 to 22 times compared to the original method when executed on a CPU. Much of this acceleration can be attributed to the capabilities of modern optimizing C++ compilers. Additionally, we introduce an algorithm for implementing matrix ACO on SIMD accelerators and heterogeneous computing systems. Research conducted on an accelerator using NVIDIA CUDA technology demonstrated speedups between 7 and 18 times. To effectively apply ACO in a CUDA environment, it is essential to revise the algorithm, categorize data by memory types, properly organize streams and blocks, and address synchronization and data transfer challenges between the CPU and GPU. Furthermore, we propose an algorithm for heterogeneous computers that performs matrix transformations on the CPU while calculating the path of the ant agent on the GPU, achieving acceleration factors ranging from 24 to 38 times. A theoretical analysis of the efficiency of utilizing heterogeneous matrix ACO on a system comprising MIMD cores and SIMD accelerators has also been conducted. This study includes a novel approach for determining the theoretical limits of algorithmic acceleration and outlines an optimal architecture for a reconfigurable heterogeneous computer.

This work is licensed under a Creative Commons Attribution 4.0 International License.
Publication policy of the journal is based on traditional ethical principles of the Russian scientific periodicals and is built in terms of ethical norms of editors and publishers work stated in Code of Conduct and Best Practice Guidelines for Journal Editors and Code of Conduct for Journal Publishers, developed by the Committee on Publication Ethics (COPE). In the course of publishing editorial board of the journal is led by international rules for copyright protection, statutory regulations of the Russian Federation as well as international standards of publishing. 
Authors publishing articles in this journal agree to the following: They retain copyright and grant the journal right of first publication of the work, which is automatically licensed under the Creative Commons Attribution License (CC BY license). Users can use, reuse and build upon the material published in this journal provided that such uses are fully attributed.
 
							 
				 
							 
								 
								 
								 
								 
								 
								 
								 
								 
								 
								