The Architecture of a Parallel-Pipeline Data Processing Complex for Heterogeneous Computing Environment
- Authors: Talalaev AA1, Fralenko VP1
- Affiliations:
- Institute of program systems of the Russian Academy of Science
- Issue: No 3 (2013)
- Pages: 113-117
- Section: Articles
- URL: https://journals.rudn.ru/miph/article/view/8420
Abstract
A heterogeneous computing environment uses several types of computational units. An example of such an environment is a GPU cluster, which combines general-purpose central processing units (CPUs) with special-purpose graphics processing units (GPUs). Today's GPUs already far outperform CPUs and, despite the limitations imposed by the GPGPU concept (general-purpose computing on graphics processing units), parallel algorithms developed under it are applied to problems that require intensive computation. Building a so-called "GPU cluster" can be an effective solution with an acceptable price/performance ratio and, most importantly, the ability to scale system performance easily. Several types of high-performance concurrent algorithms, including task parallelism and data parallelism, are also relevant to GPU clusters. This paper analyzes their applicability as a basis for parallel-pipeline data processing. Variants of building high-performance algorithms are investigated, and a scheme for adapting previously developed software to the new conditions is proposed. A library of GPU-computing algorithms must, first of all, have a thread-safe implementation (code is thread-safe if its functions work correctly when several computing threads run in parallel). Resource sharing between competing threads is an important question that requires attention. To assess the impact of this factor on the efficiency of solving an applied problem, we performed an experiment that identifies the bottlenecks arising when competing threads work with a GPU cluster. The effective threshold up to which increasing the number of processing threads can be expected to further accelerate the computations was estimated.
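The thread-safety requirement stated in the abstract can be illustrated with a minimal Python sketch. It is not the authors' library and does not touch a GPU: the names `SharedAccumulator` and `run_workers` are hypothetical, and a lock-protected counter stands in for any resource shared by competing processing threads.

```python
import threading


class SharedAccumulator:
    """Hypothetical shared resource; a lock serializes every update."""

    def __init__(self):
        self._lock = threading.Lock()
        self._total = 0

    def add(self, value):
        # Only one thread at a time may execute the read-modify-write.
        with self._lock:
            self._total += value

    @property
    def total(self):
        with self._lock:
            return self._total


def run_workers(num_threads, items_per_thread):
    """Start num_threads workers that all update one shared accumulator."""
    acc = SharedAccumulator()

    def worker():
        for _ in range(items_per_thread):
            acc.add(1)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return acc.total
```

Because every update goes through the lock, `run_workers(n, k)` returns `n * k` regardless of how the threads are scheduled. The same lock is also a point of contention: the more threads compete for it, the more time each spends waiting, which is the kind of bottleneck the experiment in the paper sets out to measure.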
About the authors
A A Talalaev
Institute of program systems of the Russian Academy of Science
Email: arts@arts.botik.ru
V P Fralenko
Institute of program systems of the Russian Academy of Science
Email: alarmod@pereslavl.ru