Electrical & Computer Engineering Publications

Accelerating Single Iteration Performance of CUDA-Based 3D Reaction-Diffusion Simulations

John K. Holmen
David L. Foster, Kettering University

Document Type

Article

Publication Date

4-1-2014

Publication Title

International Journal of Parallel Programming

Abstract

The most commonly used approach for solving reaction–diffusion systems relies upon stencil computations. Although stencil computations feature low compute intensity, they place high demands on memory bandwidth. Fortunately, GPU computing allows for the heavy reliance of stencil computations on neighboring data points to be exploited to significantly increase simulation speeds by reducing these memory bandwidth demands. A review of previously published works shows that a wide variety of efforts have been made to optimize NVIDIA CUDA-based stencil computations. However, a critical aspect contributing to algorithm performance is commonly glossed over: the halo region loading technique utilized in conjunction with a given spatial blocking technique. This paper presents an in-depth examination of this aspect and the associated single iteration performance impacts when using symmetric, nearest neighbor 19-point stencils. This is accomplished by closely examining how the simulated space is partitioned into thread blocks and the balance between memory accesses, divergence, and computing threads. The resulting optimization strategy for accelerating 3-dimensional reaction–diffusion simulations offers up to 2.45 times speedup for single-precision floating point numbers in reference to GPU-based speedups found within the previously published work that this paper directly extends. In reference to our multithreaded CPU-based implementation, the resulting optimization strategy offers up to 8.69 times speedup for single-precision floating point numbers.

Volume

Issue

First Page

343

Last Page

363

DOI

https://doi.org/10.1007/s10766-013-0251-z

ISSN

0885-7458

Comments

ESSN: 1573-7640

Rights

Recommended Citation

Holmen, John K. and Foster, David L., "Accelerating Single Iteration Performance of CUDA-Based 3D Reaction-Diffusion Simulations" (2014). Electrical & Computer Engineering Publications. 37.
https://digitalcommons.kettering.edu/electricalcomp_eng_facultypubs/37

Digital Commons @ Kettering University