TY - JOUR
T1 - A CUDA-based GPU engine for gprMax: open source FDTD electromagnetic simulation software
AU - Warren, Craig
AU - Giannopoulos, Antonios
AU - Gray, Alan
AU - Giannakis, Iraklis
AU - Patterson, Alan
AU - Wetter, Laura
AU - Hamrah, Andre
PY - 2019/4/1
Y1 - 2019/4/1
AB - The Finite-Difference Time-Domain (FDTD) method is a popular numerical modelling technique in computational electromagnetics. The volumetric nature of the FDTD technique means simulations often require extensive computational resources (both processing time and memory). The simulation of Ground Penetrating Radar (GPR) is one such challenge, where the GPR transducer, subsurface/structure, and targets must all be included in the model, and must all be adequately discretised. Additionally, forward simulations of GPR can necessitate hundreds of models with different geometries (A-scans) to be executed. This is exacerbated by an order of magnitude when solving the inverse GPR problem or when using forward models to train machine learning algorithms. We have developed one of the first open source GPU-accelerated FDTD solvers specifically focused on modelling GPR. We designed optimal kernels for GPU execution using NVIDIA's CUDA framework. Our GPU solver achieved performance throughputs of up to 1194 Mcells/s and 3405 Mcells/s on NVIDIA Kepler and Pascal architectures, respectively. This is up to 30 times faster than the parallelised (OpenMP) CPU solver can achieve on a commonly-used desktop CPU (Intel Core i7-4790K). We found the cost–performance benefit of the NVIDIA GeForce-series Pascal-based GPUs – targeted towards the gaming market – to be especially notable, potentially allowing many individuals to benefit from this work using commodity workstations. We also note that the equivalent Tesla-series P100 GPU – targeted towards data-centre usage – demonstrates significant overall performance advantages due to its use of high-bandwidth memory. The performance benefits of our GPU-accelerated solver were demonstrated in a GPR environment by running a large-scale, realistic (including dispersive media, rough surface topography, and detailed antenna model) simulation of a buried anti-personnel landmine scenario. 
New version program summary: Program Title: gprMax; Program Files doi: http://dx.doi.org/10.17632/kjjm4z87nj.1; Licensing provisions: GPLv3; Programming language: Python, Cython, CUDA; Journal reference of previous version: Comput. Phys. Comm., 209 (2016), 163–170; Does the new version supersede the previous version?: Yes; Reasons for the new version: Performance improvements due to implementation of CUDA-based GPU engine; Summary of revisions: An FDTD solver has been written in CUDA for execution on NVIDIA GPUs. This is in addition to the existing FDTD solver, which has been parallelised using Cython/OpenMP for running on CPUs; Nature of problem: Classical electrodynamics; Solution method: Finite-Difference Time-Domain (FDTD)
KW - CUDA
KW - Finite-Difference Time-Domain
KW - GPR
KW - GPGPU
KW - GPU
KW - NVIDIA
U2 - 10.1016/j.cpc.2018.11.007
DO - 10.1016/j.cpc.2018.11.007
M3 - Article
SN - 0010-4655
VL - 237
SP - 208
EP - 218
JO - Computer Physics Communications
JF - Computer Physics Communications
ER -