Abstract
Lack of well-labelled and coherent training data is the main reason why machine learning (ML) and data-driven interpretations are not established in the field of Ground-Penetrating Radar (GPR). Non-representative and limited datasets lead to non-reliable ML-schemes that overfit, and are unable to compete with traditional deterministic approaches. To that extent, numerical data can potentially complement insufficient measured datasets and overcome this lack of data, even in the presence of large feature spaces.
Using synthetic data in ML is not new and it has been extensively applied to computer vision. Applying numerical data in ML requires a numerical framework capable of generating synthetic but nonetheless realistic datasets. Regarding GPR, such a framework is possible using gprMax, an open source electromagnetic solver, fine-tuned for GPR applications [1], [2], [3]. gprMax is fully parallelised and can be run using multiple CPU’s and GPU’s. In addition, it has a flexible scriptable format that makes it easy to generate big data in a trivial manner. Stochastic geometries, realistic soils, vegetation, targets [3] and models of commercial antennas [4], [5] are some of the features that can be easily incorporated in the training data.
The capability of gprMax to generate realistic numerical datasets is demonstrated in [6], [7]. The investigated problem is assessing the depth and the diameter of rebars in reinforced concrete. Estimating the diameter of rebars using GPR is particularly challenging with no conclusive solution. Using a synthetic training set, generated using gprMax, we managed to effectively train ML-schemes capable of estimating the diameter of rebar in an accurate and efficient manner [6], [7]. The aforementioned case studies support the premise that gprMax has the potential to provide realistic training data to applications where well-labelled data are not available, such as landmine detection, non-destructive testing and planetary sciences.
Using synthetic data in ML is not new and it has been extensively applied to computer vision. Applying numerical data in ML requires a numerical framework capable of generating synthetic but nonetheless realistic datasets. Regarding GPR, such a framework is possible using gprMax, an open source electromagnetic solver, fine-tuned for GPR applications [1], [2], [3]. gprMax is fully parallelised and can be run using multiple CPU’s and GPU’s. In addition, it has a flexible scriptable format that makes it easy to generate big data in a trivial manner. Stochastic geometries, realistic soils, vegetation, targets [3] and models of commercial antennas [4], [5] are some of the features that can be easily incorporated in the training data.
The capability of gprMax to generate realistic numerical datasets is demonstrated in [6], [7]. The investigated problem is assessing the depth and the diameter of rebars in reinforced concrete. Estimating the diameter of rebars using GPR is particularly challenging with no conclusive solution. Using a synthetic training set, generated using gprMax, we managed to effectively train ML-schemes capable of estimating the diameter of rebar in an accurate and efficient manner [6], [7]. The aforementioned case studies support the premise that gprMax has the potential to provide realistic training data to applications where well-labelled data are not available, such as landmine detection, non-destructive testing and planetary sciences.
Original language | English |
---|---|
Number of pages | 2 |
DOIs | |
Publication status | Published - 19 Apr 2021 |
Event | EGU General Assembly 2021: Gather Online - online Duration: 19 Apr 2021 → 30 Apr 2021 https://www.egu21.eu |
Conference
Conference | EGU General Assembly 2021 |
---|---|
Abbreviated title | vEGU21 |
Period | 19/04/21 → 30/04/21 |
Internet address |