Abstract
To improve prediction performance and reduce artifacts in Raman spectra, we developed an eXtreme Gradient Boosting (XGBoost) method for preprocessing the Raman spectra of glucose, glycerol and ethanol mixtures. To verify the robustness and reliability of this preprocessing method, we built two models: X-LR, which combines XGBoost preprocessing with linear regression (LR), and X-MLP, which combines XGBoost preprocessing with a multilayer perceptron (MLP). Both models were used to quantify the concentrations of glucose, glycerol and ethanol from the Raman spectra of mixed solutions. A proportion map of hyperparameters was first used to narrow the hyperparameter search range for the X-LR and X-MLP models. The coefficient of determination (R²), root mean square error of calibration (RMSEC), and root mean square error of prediction (RMSEP) were then used to evaluate model performance. Experimental results indicated that the XGBoost preprocessing method achieved higher accuracy and better generalization than the other preprocessing methods for both the LR and MLP models.
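The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the usual chemometrics conventions, which the abstract itself does not spell out: RMSEC and RMSEP are the same root-mean-square-error formula applied to the calibration set and the held-out prediction set respectively, the mean is taken over the number of samples, and R² is computed as the coefficient of determination. The function names and the concentration values are hypothetical and are not data from the paper.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Hypothetical reference and predicted concentrations (arbitrary units).
y_cal_true = [1.0, 2.0, 3.0, 4.0]   # calibration set, reference values
y_cal_pred = [1.1, 1.9, 3.2, 3.9]   # calibration set, model output
y_test_true = [1.5, 2.5, 3.5]       # prediction set, reference values
y_test_pred = [1.7, 2.4, 3.2]       # prediction set, model output

rmsec = rmse(y_cal_true, y_cal_pred)    # error on data used to fit the model
rmsep = rmse(y_test_true, y_test_pred)  # error on data the model has not seen
r2_test = r_squared(y_test_true, y_test_pred)

print(f"RMSEC={rmsec:.4f}  RMSEP={rmsep:.4f}  R2(prediction)={r2_test:.4f}")
```

Under these assumptions, an RMSEP far above RMSEC would suggest overfitting to the calibration spectra, which is why the two values are normally reported together.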
Original language | English |
---|---|
Article number | 124917 |
Journal | Spectrochimica Acta - Part A: Molecular and Biomolecular Spectroscopy |
Volume | 323 |
Early online date | 31 Jul 2024 |
Publication status | Published - 15 Dec 2024 |
Keywords
- Glucose
- Linear regression
- Multilayer perceptron
- Raman spectroscopy
- XGBoost