TY - JOUR
T1 - XGBoost algorithm assisted multi-component quantitative analysis with Raman spectroscopy
AU - Wang, Qiaoyun
AU - Zou, Xin
AU - Chen, Yinji
AU - Zhu, Ziheng
AU - Yan, Chongyue
AU - Shan, Peng
AU - Wang, Shuyu
AU - Fu, Yongqing
PY - 2024/7/31
Y1 - 2024/7/31
N2 - To improve prediction performance and reduce artifacts in Raman spectra, we developed an eXtreme Gradient Boosting (XGBoost) preprocessing method and applied it to the Raman spectra of glucose, glycerol and ethanol mixtures. To test the robustness and reliability of the XGBoost preprocessing method, we developed an X-LR model (XGBoost preprocessing combined with a linear regression (LR) model) and an X-MLP model (XGBoost preprocessing combined with a multilayer perceptron (MLP) model). These two models were used to quantify the concentrations of glucose, glycerol and ethanol from the Raman spectra of mixed solutions. A proportion map of hyperparameters was first used to narrow the hyperparameter search range for the X-LR and X-MLP models. The correlation coefficient (R²), root mean square error of calibration (RMSEC), and root mean square error of prediction (RMSEP) were then used to evaluate the models’ performance. Experimental results indicated that the XGBoost preprocessing method achieved higher accuracy and better generalization than other preprocessing methods for both the LR and MLP models.
AB - To improve prediction performance and reduce artifacts in Raman spectra, we developed an eXtreme Gradient Boosting (XGBoost) preprocessing method and applied it to the Raman spectra of glucose, glycerol and ethanol mixtures. To test the robustness and reliability of the XGBoost preprocessing method, we developed an X-LR model (XGBoost preprocessing combined with a linear regression (LR) model) and an X-MLP model (XGBoost preprocessing combined with a multilayer perceptron (MLP) model). These two models were used to quantify the concentrations of glucose, glycerol and ethanol from the Raman spectra of mixed solutions. A proportion map of hyperparameters was first used to narrow the hyperparameter search range for the X-LR and X-MLP models. The correlation coefficient (R²), root mean square error of calibration (RMSEC), and root mean square error of prediction (RMSEP) were then used to evaluate the models’ performance. Experimental results indicated that the XGBoost preprocessing method achieved higher accuracy and better generalization than other preprocessing methods for both the LR and MLP models.
KW - Glucose
KW - Linear regression
KW - Multilayer perceptron
KW - Raman spectroscopy
KW - XGBoost
UR - http://www.scopus.com/inward/record.url?scp=85200218447&partnerID=8YFLogxK
U2 - 10.1016/j.saa.2024.124917
DO - 10.1016/j.saa.2024.124917
M3 - Article
AN - SCOPUS:85200218447
SN - 1386-1425
VL - 323
JO - Spectrochimica Acta - Part A: Molecular and Biomolecular Spectroscopy
JF - Spectrochimica Acta - Part A: Molecular and Biomolecular Spectroscopy
M1 - 124917
ER -