River water quality index prediction and uncertainty analysis: A comparative study of machine learning models

Seyed Babak Haji Seyed Asadollah*, Ahmad Sharafati*, Davide Motta*, Zaher Mundher Yaseen*

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    322 Citations (Scopus)

    Abstract

    The Water Quality Index (WQI) is the most common indicator to characterize surface water quality. This study introduces a new ensemble machine learning model called Extra Tree Regression (ETR) for predicting monthly WQI values at the Lam Tsuen River in Hong Kong. The ETR model performance is compared with that of the classic standalone models, Support Vector Regression (SVR) and Decision Tree Regression (DTR). The monthly input water quality data including Biochemical Oxygen Demand (BOD), Chemical Oxygen Demand (COD), Dissolved Oxygen (DO), Electrical Conductivity (EC), Nitrate-Nitrogen ($\mathrm{NO_3}$-N), Nitrite-Nitrogen ($\mathrm{NO_2}$-N), Phosphate ($\mathrm{PO_4^{3-}}$), potential for Hydrogen (pH), Temperature (T) and Turbidity (TUR) are used for building the prediction models. Various input data combinations are investigated and assessed in terms of prediction performance, using numerical indices and graphical comparisons. The analysis shows that the ETR model generally produces more accurate WQI predictions for both training and testing phases. Although including all the ten input variables achieves the highest prediction performance ($R^2_{\mathrm{test}} = 0.98$, $\mathrm{RMSE}_{\mathrm{test}} = 2.99$), a combination of input parameters including only BOD, Turbidity and Phosphate concentration provides the second highest prediction accuracy ($R^2_{\mathrm{test}} = 0.97$, $\mathrm{RMSE}_{\mathrm{test}} = 3.74$). The uncertainty analysis relative to model structure and input parameters highlights a higher sensitivity of the prediction results to the former. In general, the ETR model represents an improvement on previous approaches for WQI prediction, in terms of prediction performance and reduction in the number of input parameters.
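
    The two numerical indices quoted in the abstract can be illustrated with a minimal sketch. It assumes that R² denotes the coefficient of determination (the paper may instead use the squared correlation coefficient) and that RMSE is the usual root mean square error. The function names `rmse` and `r2` and the sample values are hypothetical and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1 - ss_res / ss_tot


# Made-up WQI values for illustration only (not data from the study).
observed = [70.0, 80.0, 90.0, 60.0]
predicted = [72.0, 78.0, 91.0, 59.0]

print(f"RMSE = {rmse(observed, predicted):.2f}")  # RMSE = 1.58
print(f"R^2  = {r2(observed, predicted):.2f}")    # R^2  = 0.98
```

    A lower RMSE and an R² closer to 1 both indicate predictions closer to the observed WQI, which is the sense in which the abstract ranks the input combinations.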
    Original language: English
    Article number: 104599
    Number of pages: 14
    Journal: Journal of Environmental Chemical Engineering
    Volume: 9
    Issue number: 1
    Early online date: 18 Oct 2020
    Publication status: Published - 1 Feb 2021

    Keywords

    • Water quality index
    • River water quality
    • Lam Tsuen river
    • Ensemble machine learning
