Explainable AI and tree-based ensemble models: a comparative study in predicting chemical pulmonary toxicity

Keerthana Jaganathan*, P. R. Geethika, Shanmugam Ramakrishnan, Dhanasekar Sundaram

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Chemical-induced pulmonary toxicity, characterized by adverse respiratory effects from various drugs or chemicals, is a growing concern for the pharmaceutical and chemical sectors, as well as for public health. Traditional toxicity prediction methods are expensive and demand significant time and effort. In response to these challenges, we focus on computational models to identify potential pulmonary toxicants early in the drug development process. Early identification of toxicity not only enhances the safety and efficiency of drugs and chemicals but also helps prevent late-stage drug withdrawals. In this study, we compared various sets of molecular descriptors and fingerprints computed with the Mordred and RDKit software. We systematically employed feature selection techniques to identify the key molecular and structural features that significantly affect the model’s performance. We then applied a variety of tree-based ensemble machine-learning algorithms to build the proposed model, using tenfold cross-validation during model development. We subsequently evaluated the proposed model’s performance on both a test set and a separate external validation set to assess its reliability. The proposed optimal tree-ensemble model achieved an accuracy of 85.07% during tenfold cross-validation and 86.88% on the test set. Additionally, we applied the SHapley Additive exPlanations (SHAP) approach to gain deeper insight into the molecular features that most influence pulmonary toxicity predictions. The proposed model is thus a promising tool for the early screening of potential pulmonary toxic compounds, enhancing chemical safety and providing interpretability for the predictions.
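The workflow outlined in the abstract (feature selection, a tree-based ensemble, tenfold cross-validation, then a held-out test set) can be illustrated with a minimal scikit-learn sketch. This is not the authors' code: the data are synthetic stand-ins for a descriptor matrix, and the choice of `SelectKBest`, a random forest, `k=20`, and all other settings are assumptions made only for illustration. Descriptor calculation with Mordred or RDKit and the SHAP analysis are not shown.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline

# Assumption: synthetic data standing in for a descriptor matrix
# (rows = compounds, columns = molecular descriptors, y = toxic / non-toxic).
X, y = make_classification(
    n_samples=400, n_features=50, n_informative=8, random_state=0
)

# Hold out a test set that plays no part in model development.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Feature selection sits inside the pipeline so it is refit on each
# training fold; the selector and k=20 are hypothetical choices.
model = make_pipeline(
    SelectKBest(score_func=f_classif, k=20),
    RandomForestClassifier(n_estimators=200, random_state=0),
)

# Tenfold stratified cross-validation on the training portion.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
cv_scores = cross_val_score(model, X_train, y_train, cv=cv, scoring="accuracy")

# Refit on all training data, then score once on the held-out test set.
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)

print(f"10-fold CV accuracy: {cv_scores.mean():.4f}")
print(f"Test-set accuracy:   {test_accuracy:.4f}")
```

Keeping the selector inside the pipeline means the features are chosen from each training fold alone, so the cross-validated accuracy is not inflated by information from the validation fold.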
Original language: English
Pages (from-to): 1-13
Number of pages: 13
Journal: European Physical Journal: Special Topics
Early online date: 30 Aug 2024
Publication status: E-pub ahead of print - 30 Aug 2024