Prediction of Drug-Induced Liver Toxicity Using SVM and Optimal Descriptor Sets

Keerthana Jaganathan, Hilal Tayara*, Kil To Chong*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review



Drug-induced liver toxicity is a significant safety challenge for both patients and the pharmaceutical industry. It leads to the termination of drug candidates in clinical trials and to the withdrawal of approved drugs from the market. Thus, it is essential to identify hepatotoxic compounds in the early stages of the drug development process. The purpose of this study is to construct quantitative structure-activity relationship models using machine learning algorithms and systematic feature selection methods for molecular descriptor sets. The models were built from a large and diverse set of 1253 drug compounds and were validated internally with 10-fold cross-validation. We applied a variety of feature selection techniques to extract the optimal subset of descriptors as modeling features and thereby improve prediction performance. Experimental results suggested that the support vector machine-based classifier achieved better classification accuracy with a reduced set of molecular descriptors. The final optimal model provides an accuracy of 0.811, a sensitivity of 0.840, a specificity of 0.783, and a Matthews correlation coefficient of 0.623 on the internal validation set. Furthermore, this model outperformed prior studies when evaluated on both the internal and external test sets. The use of distinct optimal molecular descriptors as modeling features produces an in silico model with superior performance.
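
The four metrics reported in the abstract (accuracy, sensitivity, specificity, and the Matthews correlation coefficient) are all derived from the counts of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code. The function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fn: toxic compounds predicted toxic / non-toxic.
    tn/fp: non-toxic compounds predicted non-toxic / toxic.
    Assumes each class has at least one sample.
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    # Matthews correlation coefficient; conventionally 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not data from the paper).
metrics = binary_metrics(tp=84, fn=16, tn=78, fp=22)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, which is why it is often reported alongside sensitivity and specificity for toxicity classifiers.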
Original language: English
Article number: 8073
Number of pages: 17
Journal: International Journal of Molecular Sciences
Issue number: 15
Publication status: Published - 28 Jul 2021
Externally published: Yes
