TY - JOUR
T1 - Explainable AI Evaluation
T2 - A Top-Down Approach for Selecting Optimal Explanations for Black Box Models
AU - Mirzaei, Seyedehroksana
AU - Mao, Hua
AU - Al-Nima, Raid Rafi Omar
AU - Woo, Wai Lok
PY - 2023/12/20
Y1 - 2023/12/20
N2 - Explainable Artificial Intelligence (XAI) evaluation has grown significantly due to its extensive adoption, and the catastrophic consequence of misinterpreting sensitive data, especially in the medical field. However, the multidisciplinary nature of XAI research resulted in diverse scholars possessing significant challenges in designing proper evaluation methods. This paper proposes a novel framework of a three-layered top-down approach on how to arrive at an optimal explainer, accenting the persistent need for consensus in XAI evaluation. This paper also investigates a critical comparative evaluation of explanations in both model agnostic and specific explainers including LIME, SHAP, Anchors, and TabNet, aiming to enhance the adaptability of XAI in a tabular domain. The results demonstrate that TabNet achieved the highest classification recall followed by TabPFN, and XGBoost. Additionally, this paper develops an optimal approach by introducing a novel measure of relative performance loss with emphasis on faithfulness and fidelity of global explanations by quantifying the extent to which a model’s capabilities diminish when eliminating topmost features. This addresses a conspicuous gap in the lack of consensus among researchers regarding how global feature importance impacts classification loss, thereby undermining the trust and correctness of such applications. Finally, a practical use case on medical tabular data is provided to concretely illustrate the findings.
AB - Explainable Artificial Intelligence (XAI) evaluation has grown significantly due to its extensive adoption, and the catastrophic consequence of misinterpreting sensitive data, especially in the medical field. However, the multidisciplinary nature of XAI research resulted in diverse scholars possessing significant challenges in designing proper evaluation methods. This paper proposes a novel framework of a three-layered top-down approach on how to arrive at an optimal explainer, accenting the persistent need for consensus in XAI evaluation. This paper also investigates a critical comparative evaluation of explanations in both model agnostic and specific explainers including LIME, SHAP, Anchors, and TabNet, aiming to enhance the adaptability of XAI in a tabular domain. The results demonstrate that TabNet achieved the highest classification recall followed by TabPFN, and XGBoost. Additionally, this paper develops an optimal approach by introducing a novel measure of relative performance loss with emphasis on faithfulness and fidelity of global explanations by quantifying the extent to which a model’s capabilities diminish when eliminating topmost features. This addresses a conspicuous gap in the lack of consensus among researchers regarding how global feature importance impacts classification loss, thereby undermining the trust and correctness of such applications. Finally, a practical use case on medical tabular data is provided to concretely illustrate the findings.
KW - XAI
KW - XAI evaluation
KW - black boxes
KW - explainability
KW - explanation criteria
KW - interpretability
UR - https://www.scopus.com/pages/publications/85183077171
U2 - 10.3390/info15010004
DO - 10.3390/info15010004
M3 - Article
SN - 2078-2489
VL - 15
JO - Information
JF - Information
IS - 1
M1 - 4
ER -