AI Explainability 360


The AI Explainability 360 (AIX360) toolkit is a Python library that offers a wide range of explanation types as well as some explainability metrics. AIX360 comes with excellent guidance material, an interactive demo, and developer tutorials. What is particularly good about this material is that it stimulates reflection on which type of explanation is appropriate, not only from a technical point of view but also with respect to who gives the explanation and who receives it (the explainer and the explainee).

This library supports two explainability metrics: faithfulness, which measures how well the importance scores assigned by an explanation method correlate with their effect on the model's prediction (features deemed important should, when removed, change the prediction the most), and monotonicity, which checks whether incrementally adding features in order of increasing importance indeed improves the model's prediction.
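As a rough illustration, the sketch below computes both metrics for a single instance with the helpers in aix360.metrics. The random forest, its built-in feature importances (standing in for an explainer's output), and the mean-value baseline are illustrative choices, and the function signatures should be checked against the installed AIX360 version.

```python
# Hedged sketch: faithfulness and monotonicity for one instance.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from aix360.metrics import faithfulness_metric, monotonicity_metric

data = load_breast_cancer()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

x = data.data[0]                    # instance whose explanation we evaluate
coefs = model.feature_importances_  # stand-in for per-feature importance scores
base = data.data.mean(axis=0)       # baseline values used to "remove" a feature

print("faithfulness:", faithfulness_metric(model, x, coefs, base))
print("monotonicity:", monotonicity_metric(model, x, coefs, base))
```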

Models built with TensorFlow, PyTorch, and scikit-learn are supported.

Supported Algorithms

This library distinguishes two broad categories of explanations: explanations of the dataset and explanations of the model. The guidance chart for AI Explainability 360 algorithms offers a decision tree for selecting an appropriate algorithm.

For explaining the dataset, two methods are supported. ProtoDash finds example instances that are prototypical for the dataset, while DIPVAE (disentangled inferred prior VAE) is a variational autoencoder that learns meaningful latent structure in the dataset.
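A hedged sketch of dataset summarization with ProtoDash is shown below. The explain(X, Y, m) call and its three return values follow the pattern in the AIX360 tutorials, but the argument order and return format should be verified against the installed version; the random data is purely illustrative.

```python
import numpy as np
from aix360.algorithms.protodash import ProtodashExplainer

rng = np.random.default_rng(0)
X = rng.random((500, 10))        # toy dataset, illustrative only

explainer = ProtodashExplainer()
# Select m=5 prototypes from X that best summarize X itself
# (assumed return: weights, selected row indices, objective values).
weights, proto_idx, _ = explainer.explain(X, X, m=5)

print("prototype row indices:", proto_idx)
print("prototype weights:", weights)
```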

All other methods explain models and decisions based on model outputs. Several local (post-hoc) explanations are supported, such as SHAP, LIME, and the contrastive explanation method (CEM), as well as the example-based method ProtoDash. Note that ProtoDash can be used both to represent the dataset in terms of prototypes and to explain a prediction by providing similar examples with the same outcome. This category also includes local surrogate models.
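As an example of a local post-hoc explanation, the sketch below uses AIX360's LIME wrapper, whose constructor and explain_instance call mirror the underlying lime package; the names are taken from the AIX360 examples and may differ between versions.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from aix360.algorithms.lime import LimeTabularExplainer

data = load_iris()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=data.feature_names,
    class_names=list(data.target_names),
    discretize_continuous=True,
)

# Explain a single prediction: which features pushed the model towards its output?
explanation = explainer.explain_instance(
    data.data[0], model.predict_proba, num_features=4
)
print(explanation.as_list())
```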

ProfWeight, in contrast, is a method for training a global surrogate model: it transfers information from a complex, high-performing model to a simpler, interpretable one by reweighting the training samples.

Three methods offer what AIX360 calls direct explanations; in the taxonomy of this project, these are called white box explanations. Boolean Decision Rules via Column Generation (BRCG) and Generalized Linear Rule Models (GLRM) offer a global model explanation by generating easy-to-understand rules for the model outputs. A third method, Teaching Explanations for Decisions (TED), outputs explanations directly alongside predictions, making it a local direct explanation.
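To give a flavour of a direct (white box) explanation, the sketch below fits a BRCG rule model following the pattern in the AIX360 rule-based model tutorials; the FeatureBinarizer preprocessing step and the dictionary returned by explain() are assumptions that may vary across versions.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from aix360.algorithms.rbm import BRCGExplainer, BooleanRuleCG, FeatureBinarizer

data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target

# BRCG operates on binarized features: each column becomes threshold rules.
binarizer = FeatureBinarizer(negations=True)
X_bin = binarizer.fit_transform(X)

# The fitted rule set is itself the (global) explanation of the model.
brcg = BRCGExplainer(BooleanRuleCG())
brcg.fit(X_bin, y)
print(brcg.explain())   # assumed to return a dict containing the learned rules
```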