What-If Tool


The What-If Tool (WIT) takes a pretrained model and lets you visualize how changes to, for example, the classification thresholds or the data points themselves affect performance, explainability, and fairness metrics.

It provides many convenient functions for gaining insight into the dataset, such as binning on particular features, attribution values, or inference scores; computing partial dependence plots; and showing typical performance indicators such as a confusion matrix or an ROC curve.
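
To make those indicators concrete, here is a minimal sketch that computes a confusion matrix and ROC curve with scikit-learn, the same quantities WIT displays interactively; `y_true` and `y_score` are synthetic stand-ins for your own labels and model scores.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)                          # ground-truth labels
y_score = np.clip(0.4 * y_true + 0.6 * rng.random(1000), 0, 1)  # toy model scores

threshold = 0.5
y_pred = (y_score >= threshold).astype(int)

print(confusion_matrix(y_true, y_pred))    # counts at this single threshold
fpr, tpr, _ = roc_curve(y_true, y_score)   # the full threshold sweep
print("AUC:", roc_auc_score(y_true, y_score))
```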

Because you can interactively edit the dataset or the prediction thresholds, you can also answer “what if” questions, which goes beyond typical interactive visualization. For example, what happens to the demographic parity fairness metric when I adjust the classification threshold? And how do the Shapley values of the features change?
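
As a hedged illustration of the first question, the sketch below computes the demographic parity gap (the difference in positive prediction rates between two groups) at several thresholds; `scores` and `groups` are made-up arrays, not anything produced by WIT.

```python
import numpy as np

rng = np.random.default_rng(1)
scores = rng.random(1000)               # model scores in [0, 1]
groups = rng.integers(0, 2, size=1000)  # a binary sensitive attribute

def positive_rate(scores, threshold):
    return float(np.mean(scores >= threshold))

for threshold in (0.3, 0.5, 0.7):
    rate_a = positive_rate(scores[groups == 0], threshold)
    rate_b = positive_rate(scores[groups == 1], threshold)
    # Demographic parity asks these two rates to be (approximately) equal.
    print(f"threshold={threshold:.1f}  gap={abs(rate_a - rate_b):.3f}")
```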

The What-If Tool works with Python-accessible models in a notebook, and, inside TensorBoard, with most models served by TensorFlow Serving.
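
In notebook mode, the usual entry point is `WitConfigBuilder` plus `WitWidget` from the `witwidget` package. A minimal sketch, with a toy dataset and a stand-in scoring function instead of a real model:

```python
import numpy as np
import tensorflow as tf
from witwidget.notebook.visualization import WitConfigBuilder, WitWidget

# Build a few toy tf.train.Examples with a single numeric feature "x".
def make_example(x, label):
    return tf.train.Example(features=tf.train.Features(feature={
        "x": tf.train.Feature(float_list=tf.train.FloatList(value=[x])),
        "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    }))

examples = [make_example(x, int(x > 0.5)) for x in np.linspace(0, 1, 50)]

def custom_predict(examples_to_infer):
    # Return one [p_negative, p_positive] pair per example; here a toy
    # score derived directly from the "x" feature stands in for a model.
    xs = [ex.features.feature["x"].float_list.value[0] for ex in examples_to_infer]
    return [[1 - x, x] for x in xs]

config_builder = (
    WitConfigBuilder(examples)
    .set_custom_predict_fn(custom_predict)
    .set_label_vocab(["negative", "positive"])
)
WitWidget(config_builder, height=800)  # renders the tool inline in the notebook
```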

Explainability

In terms of explainability, WIT hooks into existing feature attribution methods like SHAP and LIME when used in notebook mode.
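
For example, Shapley values can be computed with the SHAP library's `KernelExplainer` and then surfaced in WIT as attribution values; the sketch below only shows the SHAP side, with a toy scoring function in place of a real model.

```python
import numpy as np
import shap

rng = np.random.default_rng(2)
X_background = rng.random((100, 4))  # background sample for the explainer
X_explain = rng.random((5, 4))       # points to attribute

def predict_fn(X):
    # Stand-in for a real model's scoring function, e.g. predict_proba[:, 1].
    return X @ np.array([0.5, -0.2, 0.1, 0.3])

explainer = shap.KernelExplainer(predict_fn, X_background)
shap_values = explainer.shap_values(X_explain)  # one attribution per feature
print(shap_values.shape)                        # (5, 4)
```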

There is also a demo showing how to use Shapley values in WIT. As the name of the tool already suggests, WIT can also find the counterfactual for any data point: the most similar example that receives a different prediction, given some distance metric (potentially a custom one).
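
That counterfactual lookup is easy to emulate: among all points whose prediction differs from the query, take the closest one under the chosen distance. A minimal sketch with toy data and L1 distance:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.random((500, 4))                       # dataset features
preds = (X[:, 0] + X[:, 1] > 1.0).astype(int)  # toy binary predictions

def nearest_counterfactual(X, preds, idx,
                           distance=lambda A, b: np.abs(A - b).sum(axis=1)):
    # Candidates are all points with a different predicted label.
    mask = preds != preds[idx]
    dists = distance(X[mask], X[idx])
    return np.flatnonzero(mask)[np.argmin(dists)]

cf = nearest_counterfactual(X, preds, idx=0)
print(preds[0], preds[cf], np.abs(X[0] - X[cf]).sum())
```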

Fairness

Supported fairness metrics and threshold strategies:

  • Group unaware / individual fairness
  • Group threshold (adjust classification threshold per group to compensate for historical bias against that group)
  • Demographic parity
  • Equal opportunity
  • Equal accuracy

For binary classification models, you can apply these as fairness optimization strategies: WIT then adjusts the classification thresholds to satisfy the chosen criterion.
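
As a hedged sketch of what such a strategy does, the snippet below scans per-group thresholds so that both groups end up with (approximately) the same true positive rate, i.e. the equal opportunity criterion; all arrays are synthetic.

```python
import numpy as np

rng = np.random.default_rng(4)
scores = rng.random(2000)
labels = (scores + rng.normal(0, 0.3, 2000) > 0.5).astype(int)
groups = rng.integers(0, 2, size=2000)

def tpr(scores, labels, threshold):
    positives = labels == 1
    return float(np.mean(scores[positives] >= threshold))

target_tpr = 0.8
thresholds = {}
for g in (0, 1):
    sel = groups == g
    # Scan candidate thresholds; keep the one whose TPR is closest to target.
    candidates = np.linspace(0, 1, 101)
    tprs = np.array([tpr(scores[sel], labels[sel], t) for t in candidates])
    thresholds[g] = candidates[int(np.argmin(np.abs(tprs - target_tpr)))]

print(thresholds)  # one classification threshold per group
```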