DeepLIFT


A brief explanation of the gradient-based interpretability method called DeepLIFT is given by Shrikumar et al. in the abstract of the linked paper:

DeepLIFT (Deep Learning Important FeaTures), a method for decomposing the output prediction of a neural network on a specific input by backpropagating the contributions of all neurons in the network to every feature of the input. DeepLIFT compares the activation of each neuron to its ‘reference activation’ and assigns contribution scores according to the difference. By optionally giving separate consideration to positive and negative contributions, DeepLIFT can also reveal dependencies which are missed by other approaches. Scores can be computed efficiently in a single backward pass.
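To make the reference idea concrete, here is a toy numeric illustration (not taken from the paper) of DeepLIFT's "summation-to-delta" property for a single linear unit: the per-feature contributions w_i * (x_i - x_i_ref) sum exactly to the change in output relative to the reference.

```python
import numpy as np

# Single linear unit y = w . x, with an all-zeros reference input.
w = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 2.0, 3.0])
x_ref = np.zeros_like(x)

# DeepLIFT contribution of each input: weight times difference-from-reference.
contribs = w * (x - x_ref)

# Summation-to-delta: the contributions add up to y - y_ref.
delta_y = w @ x - w @ x_ref
assert np.isclose(contribs.sum(), delta_y)
```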

The linked repository implements the functionality explained in this paper. Other gradient-based interpretation methods are also implemented, including:

  • gradient * input (equivalent to Layerwise Relevance Propagation in networks using ReLU; see Shrikumar et al.; a short sketch follows this list)
  • guided backprop (see Springenberg et al.)
  • integrated gradients (see the two papers from Sundararajan et al.)
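As an illustration of the simplest of these, gradient * input can be computed in a few lines of TensorFlow. This is a sketch with a toy stand-in model, not code from the linked repository:

```python
import tensorflow as tf

# Toy model standing in for the network being explained.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(3),
])

x = tf.random.uniform((5, 4))
with tf.GradientTape() as tape:
    tape.watch(x)
    scores = tf.reduce_max(model(x), axis=-1)  # score of the top class per sample
grads = tape.gradient(scores, x)
attributions = grads * x  # gradient * input, one attribution per feature
```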

DeepLIFT is a model-specific method: it is designed for deep neural networks, and this implementation targets Keras and TensorFlow models in particular.

The first step in applying DeepLIFT is to construct a new layer for each layer in the original neural network and specify its inputs, thereby creating a parallel network that returns importance scores. For Keras (2.0) models, there is autoconversion functionality, as illustrated in the quickstart of the README.
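A condensed sketch of that autoconversion quickstart, adapted from the repository's README (the file path, input arrays, and the genomics-default nonlinearity mode are placeholders; check the README for the exact options):

```python
import numpy as np
import deeplift
from deeplift.conversion import kerasapi_conversion as kc

# Autoconvert a Keras model saved to HDF5 into a DeepLIFT model.
deeplift_model = kc.convert_model_from_saved_files(
    "model.h5",  # placeholder path
    nonlinear_mxts_mode=deeplift.layers.NonlinearMxtsMode.DeepLIFT_GenomicsDefault)

# Compile a function returning contributions of layer 0 (the inputs)
# with respect to the final (target) layer.
contribs_func = deeplift_model.get_target_contribs_func(
    find_scores_layer_idx=0,
    target_layer_idx=-1)

# X is a placeholder input batch; an all-zeros reference is used here.
X = np.load("inputs.npy")
scores = np.array(contribs_func(task_idx=0,
                                input_data_list=[X],
                                input_references_list=[np.zeros_like(X)],
                                batch_size=32,
                                progress_update=1000))
```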

Note that because DeepLIFT builds these layers as a separate network, existing gradient operators are not overridden. Applying DeepLIFT should therefore not affect the predictions of the original network.
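This can be verified with a sanity check along the lines of the one in the README, continuing from the snippet above (keras_model is the original Keras model; the helper names should be double-checked against the repository):

```python
import numpy as np
import deeplift.util
from deeplift.util import compile_func

# Compile a prediction function for the converted DeepLIFT model.
deeplift_prediction_func = compile_func(
    [deeplift_model.get_layers()[0].get_activation_vars()],
    deeplift_model.get_layers()[-1].get_activation_vars())

original_preds = keras_model.predict(X, batch_size=200)
converted_preds = deeplift.util.run_function_in_batches(
    input_data_list=[X],
    func=deeplift_prediction_func,
    batch_size=200,
    progress_update=None)

# The converted network should reproduce the original predictions.
assert np.max(np.abs(np.array(converted_preds) - original_preds)) < 1e-5
```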

A 15-minute introduction to DeepLIFT can be found here.

A more extensive tutorial can be found here.

Other implementations

The DeepLIFT package does not support all model types, because not every layer type has a DeepLIFT conversion rule implemented. In the FAQ, the author explains that packages which instead override gradient operators directly support a broader range of architectures. Variants of DeepLIFT are also implemented in other packages:

  • The SHAP package includes a DeepExplainer that extends DeepLIFT with the concept of Shapley values (see the sketch after this list). In addition to TensorFlow and Keras models, this implementation has preliminary PyTorch support at the time of writing.
  • DeepExplain also connects DeepLIFT with Shapley values, and is in fact the basis of SHAP's DeepExplainer.
  • Captum has a DeepLIFT implementation for PyTorch.
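As an illustration of the SHAP variant mentioned above, here is a minimal DeepExplainer sketch (the toy model and data are placeholders; exact behavior depends on the installed shap and TensorFlow versions):

```python
import numpy as np
import shap
import tensorflow as tf

# Toy stand-in for the network to be explained.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(2, activation="softmax"),
])

X = np.random.rand(200, 4).astype("float32")

# The background sample plays the role of DeepLIFT's reference.
explainer = shap.DeepExplainer(model, X[:100])

# One attribution array per output class, each shaped like the explained batch.
shap_values = explainer.shap_values(X[100:110])
```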

The FAQ contains an in-depth discussion of differences in functionality between these DeepLIFT implementations.