
Aequitas: Bias and Fairness Audit Toolkit

The Aequitas toolkit can be used on the command line, programmatically via its Python API, or via a web interface. The web interface offers a four-step programme to audit a dataset for bias. The four steps are:

1. Upload (tabular) data
2. Determine protected groups and the reference group
3. Select fairness metrics and disparity intolerance
4. Inspect the bias report

This toolkit is useful for auditing bias and fairness according to a limited set of common fairness metrics, but does not offer algorithms for mitigating bias. Read more...
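For programmatic audits, a minimal sketch with the classic Python API could look roughly as follows. The toy dataframe and the choice of reference group are illustrative, and the exact call signatures should be checked against the current Aequitas documentation:

```python
import pandas as pd
from aequitas.group import Group
from aequitas.bias import Bias
from aequitas.fairness import Fairness

# Aequitas expects a dataframe with binary 'score' and 'label_value' columns
# plus one column per protected attribute (here: 'race'). Toy data for illustration.
df = pd.DataFrame({
    "score":       [1, 0, 1, 1, 0, 1],
    "label_value": [1, 0, 0, 1, 0, 1],
    "race":        ["white", "white", "black", "black", "black", "white"],
})

# Cross-tabulate counts and group-level metrics per protected group.
xtab, _ = Group().get_crosstabs(df)

# Compute disparities relative to a chosen reference group.
bdf = Bias().get_disparity_predefined_groups(
    xtab, original_df=df, ref_groups_dict={"race": "white"}, alpha=0.05
)

# Apply fairness thresholds and inspect the resulting audit table, which contains
# group metrics, disparities versus the reference group, and pass/fail flags.
fdf = Fairness().get_group_value_fairness(bdf)
print(fdf.head())
```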

Agile Ethics for AI

Butnaru and others associated with the HAI center at Stanford set up an Agile Ethics workflow in the form of a Trello board. From left to right, the workflow walks you through relevant ethical considerations at the various steps of a machine learning pipeline. The phases are:

- Scope
  - Consider ethical implications of the project
  - Consider skill mapping (what's the impact of AI on jobs?)
  - Facilitates up-skilling or a change of strategy in the use of human talent
- Data audit
  - Led by the Chief Data Officer
  - "Meet and plan" stage in Agile
  - Helpful: Data Ethics Canvas
- Train
  - Build stage in Agile
  - Consider (tools for) transparency and fairness
- Analyse
  - Benchmarks, including benchmarks related to e. Read more...

AI Ethics Guidelines Global Inventory

AlgorithmWatch maintains a searchable inventory of published frameworks that set out ethical AI values. They can be searched by sector/actor, type, region, and location. AlgorithmWatch noted some common patterns after publishing the first version of the index: “All include the similar principles on transparency, equality/non-discrimination, accountability and safety. Some add additional principles, such as the demand for AI be socially beneficial and protect human rights.” “Most frameworks are developed by coalitions, or institutions such as universities that then invite companies and individuals to sign up to these. Read more...

AI Explainability 360

The AI Explainability 360 (AIX360) toolkit is a Python library that offers a wide range of explanation types as well as some explainability metrics. AIX360 offers excellent guidance material, an interactive demo as well as developer tutorials. What’s particularly good about this material is that it stimulates reflection on which type of explanation is appropriate, not only from a technical point of view, but also with respect to the target explainer and explainee. Read more...

AI Fairness 360

The IBM AI Fairness 360 Toolkit contains several bias mitigation algorithms that are applicable to various stages of the machine learning pipeline. The toolkit implements different notions of fairness, both on the individual and the group level, and several fairness metrics for both classes of fairness. The toolkit provides additional guidance on choosing metrics and mitigation algorithms given a particular goal and application. The following should be noted when using the fairness toolkit (and other similar toolkits, for that matter): Read more...
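To give an impression of the API, a minimal pre-processing mitigation with Reweighing on a toy dataset might look roughly like this. The dataset wrapping and group definitions are the parts that typically need the most care, and the details here are a sketch to verify against the AIF360 documentation:

```python
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

# Toy data: 'sex' is the protected attribute, 'label' the binary outcome.
df = pd.DataFrame({
    "sex":   [0, 0, 1, 1, 0, 1, 1, 0],
    "x1":    [1.0, 2.0, 0.5, 1.5, 2.5, 0.7, 1.1, 1.9],
    "label": [1, 1, 0, 0, 1, 0, 1, 1],
})

dataset = BinaryLabelDataset(
    df=df, label_names=["label"], protected_attribute_names=["sex"]
)
privileged = [{"sex": 1}]
unprivileged = [{"sex": 0}]

# Group fairness metric on the original data.
metric = BinaryLabelDatasetMetric(
    dataset, privileged_groups=privileged, unprivileged_groups=unprivileged
)
print("Statistical parity difference before:", metric.statistical_parity_difference())

# Reweighing assigns instance weights that balance outcomes across groups.
rw = Reweighing(unprivileged_groups=unprivileged, privileged_groups=privileged)
dataset_transf = rw.fit_transform(dataset)

metric_transf = BinaryLabelDatasetMetric(
    dataset_transf, privileged_groups=privileged, unprivileged_groups=unprivileged
)
print("Statistical parity difference after:", metric_transf.statistical_parity_difference())
```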

Algorithmic Accountability Policy Toolkit

AI Now published the Algorithmic Accountability Policy Toolkit in 2018. It is specifically tailored towards advocates concerned with government use of algorithms. The toolkit provides a FAQ, an overview of various types of algorithms used by governments in specific application areas such as public health or criminal justice, and a comprehensive list of relevant literature. The fact that this is a toolkit and not a paper can be seen from the very practical guidance that is offered. Read more...

Alibi

Alibi is an open-source Python library that supports various interpretability techniques and a broad array of explanation types. The README already provides an overview of the supported methods and when they are applicable. The following table with supported methods is copied from the README (slightly abbreviated):

| Method | Models | Explanations | Classification | Regression | Tabular | Text | Images | Categorical features |
|---|---|---|---|---|---|---|---|---|
| ALE | BB | global | ✔ | ✔ | ✔ | | | |
| Anchors | BB | local | ✔ | | ✔ | ✔ | ✔ | ✔ |
| CEM | BB* TF/Keras | local | ✔ | | ✔ | | ✔ | |
| Counterfactuals | BB* TF/Keras | local | ✔ | | ✔ | | ✔ | |
| Prototype Counterfactuals | BB* TF/Keras | local | ✔ | | ✔ | | ✔ | ✔ |
| Integrated Gradients | TF/Keras | local | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Kernel SHAP | BB | local, global | ✔ | ✔ | ✔ | | | ✔ |
| Tree SHAP | WB | local, global | ✔ | ✔ | ✔ | | | ✔ |

The README also explains the keys: Read more...
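As a small impression of the API, an anchor explanation for a tabular classifier could look roughly like this. The dataset and model are illustrative, and the constructor arguments should be checked against the Alibi documentation:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from alibi.explainers import AnchorTabular

# Train a simple black-box classifier.
data = load_iris()
clf = RandomForestClassifier(random_state=0).fit(data.data, data.target)

# AnchorTabular only needs a prediction function and the feature names.
explainer = AnchorTabular(clf.predict, feature_names=list(data.feature_names))
explainer.fit(data.data)

# An "anchor" is a rule that locally fixes the prediction with high precision.
explanation = explainer.explain(data.data[0], threshold=0.95)
print("Anchor:   ", explanation.anchor)
print("Precision:", explanation.precision)
print("Coverage: ", explanation.coverage)
```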

Alibi Detect

Alibi Detect is an open-source Python library (a sister library to Alibi) focused on detecting outliers, adversarial examples, and concept drift. Finding adversarial examples is relevant for assessing the security of machine learning models. Machine learning models learn complex statistical patterns in datasets. If these statistical patterns “drift” (in unforeseen ways) after a model is deployed, this will decrease the model's performance over time. In systems where model predictions have an impact on people, this may be a threat to the fairness of the predictions. Read more...
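A minimal drift check with a Kolmogorov-Smirnov detector might look roughly like this; the data is synthetic and only meant to illustrate the reference-versus-production pattern:

```python
import numpy as np
from alibi_detect.cd import KSDrift

rng = np.random.default_rng(0)

# Reference data the model was trained and validated on.
X_ref = rng.normal(loc=0.0, scale=1.0, size=(1000, 5))

# New production data whose distribution has shifted.
X_new = rng.normal(loc=0.5, scale=1.0, size=(1000, 5))

# Feature-wise Kolmogorov-Smirnov tests with multiple-testing correction.
detector = KSDrift(X_ref, p_val=0.05)
result = detector.predict(X_new)

print("Drift detected:      ", bool(result["data"]["is_drift"]))
print("Per-feature p-values:", result["data"]["p_val"])
```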

ART: Adversarial Robustness 360 Toolbox

The Adversarial Robustness Toolbox (ART) is the first comprehensive toolbox that unifies many defensive techniques for four categories of adversarial attacks on machine learning models. These categories are model evasion, model poisoning, model extraction, and inference (e.g. inference of sensitive attributes in the training data, or determining whether an example was part of the training data). ART supports all popular machine learning frameworks, all data types, and all machine learning tasks. Read more...
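To give a flavour of the API, a minimal evasion attack on a scikit-learn model might look roughly like this. The victim model, the clipping range, and the attack strength eps are chosen for illustration only:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from art.estimators.classification import SklearnClassifier
from art.attacks.evasion import FastGradientMethod

# Train a simple victim model.
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Wrap it in an ART estimator so attacks can query it uniformly.
classifier = SklearnClassifier(model=model, clip_values=(X.min(), X.max()))

# Craft adversarial examples with the Fast Gradient Method (an evasion attack).
attack = FastGradientMethod(estimator=classifier, eps=0.5)
X_adv = attack.generate(x=X)

# Compare accuracy on clean versus adversarial inputs.
print("Clean accuracy:      ", model.score(X, y))
print("Adversarial accuracy:", model.score(X_adv, y))
```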

Contrastive Explanation Method (CEM)

Dhurandhar et al. propose a type of contrastive explanation based on what they call pertinent negatives. A contrastive explanation answers the question: “Why P, rather than Q?” CEM supports such an explanation by finding a minimal set of features that leads to prediction P (a pertinent positive, which resembles an anchor explanation), and additionally a minimal set of features that should be absent to maintain decision P instead of the decision for the closest class Q (a pertinent negative, which is somewhat similar to a counterfactual). Read more...

Data Ethics Canvas

The Data Ethics Canvas is a tool developed by the Open Data Institute to provide ethical guidance to organizations doing any type of project involving data. That includes data collection, data sharing, and the use of data in, for example, machine learning applications. The tool is accompanied by a white paper and a brief practical guide for its usage. Page 3 of the practical guide lists some recommendations that are also relevant when you do not use this tool. Read more...

Data Nutrition Label

In analogy with nutrition labels on food products, the authors of this paper propose a way to create a Data Nutrition Label. The goal of this method is to assess data quality and mitigate potential problems early on, before building models on the data. According to the authors, their approach is different from the datasheet in that the “proposed datasheet [i.e. by Gebru et al.] includes dataset provenance, key characteristics, relevant regulations and test results, but also significant yet more subjective information such as potential bias, strengths and weaknesses of the dataset, API, or model, and suggested uses. Read more...

Data Statements for NLP

A data statement, according to the authors, is … a characterization of a dataset that provides context to allow developers and users to better understand how experimental results might generalize, how software might be appropriately deployed, and what biases might be reflected in systems built on the software. (587) This paper specifically focuses on ethically responsive NLP technology. The authors argue that a data statement should be an integral part of work and writing on NLP. Read more...

Datasheets for Datasets

The method described in this paper aids in documenting datasets to help avoid unwanted consequences of data usage. Abstract: The machine learning community currently has no standardized process for documenting datasets, which can lead to severe consequences in high-stakes domains. To address this gap, we propose datasheets for datasets. In the electronics industry, every component, no matter how simple or complex, is accompanied with a datasheet that describes its operating characteristics, test results, recommended uses, and other information. Read more...

DEDA: De Ethische Data Assistent

This toolkit, developed by the Utrecht Data School, supports data analysts, project managers, and policy makers in identifying ethical values and issues in data projects and promoting accountability towards stakeholders. The toolkit is written in Dutch and includes a poster to support brainstorming sessions, an interactive survey, and an accompanying guide with further explanations. On the toolkit’s website you can also find several case studies that highlight ethical issues in data projects, as well as a version of the toolkit specifically for researchers. Read more...

DiCE: Diverse Counterfactual Explanations

From the README: DiCE implements counterfactual (CF) explanations that provide this information by showing feature-perturbed versions of the same person who would have received the loan, e.g., you would have received the loan if your income was higher by $10,000. In other words, it provides “what-if” explanations for model output and can be a useful complement to other explanation methods, both for end-users and model developers. A main innovation of DiCE is that it implements a method that makes producing counterfactual examples more model-agnostic: Read more...
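A minimal usage sketch on a toy loan dataset might look roughly like this; the column names, the sklearn backend, and the "random" method are illustrative choices, and argument names may differ slightly between DiCE versions:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
import dice_ml

# Toy training data: predict loan approval from income and age.
df = pd.DataFrame({
    "income": [20, 35, 50, 80, 30, 90, 45, 60],
    "age":    [25, 40, 35, 50, 23, 45, 33, 38],
    "loan":   [0, 0, 1, 1, 0, 1, 0, 1],
})
model = RandomForestClassifier(random_state=0).fit(df[["income", "age"]], df["loan"])

# Wrap the data and model for DiCE.
d = dice_ml.Data(dataframe=df, continuous_features=["income", "age"], outcome_name="loan")
m = dice_ml.Model(model=model, backend="sklearn")
exp = dice_ml.Dice(d, m, method="random")

# Generate diverse counterfactuals for one rejected applicant.
query = pd.DataFrame({"income": [30], "age": [23]})
cf = exp.generate_counterfactuals(query, total_CFs=3, desired_class="opposite")
cf.visualize_as_dataframe(show_only_changes=True)
```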

ELI5

ELI5 (“Explain Like I’m 5”) provides model-specific support for models from scikit-learn and lightning, as well as for decision tree ensembles built with the xgboost, LightGBM, and CatBoost libraries. ELI5 mainly provides convenient wrappers that couple the feature importance coefficients these libraries already provide with feature names, as well as convenient ways to visualize importances, e.g. by highlighting words in a text. For Keras image classifiers an implementation of the gradient-based Grad-CAM visualization is offered, but the TensorFlow V2 backend is not supported. Read more...
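A minimal sketch for a scikit-learn model is shown below; explain_weights returns an explanation object, which in a notebook can also be rendered as HTML with show_weights:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
import eli5

# Train a simple linear classifier.
data = load_iris()
clf = LogisticRegression(max_iter=1000).fit(data.data, data.target)

# Global explanation: couple the model's coefficients to readable feature names.
weights = eli5.explain_weights(clf, feature_names=list(data.feature_names))
print(eli5.format_as_text(weights))

# Local explanation for a single prediction.
prediction = eli5.explain_prediction(
    clf, data.data[0], feature_names=list(data.feature_names)
)
print(eli5.format_as_text(prediction))
```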

Equity Evaluation Corpus (EEC)

This handcrafted dataset can be used to evaluate bias in AI systems that perform NLP tasks on text data. Dataset description: Automatic machine learning systems can inadvertently accentuate and perpetuate inappropriate human biases. Past work on examining inappropriate biases has largely focused on just individual systems and resources. Further, there is a lack of benchmark datasets for examining inappropriate biases in system predictions. Here, we present the Equity Evaluation Corpus (EEC), which consists of 8,640 English sentences carefully chosen to tease out biases towards certain races and genders. Read more...

FactSheets: Increasing Trust in AI Services through Supplier's Declaration of Conformity

FactSheets, as proposed by Arnold et al. from IBM Research, are similar to the model card and datasheet, but are significantly more comprehensive because they focus on whole AI services. A main difference is that live AI services may comprise several trained models that interact with each other. The model card and datasheet instead concern a single model and the data it is trained on. However, Arnold et al. point out that an AI service with safe components is not necessarily safe overall and “so it is prudent to also consider transparency and accountability of services in addition to datasets and models” (p. Read more...

Fairlearn

The documentation of fairlearn is excellent and provides a good introduction to the topic of fairness in AI. It is emphasized that fairness algorithms are not plug-and-play technical solutions, but require serious thought about the context of the data and the problem at hand. Fairness is a fundamentally sociotechnical challenge and cannot be solved with technical tools alone. They may be helpful for certain tasks, such as assessing unfairness through various metrics, or mitigating observed unfairness when training a model. Read more...
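For example, disaggregating a standard metric over a sensitive feature with MetricFrame might look roughly like this (labels, predictions, and the sensitive feature are synthetic):

```python
from sklearn.metrics import accuracy_score
from fairlearn.metrics import MetricFrame, demographic_parity_difference

# Synthetic ground truth, model predictions, and a sensitive feature.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
sex    = ["f", "f", "m", "m", "f", "m", "f", "m"]

# Disaggregate accuracy over the sensitive feature.
mf = MetricFrame(
    metrics=accuracy_score, y_true=y_true, y_pred=y_pred, sensitive_features=sex
)
print("Overall accuracy:", mf.overall)
print("Accuracy per group:")
print(mf.by_group)

# A dedicated fairness metric: the difference in selection rates between groups.
print("Demographic parity difference:",
      demographic_parity_difference(y_true, y_pred, sensitive_features=sex))
```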

Fairness Decision Tree

This fairness tree is shown in the web version of the Aequitas bias and fairness audit toolkit. Its main purpose is to help decide on a suitable fairness metric, given the data set and the type of problem. Because it can also be useful in combination with other fairness toolkits, it merits its own entry.

H2O MLI Resources

This repository by H2O.ai contains useful resources and notebooks that showcase well-known machine learning interpretability techniques. The examples use the h2o Python package with their own estimators (e.g. their own fork of XGBoost), but all code is open source and the examples are still illustrative of the interpretability techniques. The case studies also deal with practical coding issues and preprocessing steps, e.g. that LIME can be unstable when there are strong correlations between input variables. Read more...

InterpretML

The InterpretML toolkit, developed at Microsoft, can be decomposed into two major components: (1) a set of interpretable “glassbox” models and (2) techniques for explaining black-box systems. Regarding the first component, InterpretML notably contains a new interpretable “glassbox” model, called the Explainable Boosting Machine, that combines Generalized Additive Models (GAMs) with machine learning techniques such as gradient boosted trees. Other than this new interpretable model, the main utility of InterpretML is to unify existing explainability techniques under a single API. Read more...
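A minimal sketch of training an Explainable Boosting Machine and inspecting it could look roughly like this; the show() call opens an interactive dashboard and assumes a notebook-like environment:

```python
from sklearn.datasets import load_breast_cancer
from interpret.glassbox import ExplainableBoostingClassifier
from interpret import show

# Train the interpretable "glassbox" model.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
ebm = ExplainableBoostingClassifier()
ebm.fit(X, y)

# Global explanation: per-feature shape functions and importances.
show(ebm.explain_global())

# Local explanations for the first few predictions.
show(ebm.explain_local(X.head(), y.head()))
```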

LIME: Local Interpretable Model-agnostic Explanations

The type of explanation LIME offers is a surrogate model that approximates a black-box prediction locally. The surrogate model is a sparse linear model, which means that the surrogate model is interpretable (in this case, its weights are meaningful). This simpler model can thus help to explain the black-box prediction, assuming the local approximation is actually sufficiently representative. The intuition behind this is provided in the README: Intuitively, an explanation is a local linear approximation of the model’s behaviour. Read more...
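A minimal tabular sketch is given below; the explainer needs training data to derive sampling statistics and a probability function for the black box (the dataset and model are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

# A black-box model we want to explain locally.
data = load_iris()
clf = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=data.feature_names,
    class_names=data.target_names,
    mode="classification",
)

# Fit a sparse linear surrogate around one instance and report its feature weights.
exp = explainer.explain_instance(data.data[0], clf.predict_proba, num_features=4)
print(exp.as_list())
```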

Model cards for Model Reporting

Model cards are an extension of the datasheet to machine learning models. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type [15]) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Read more...

SHAP: SHapley Additive exPlanations

The SHAP package is built on the concept of a Shapley value and can generate explanations model-agnostically. So it only requires input and output values, not model internals: SHAP (SHapley Additive exPlanations) is a game theoretic approach to explain the output of any machine learning model. It connects optimal credit allocation with local explanations using the classic Shapley values from game theory and their related extensions. (README) Additionally, this package also contains several model-specific implementations of Shapley values that are optimized for a particular machine learning model and sometimes even for a particular library. Read more...
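A minimal sketch using the unified entry point could look roughly like this; shap.Explainer dispatches to an appropriate algorithm for the given model, and the plot calls assume a plotting backend is available:

```python
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

# Train a regression model on a small built-in dataset.
X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(random_state=0).fit(X, y)

# The unified interface picks a suitable (possibly model-specific) algorithm.
explainer = shap.Explainer(model, X)
shap_values = explainer(X.iloc[:100])

# Local explanation for one prediction and a global overview across instances.
shap.plots.waterfall(shap_values[0])
shap.plots.beeswarm(shap_values)
```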

SMACTR: End-to-End Framework for Internal Algorithmic Auditing

A major downside of external auditing is that it can typically only be done after model deployment. This paper presents a methodology for internal algorithmic auditing as an integral part of the development process, end-to-end. Those who move fast and break things, beware: The audit process is necessarily boring, slow, meticulous and methodical—antithetical to the typical rapid development pace for AI technology. However, it is critical to slow down as algorithms continue to be deployed in increasingly high-stakes domains. Read more...

What-If Tool

The What-If Tool (WIT) takes a pretrained model and then allows you to visualize the effect of changing, e.g., classification thresholds or the data points themselves on performance, explainability, and fairness metrics. Many convenient functions for gaining insight into the data set are provided, such as binning on particular features, attribution values, or inference scores, computing partial dependence plots, and typical performance indicators such as a confusion matrix or ROC curve. Read more...

XAI Toolbox

This library is a small toolbox that offers some convenience functions for quickly visualizing imbalances in the data set, computing (permutation) feature importances, and metrics such as the ROC curve. A function to balance the data through basic up- or downsampling is offered, but other than this no fairness criteria are defined. Compared to other libraries the XAI Toolbox is very basic, and the roadmap (which has not been updated since 2019) does not include any major improvements. Read more...