In the last few years, many companies, institutions, and governmental organizations have published documents stating principles for the responsible development of Artificial Intelligence. These documents usually refer to the following common set of values to uphold:


Unlike the tools for explainability or fairness, most available tools for accountability are not technical solutions but methods describing best practices. For the sake of simplicity, tools for organizational transparency, as opposed to algorithmic interpretability, are also categorized under accountability.


Explainability is instrumental for maintaining other values, such as fairness, and for building trust in AI systems. There is little consensus on what “explainability” precisely means. The related concepts of “transparency” and “interpretability” are sometimes used as synonyms and sometimes as distinct terms. For example, the explainability of machine learning models can be seen as one aspect of the overall need to be transparent in the use of AI (so transparency is the broader concept). But one may also use the word “transparency” to indicate “white box” models that are interpretable in themselves.


When trying to operationalize fairness, it is important to realize that fairness in machine learning is a complex socio-technical issue. At a minimum, this means that fairness tools should never be seen as plug-and-play solutions. This is already evident from the fact that, as most of the listed tools emphasize, a choice has to be made about which type of fairness to strive for. One general distinction, for example, is between group fairness (comparable outcomes across groups) and individual fairness (similar individuals are treated similarly).
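
To make the notion of group fairness concrete, the sketch below computes one common group-fairness measure, the demographic parity difference: the gap between the highest and lowest rate of positive predictions across groups. It is an illustration only, not the method of any tool listed here. The function names, the toy predictions, and the group labels `"a"` and `"b"` are all hypothetical, and demographic parity is just one of several group-fairness criteria that can conflict with each other.

```python
from collections import defaultdict


def selection_rates(predictions, groups):
    """Return the fraction of positive (== 1) predictions for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_difference(predictions, groups):
    """Gap between the highest and lowest group selection rate (0 = parity)."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: binary model decisions and a sensitive attribute.
predictions = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = selection_rates(predictions, groups)
gap = demographic_parity_difference(predictions, groups)
print(rates, gap)
```

A gap close to zero indicates similar selection rates across groups under this particular definition. Whether that is the right criterion for a given application remains a judgment call that the metric itself cannot make.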


Most AI applications nowadays rely on large amounts of data, so regulations like the European General Data Protection Regulation (GDPR) have to be taken into account. Tools for privacy in AI can be, for example, practical guidelines for implementing privacy by design in AI applications, or technical methods that prevent the extraction of privacy-sensitive information from trained models.


Security focuses on tools for mitigating AI-specific risks, such as tools that promote robustness against adversarial attacks. For example, it is possible to create adversarial examples by finding perturbations that maximize the prediction error and then applying these perturbations to input images in such a way that 1) a human sees little or no difference but 2) the neural network misclassifies the input. If the classifications of the neural network have real-world impact, this type of attack needs to be countered before the model can be used safely.
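
The idea of a perturbation that increases the prediction error can be sketched on a much smaller scale. The example below substitutes a hand-written logistic regression model for the neural network and a three-value input for the image, and takes a single step in the style of the fast gradient sign method: each input value is shifted by a small amount `epsilon` in the direction that increases the loss. The weights, input, label, and `epsilon` are made-up values chosen for illustration, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Probability of class 1 under a logistic regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def sign(value):
    return (value > 0) - (value < 0)


def sign_gradient_perturb(weights, bias, x, y, epsilon):
    """Shift each feature by +/- epsilon in the direction that raises the loss.

    For logistic regression with cross-entropy loss, the gradient of the
    loss with respect to the input is (p - y) * weights.
    """
    p = predict_proba(weights, bias, x)
    gradient = [(p - y) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, gradient)]


# Hypothetical model parameters and input (assumptions for illustration).
weights = [2.0, -1.0, 0.5]
bias = 0.0
x = [0.2, 0.1, 0.2]
y = 1  # true label

x_adv = sign_gradient_perturb(weights, bias, x, y, epsilon=0.2)

print("original prediction:", predict_proba(weights, bias, x) > 0.5)
print("perturbed prediction:", predict_proba(weights, bias, x_adv) > 0.5)
```

In this toy setting no feature changes by more than `epsilon`, yet the predicted class flips. Robustness tools aim to detect or withstand this kind of bounded perturbation, although real attacks on image models operate in far higher dimensions.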