Machine learning and the right to explanation in GDPR

This blog post is a small section of a much larger research report, "Debates, awareness, and projects about GDPR and data protection". The report complements the launch of the Digital Rights Finder tool delivered by Projects by IF and Open Rights Group. We highlight some of the most interesting and important debates around the General Data Protection Regulation (GDPR).

There is some concern about the practical feasibility of implementing the right to explanation in GDPR in the context of complex data processing such as big data, artificial intelligence and machine learning. (See this section of the report for more on debates about the existence of the right to explanation.)

Lilian Edwards and Michael Veale argue that a right to an explanation is not the remedy to harms caused to people by algorithmic decisions. They also argue that the narrowly defined right to explanation in GDPR, of “meaningful information about the logic of processing”, is not compatible with how modern machine learning technologies are being developed.

The problems to tackle here are discrimination and fairness. Machine learning systems are designed to discriminate, in the sense of drawing distinctions between cases, but some forms of discrimination are socially unacceptable and the systems need to be constrained. The general obligation of fairness in data protection provides the basis for requiring some level of insight into the functioning of algorithms, particularly in profiling.

One of Edwards and Veale’s proposals is to partially decouple transparency from accountability and redress, rather than treating it as a necessary step towards them. They argue that people trying to tackle data protection problems want action, not an explanation. The real value of an explanation is not to relieve or redress the emotional or economic damage suffered, but to help people understand why something happened and to ensure the same mistake is not repeated.

Within this more limited sense, problems remain in defining transparency in the context of algorithmic accountability. For example, providing the source code of algorithms may not be sufficient, and may create other problems in terms of privacy disclosures and the gaming of technical systems. They argue that an auditing approach, which looks at the external inputs and outputs of a decision process rather than its inner workings, could be more successful: “explaining black boxes without opening them”.
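
To make the auditing idea concrete, here is a minimal sketch of probing a black box purely through its inputs and outputs: flip one attribute at a time and watch whether decisions change. The decision function, attribute names and synthetic records are all hypothetical, and this is an illustration of the general approach rather than Edwards and Veale’s own method.

```python
# A minimal sketch of auditing a decision process from the outside,
# in the spirit of "explaining black boxes without opening them".
# The decide() function and its inputs are hypothetical; a real audit
# would query a deployed system rather than a local function.

import random

def decide(applicant):
    """Stand-in for an opaque decision system we can query but not inspect."""
    score = 0.4 * applicant["income"] + 0.6 * applicant["credit_history"]
    # Hidden dependence on a protected attribute, which the audit should surface.
    if applicant["postcode_group"] == "B":
        score -= 0.2
    return "approve" if score > 0.5 else "reject"

def audit_attribute(decide, applicants, attribute, values):
    """Probe the black box: vary one input attribute and count changed outcomes."""
    flipped = 0
    for a in applicants:
        outcomes = set()
        for v in values:
            probe = dict(a, **{attribute: v})   # counterfactual query
            outcomes.add(decide(probe))
        if len(outcomes) > 1:
            flipped += 1
    return flipped / len(applicants)

random.seed(0)
applicants = [
    {"income": random.random(), "credit_history": random.random(),
     "postcode_group": random.choice(["A", "B"])}
    for _ in range(1000)
]

rate = audit_attribute(decide, applicants, "postcode_group", ["A", "B"])
print(f"Decisions that change when only postcode_group changes: {rate:.1%}")
```

In a real audit the queries would be sent to the deployed system, and the attributes tested would be chosen to reflect the specific discrimination concerns at stake; the point is simply that nothing in this process requires access to the model’s internals.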

The authors see the right to explanation as providing some grounds for explanations about specific decisions. They present two types of algorithmic explanation that could be provided: model-centric explanations (MCEs) and subject-centric explanations (SCEs), which broadly correspond to explanations of systems and of individual decisions respectively.

SCEs are seen as the best way to provide some remedy, although with severe constraints where the data is simply too complex. Their proposal is to move away from explaining the full model and instead focus on particular issues through pedagogical explanations of a particular query, “which could be real or could be fictitious or exploratory”. These explanations will necessarily involve trade-offs with accuracy in order to reduce complexity.
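
As a rough illustration of what a subject-centric, pedagogical explanation might look like in practice, the sketch below perturbs a single query, labels the perturbations with an opaque scoring function, and fits a simple local surrogate, in the spirit of model-agnostic techniques such as LIME. The scoring function, feature names and weights are invented for illustration; the point is the trade-off between simplicity and fidelity.

```python
# A minimal sketch of a subject-centric, "pedagogical" explanation of one query:
# sample around the query (which could be real, fictitious or exploratory),
# score the samples with the black box, and fit a simple linear surrogate
# that only claims to be faithful near that query.

import numpy as np

def black_box(x):
    """Opaque scoring model we can query but not inspect (illustrative only)."""
    income, debt, tenure = x
    return 1 / (1 + np.exp(-(2.0 * income - 1.5 * debt + 0.3 * tenure - 0.5)))

rng = np.random.default_rng(0)
query = np.array([0.6, 0.4, 0.7])   # the data subject's record (hypothetical)

# Sample perturbations in a small neighbourhood of the query.
samples = query + rng.normal(scale=0.1, size=(500, 3))
scores = np.array([black_box(s) for s in samples])

# Fit a local linear surrogate: simple enough to explain, only locally faithful.
X = np.hstack([samples, np.ones((len(samples), 1))])
coefs, *_ = np.linalg.lstsq(X, scores, rcond=None)

for name, w in zip(["income", "debt", "tenure"], coefs[:3]):
    print(f"{name}: local weight {w:+.2f}")
print(f"black-box score at query: {black_box(query):.2f}")
```

The surrogate’s weights are easy to communicate but deliberately approximate: the simpler the explanation, the less accurately it reflects the full model, which is exactly the trade-off Edwards and Veale describe.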

Their main concern seems to be to avoid creating a “transparency fallacy” where, similarly to the “consent fallacy”, people are given an illusion of control that does not exist, instead of being offered practical remedies to stop harmful data practices.

There is growing interest in the explanation of technical decision-making systems in the field of human-computer interaction design. Practitioners in this field criticise efforts to open the black box in terms of mathematically interpretable models as removed from cognitive science and from the actual needs of people. An alternative approach is to let users explore a system’s behaviour freely through interactive explanations, which is quite similar to the proposals by Edwards and Veale.

A complementary approach has been put forward by Andrew Selbst and Solon Barocas, who argue that the growing calls for explainability of automated decision-making systems rely on an intuitive approach that will not work with machine learning. Machine learning is both inscrutable and non-intuitive. Inscrutability is the black box problem, the inability to see the inner workings of a model; non-intuitiveness means being unable to make sense of the rules the model follows, even if we were able to open the box. Accountability requires not only knowledge of the process, but also an assessment of whether it is justified, or fair.

Selbst and Barocas argue that lawyers and scholars asking for explanations will be disappointed because intuition cannot deal with the truly novel insights produced through machine learning that associate data in patterns that completely escape human logic and imagination.

Their alternative proposal is to focus accountability on the processes around ML models, not the models themselves. Policies and documentation of intent and design choices should be made available, some by default, such as impact assessments, and others in the context of a complaint or regulatory action. This approach chimes with the main thrust of GDPR, which puts accountability at the fore.

In summary, the right to an explanation as defined in GDPR may be harder than expected to implement. This does not invalidate the basic premise that individuals have a right to know what is being done with their data, but – particularly with novel machine learning techniques – it means that we need to look beyond simple calls for transparency.