The consultation series will consist of a number of chapters outlining the ICO's interpretation of specific requirements of the UK GDPR and Part 2 of the DPA 2018 in relation to generative AI.
Purpose of the consultation
Through engagement with innovators and stakeholders, the ICO has recognised that further clarity on the application of data protection law to generative AI would be welcomed. In particular, the ICO notes that the onset of generative AI has raised new questions, including:
- What is the appropriate lawful basis for training generative AI models?
- How does the purpose limitation principle play out in the context of generative AI development and deployment?
- What are the expectations around complying with the accuracy principle?
- What are the expectations around complying with data subject rights?
Overall, the ICO is keen to quickly address these questions and the data protection risks of generative AI, whilst enabling organisations and the public to reap its benefits.
The first chapter
The ICO has published the first chapter of the consultation series, covering the lawful basis for training generative AI models on web-scraped data (that is, data gathered by using automated software to 'crawl' web pages and extract and store information for further use). In the chapter, the ICO summarises its analysis of this topic and sets out the policy position on which it is consulting.
As a starting point, the ICO states that developers need to ensure their processing is not in breach of any laws (e.g. data protection, intellectual property, contract law) and has a valid lawful basis under the UK GDPR. Notably, the ICO recognises that legitimate interests (Article 6(1)(f) UK GDPR) can be a lawful basis for using web-scraped personal data to train generative AI models, provided that the usual three-part legitimate interests test (purpose, necessity, balance) is met.
The ICO goes on to set out a detailed analysis of each part of the legitimate interests test. It concentrates in particular on the balancing test, identifying, for example, a number of risk mitigations for generative AI developers to consider. These include:
- Having ongoing technical and organisational measures in place.
- Implementing monitoring processes (e.g. API access for monitoring third parties).
- Having contractual controls over third parties.
The chapter concludes with the ICO setting out three things that developers using web-scraped data to train generative AI models need to be able to do:
- Evidence a valid and clear interest;
- Consider the balancing test carefully (particularly where they do not or cannot exercise meaningful control over how the model is used); and
- Demonstrate how the interest they have identified will be realised, and how the risks to individuals will be meaningfully mitigated, including individuals' access to their information rights.
The first chapter sets out the ICO's emerging thinking on generative AI. It should not be taken as confirmation that this data processing is legally compliant.
The initial consultation on the first chapter will close on 1 March 2024. Responses can be submitted here. Over the course of the next six months, the ICO intends to publish additional chapters for consultation, covering issues such as the accuracy of generative AI outputs.