UN WG on business and human rights' report on AI procurement -- key findings and recommendations

June 23, 2025

Last week, the UN working group on business and human rights officially presented its thematic report on the procurement and deployment of artificial intelligence systems by States and businesses (A/HRC/59/53, 14 May 2025 — note there is also an executive summary infographic).

The report focuses on actions to be taken to facilitate alignment of AI procurement and deployment with the UN’s Guiding Principles on Business and Human Rights and addresses organisations procuring rather than developing AI. The report approaches procurement in broad terms by encompassing both public and private procurement, and by taking into account the position and responsibilities of States, business and stakeholders. The report contains a series of findings and recommendations.

Findings on the regulatory landscape

One of the report’s key findings is that ‘States are increasingly shifting from voluntary guidelines to binding legislation on AI and human rights, such as through the European Union AI Act and Council of Europe AI Convention. However, there are significant gaps in terms of rights-respecting procurement and deployment of AI systems, including a lack of a human rights-based approach, no consensus on key definitions, insufficient integration of the perspective of the Global South, the provision of broad exceptions and limited involvement of civil society. Further, enforcement gaps and loopholes are weakening human rights protections in existing legislation on AI and human rights.’ This requires a closer look.

The report highlights that ‘Globally, there are over 1,000 AI-related standards and over 50 AI governance initiatives based on ethics, responsibility or safety principles’. Although unsurprising, I find this interesting, as it speaks to the fragmentation and duplication of regulatory efforts that create a complex landscape. Given the repeated recognition that AI challenges transcend borders and the calls for international collaboration (eg here and here), there is clearly a gap still to be addressed.

In that regard, the report stresses that ‘The lack of consensus on key concepts such as “AI” and “ethics” is leading to inconsistencies in the regulation of AI systems and is particularly problematic given the transnational nature of AI’, and highlights UNESCO’s Recommendation on the Ethics of Artificial Intelligence as the sort of document that could be used as a blueprint to promote policy coherence across jurisdictions.

Although the report identifies a recent shift from voluntary guidelines to legally binding rules for AI systems, such as the EU AI Act or the Council of Europe Framework Convention on AI, it also highlights that ‘there is still uncertainty regarding how to address certain loopholes in the EU AI Act’ and that the Framework Convention creates similar challenges in relation to the significant exemptions it contains, and the way it gives signatory States discretion to set its scope of application. Although the report does not take an explicit position on this, I think it takes only a small step to conclude that legislative action needs to be far more decisive if the challenge of upholding human rights and fundamental values in AI deployment is to be met.

Another key finding of the report is that ‘States are largely procuring and deploying AI systems without adequate safeguards, such as conducting human rights impact assessments as part of human rights due diligence (HRDD), leading to human rights impacts across the public sector, including in relation to healthcare, social protection, financial services, taxation, and others.’ This results from the limited emerging approaches to AI procurement.

Indeed, focusing on the regulation of AI public procurement, the report highlights a series of approaches to developing legally binding general requirements for AI procurement and deployment, such as in Korea, Chile, California, Lithuania or Rwanda, as well as efforts in other jurisdictions to tackle specific aspects of AI deployment. However, the report also stresses that those regimes tend to have exemptions in relation to the most controversial and potentially harmful areas for AI deployment (such as defence and intelligence), and that the practical implementation of those regimes still hinges on the limited development of commonly understood standards and guardrails and, crucially, on public sector digital skills.

On the latter, the report puts it clearly: ‘Currently, there is an imbalance in knowledge and expertise between States and the private sector around what AI is, how it works and what outcomes it produces. There is also little space and time for procurers to engage critically with the claims made by AI vendors or suppliers, including as they relate to potential and actual human rights impacts.’ Again, this is unsurprising, but this renewed call for investment in capacity-building should make it abundantly clear that with insufficient state capacity there can be no effective regulation of AI procurement or deployment across the public sector (because, ultimately, as we have recently argued, procurement is the infrastructure on which this regulatory approach rests).

The report then covers in detail business responsibility in relation to AI procurement and deployment, and addresses issues of relevance even in contexts of light-touch self-regulation, such as due diligence, contextual impact assessments, or stakeholder involvement. Similarly, the report finds that ‘Businesses are largely procuring and deploying AI systems without conducting HRDD, risking adverse human rights impacts such as biased decision making, exploitative worker surveillance, or manipulation of consumer behavior.’

The final part of the report covers access to remedies and, in another of its key findings, stresses that ‘Courts are increasingly recognizing the human rights-related concerns of AI procurement and deployment, highlighting the urgent need for transparency and public disclosure for public and private sector procurement and deployment of AI systems, and the fact that existing remedy mechanisms lack resources and enforcement power, leaving communities without effective recourse for AI-related human rights abuses. Stronger legal frameworks, public reporting obligations, and independent oversight bodies are needed to ensure transparency, accountability and redress.’

The report thus makes the primary point that much increased transparency on AI deployment is required, so that existing remedies can be effectively used by those affected and concerned. It also highlights how existing remedies may be insufficient and, in particular, new ‘mechanisms will also need to be set up, creating integrated approaches that recognize the intersectional nature of AI-related harms and their disproportionate impact on at-risk groups. Effective redress for AI-related harms requires both strong institutional frameworks and deep understanding of how technology intersects with existing patterns of human rights violation and abuses, both of which are currently missing’ (this largely chimes with my view that we need a dedicated authority to oversee public sector AI use, and that preventative approaches need to be explored given the risks of mass harms arising from AI deployment).

Recommendations

In order to address the unsatisfactory state of affairs documented in the report, the working group formulates a long list of recommendations to States, businesses and other actors. In the executive summary, the following are highlighted as key recommendations to States.

Establish robust legal, regulatory and policy frameworks on AI: Develop and implement AI regulations following a human rights-based approach that are aligned with international human rights law, ensuring transparency and accountability in AI procurement and deployment and legal certainty for all.
Mandate HRDD: Require public disclosure, HRDD, and safeguards for AI systems procured and deployed by private and public sector actors, including AI systems used in high-risk sectors like law enforcement, migration management, and social protection.
Prohibit Harmful AI Systems: Ban AI technologies incompatible with human rights, like mass surveillance, remote real-time facial recognition, social scoring and predictive policing.
Ensure Access to Remedy: Strengthen judicial and non-judicial mechanisms to address AI-related human rights abuses, shifting the burden of proof to businesses and authorities, and ensuring adequate resources.
Promote AI Governance Collaboration: Build global cooperation to establish common AI standards, fostering interoperability and ensuring the representation of Global South perspectives.

However, it is worth bringing up other recommendations included in the much longer list in the report, as some of them are directly relevant to the specific task of AI procurement. In that regard, the report also recommends that, with regard to AI procurement and deployment, States:

Provide specific guidance to public sector procurement actors on a human-rights based approach to the procurement of AI systems; including specific limitations, guidance and safeguards for AI systems procured and deployed in high-risk sectors and areas such as justice, law enforcement, migration, border control, social protection and financial services, and in conflict-affected areas;
Provide capacity-building for all stakeholders to understand the technical and human rights dimensions of AI, and ensure accessible, explainable and understandable information about the procurement and deployment of AI systems, including by mandating public registration of AI systems deployed by both public and private entities;
Ensure independent oversight of AI systems and require the provision of clear documentation on AI system capabilities, limitations and data provenance;
Promote meaningful stakeholder consultation and participation in decision-making processes around AI procurement and deployment.

These recommendations will resonate with the main requirements (in principle) applicable under eg the EU AI Act, or proposals for best practice AI procurement.

Final comment

The report helpfully highlights the current state of affairs in the regulation of AI procurement and deployment across the public and private sectors. The issues it raises are well-known and many of them involve complex governance challenges, including the need for levels of public investment commensurate with the socio-technical challenges brought by the digitalisation of the public sector and key private market services.

The report also highlights that, in the absence of adequate regulatory interventions, States (and businesses) are creating a significant stack of AI deployments that are simply not assured for relevant risks and, consequently, are creating an installed base of potentially problematic AI embeddings across the public sector and business. If anything, I think this should be a call for a renewed emphasis on slowing down AI adoption to allow for the development of the required governance instruments.

Procurement in the EU's AI Continent Action Plan

April 9, 2025

The EU has published its ‘AI Continent Action Plan’ (COM(2025)165).

The Plan aims to enhance the EU’s AI capabilities by promoting initiatives around five key areas. One of those key areas concerns the promotion of AI in strategic sectors and, in particular, in the public sector and healthcare.

The Plan includes some high level initiatives that are, however, not new.

The Plan refreshes the expectation for the public sector to provide a source of funding and experimentation for AI development: ‘EU public procurement, accounting for over 15% of our GDP, could create an enormous market for innovative products and services.’ This has been a long-standing aspiration (eg Fostering a European approach to Artificial Intelligence, COM(2021)205).
In that regard, the Plan reiterates the goal of the Competitiveness Compass to promote ‘European preference in public procurement for critical sectors and technologies in the context of the forthcoming review of the EU rules’, and clearly places AI amongst them. We will have to wait for details, but the compatibility of an EU preference with international procurement law escapes me.
The Plan also refers to the upcoming ‘Apply AI Strategy’ which should ‘address adoption by the public sector, where AI in areas like healthcare can bring transformative benefits to wellbeing’.
The Plan also includes a reference to:
a call for funding of up to four pilot projects aimed at accelerating the deployment of European generative AI solutions in public administrations; and
the fact that the ‘GovTech Incubator initiative will, over the period 2025-2029, support 21 GovTech actors from 16 countries to co-pilot and develop, as a first step, AI solutions for public procurement, evidence processing and accessibility assistants.’

Overall, while it is interesting to see procurement being highlighted as part of the Plan, it seems that the Plan is not at the right scale to promote the sort of system-level change required for extensive adoption of AI in the public sector (at Member State level).

What is more, without a clear strategy on how to address the issues of digital skills within the public sector, and without specific practical tools or guidance on how to procure AI (and the model EU clauses are definitely not an adequate tool, see here), it is hard to see how there can be much movement outside pilot projects. Perhaps the ‘Apply AI Strategy’ will provide some developments on those fronts.

Learning digital procurement's hard lessons before jumping in at the AI deep end

January 20, 2025

Hanna Barakat & Cambridge Diversity Fund / Better Images of AI / Turning Threads of Cognition / CC-BY 4.0.

In quick succession after the UK Government published its AI Opportunities Action Plan, the National Audit Office (NAO) released its report ‘Government’s approach to technology suppliers: addressing the challenges’ (the NAO digital procurement report). Reading both documents in relation to each other paints a picture of the difficulties and pitfalls in the acceleration of public sector AI adoption desired by the UK Government.

More generally, I think this reflects the tensions faced in most jurisdictions yet to find ways to adapt their procurement practices and programmes to the digital environment and to ‘data first’ approaches, and how important but expensive interventions in ensuring continued investment in procurement skills and systems can have large knock-on effects on the broader functioning of the public sector for better and for worse (an issue I am researching with Nathan Davies).

In short, the AI Opportunities Action Plan seeks to ‘push hard on cross-economy AI adoption’ and places AI procurement at the forefront of that effort. As I highlighted in my hot take on the plan, one of its main weaknesses is the lack of detail on the measures to be put in place to address the large digital skills gap in the public sector, even though the extent to which that gap is reduced will determine how far AI procurement can go in contract design, contract and performance management, and other crucial tasks to deliver the plan (see full comments here).

This builds on my earlier research, where I have stressed how a risk-based approach to the design and implementation of AI procurement requires advanced digital skills, and how shortcomings in digital skills compound key risks, such as data governance, technological and operational dependency, and system integrity risks (see here ch 7, and here).

My research, and that of others such as the Ada Lovelace Institute (see here and here), has also stressed how current guidance and best practices are insufficient to support the procurement of AI, and how this compounds the issues arising from shortfalls in digital skills. It is also clear that these issues are bound to affect resource-constrained areas of the public administration in particular, and that local authority procurement is in a uniquely challenging situation (which I am researching with colleagues at Careful Trouble).

All this research raises significant questions on the deliverability of plans to accelerate AI adoption in the public sector in ways that align with the public interest and do not generate unacceptable risks of mass harms (see here) and, in my view, advocates for a different approach that focuses on putting regulatory stopgap solutions in place while investment in the required fundamentals (data, skills, processes) is addressed, and provides a source of independent oversight of this high stakes process of public sector digital transformation. There are also environmental and other reasons to favour a ‘frugal AI’ approach (see eg here).

The main issue with such cautious (or I would say, realistic) approaches is that they do not convey a politically popular message, and that they are exposed to criticism for being excessively pessimistic or over-prudent, and/or for slowing down the adoption of AI-based solutions that (with the right technosolutionist lenses on) will unlock massive changes in resource-starved public service delivery.

In my view, the NAO digital procurement report makes for grim reading, but it is a strong endorsement of the need for such alternative, slower approaches.

As summarised in its press release, based on its recent investigations into different aspects of government digital transformation programmes, the NAO extracted the following lessons for the UK government to consider:

At central government level

There are not enough people with digital commercial skills in government.
Government procurement guidance does not address all the complexities of digital commercial issues.
Government struggles with the breadth of issues that affects its ability to engage effectively with suppliers.

At department/ministerial level

Departments do not make full use of their digital expertise when procuring for technology-enabled business change.
Digital contracts are awarded with insufficient preparation.
Approaches to contract design can negatively impact successful digital delivery.

This leads the NAO to formulate related recommendations:

‘The NAO is recommending that the centre decides who should take ownership for addressing the problems identified in our report. It should produce a sourcing strategy to include improvements in how it deals with ‘big tech’ and strategic suppliers. It should also create a digital skills plan to plug recruitment shortfalls and to better equip and train decision-makers responsible for digital commercial activities.

For departments, the NAO recommends departments strengthen their ‘intelligent client function’. They need to identify and develop key requirements before tenders and bid processes commence, and improve how policymakers and technical specialists work together with procurement specialists. Departments should also improve their capability to collect and use data to inform a pipeline of supply and demand. This would help the centre of government build a more strategic approach to suppliers.’

In my view, the NAO’s findings and recommendations stress the crucial importance of addressing the public sector digital skills gap (both at central and departmental/contracting authority level), so that shortcomings in procurement guidance, and in subsequent procurement planning and design and contract management, can be remedied. They also stress the urgency of creating workable sets of guidance that provide much more detail and support than the existing generic documents.

What is worth further highlighting is that, unless and until these issues are addressed, digital procurement cannot be successful and, what is more troublesome, in the current context, an acceleration of AI procurement is a very bad idea because it will aggravate the problems identified in the report and potentially create situations that will be impossible or exceedingly costly to fix later on.

In my view, the NAO report should be a wake-up call to the UK Government (and to other governments operating in comparable contexts) to do things more slowly and to find ways to fix technological debt, skills shortcomings, and lock-in and other problems associated with high concentration in digital markets. It is difficult to fix them now, but it will become more difficult with every passing year. Given the nascent state of AI procurement, it seems to me that there is still a window of opportunity to change tack. I am not optimistic that this will happen, though.

A blog for a book -- Extended for the rest of the summer

July 9, 2024

It was great to receive a submission for the ‘a book for a blog’ competition, which you can read here. However, it seems that the time was not right, as I would have hoped for a few more entries. And I still have two copies of the book to give away. I am thus extending the deadline until 13 September 2024, in case anyone feels inspired over the summer break. Same conditions as in the original call apply.

I take this chance to wish everyone a good summer, and a good break over it. See you in September!

*Original call for blogs*

I am increasingly interested in the ways public procurement can contribute to slowing down public sector AI adoption. I think slowing down AI adoption is very important because we need time to develop a better understanding of the technology and its systemic implications, and to put much-needed effective safeguards in place.

These are themes I explore in my recent book Digital Technologies and Public Procurement. Gatekeeping and Experimentation in Digital Public Governance (OUP, 2024). I am very interested in what others think of this — including why some may think it is a bad idea. So I am launching ‘a blog for a book’ competition.

If you have ideas on how to use procurement to slow down public sector AI adoption, and/or why it is a good or a bad idea, please send me a blog of up to 2,000 words by 30 June 2024 at a.sanchez-graells@bristol.ac.uk. The top 3 blogs will receive a copy of the book. All interesting blogs will be published (with permission, of course). I look forward to reading your ideas!

Creating (positive) friction in AI procurement

June 26, 2024

I had the opportunity to participate in the Inaugural AI Commercial Lifecycle and Procurement Summit 2024 hosted by Curshaw. This was a very interesting ‘unconference’ where participants offered to lead sessions on topics they wanted to talk about. I led a session on ‘Creating friction in AI procurement’.

This was clearly a counterintuitive way of thinking about AI and procurement, given that the ‘big promise’ of AI is that it will reduce friction (eg through automation, and/or delegation of ‘non-value-added’ tasks). Why would I want to create friction in this context?

The first clarification I was thus asked for was whether this was about ‘good friction’ (as opposed to old bad ‘red tape’ kind of friction), which of course it was (?!), and the second, what do I mean by friction.

My recent research on AI procurement (eg here and here for the book-long treatment) has led me to conclude that we need to slow down the process of public sector AI adoption and to create mechanisms that bring back to the table the ‘non-AI’ option and several ‘stop project’ or ‘deal breaker’ trumps to push back against the tidal wave of unavoidability that seems to dominate all discussions on public sector digitalisation. My preferred solution is to do so through a system of permissioning or licensing administered by an independent authority, but I am aware and willing to concede that there is no political will for it. I thus started thinking about second-best approaches to slowing public sector AI procurement. This is how I got to the idea of friction.

By creating friction, I mean the need for a structured decision-making process that allows for collective deliberation within and around the adopting institution, and which is supported by rigorous impact assessments that tease out second and third order implications from AI adoption, as well as thoroughly interrogating first order issues around data quality and governance, technological governance and organisational capability, in particular around risk management and mitigation. This is complementary to, but hopefully goes beyond, emerging frameworks to determine organisational ‘risk appetite’ for AI procurement, such as that developed by the AI Procurement Lab and the Centre for Inclusive Change.

The conversations on ‘good friction’ moved in different directions, but there are some takeaways and ideas that stuck with me (or that I managed to jot down in my notes while chatting to others), such as (in no particular order of importance or potential):

the potential for ‘AI minimisation’ or ‘non-AI equivalence’ to test the need for (specific) AI solutions—if you can sufficiently approximate, or replicate, the same functional outcome without AI, or with a simpler type of AI, why not do it that way?;
the need for a structured catalogue of solutions (and components of solutions) that are already available (sometimes in open access, where there is lots of duplication) to inform such considerations;
the importance of asking whether procuring AI is driven by considerations such as availability of funding (is this funded if done with AI but not funded, or hard to fund at the same level, if done in other ways?), which can clearly skew decision-making—the importance of considering the effects of ‘digital industrial policy’ on decision-making;
the power (and relevance) of the deceptively simple question ‘is there an interdisciplinary team to be dedicated to this, and exclusively to this’?;
the importance of knowledge and understanding of the tech and its implications from the beginning, and of expertise in the translation of technical and governance requirements into procurement requirements, to avoid ‘games of chance’ whereby the use of ‘trendy terms’ (such as ‘agile’ or ‘responsible’) may or may not lead to the award of the contract to the best-placed and best-fitting (tech) provider;
the possibility to adapt civic monitoring or social witnessing mechanisms used in other contexts, such as large infrastructure projects, to be embedded in contract performance and auditing phases;
the importance of understanding displacement effects and whether deploying a solution (AI or automation, or similar) to deal with a bottleneck will simply displace the issue to another (new) bottleneck somewhere along the process;
the importance of understanding the broader organisational changes required to capture the hoped for (productivity) gains arising from the tech deployment;
the importance of carefully considering and resourcing the much needed engagement of the ‘intelligent person’ that needs to check the design and outputs of the AI, including frontline workers and those at the receiving end of the relevant decisions or processes and the affected communities—the importance of creating meaningful and effective deliberative engagement mechanisms;
relatedly, the need to ensure organisational engagement and alignment at every level and every step of the AI (pre)procurement process (on which I would recommend reading this recent piece by Kawakami and colleagues);
the need to assess the impacts of changes in scale, complexity, and error exposure;
the need to create adequate circuit-breakers throughout the process.

Certainly lots to reflect on and try to embed in future research and outreach efforts. Thanks to all those who participated in the conversation, and to those interested in joining it. A structured way to do so is through this LinkedIn group.

A blog for a book -- How to use procurement to slow down public sector AI adoption? And should we (not)?

May 24, 2024

Meaning, AI, and procurement -- some thoughts

May 8, 2024

©Ausrine Kuze, *Distorted Reality*, 2021.

James McKinney and Volodymyr Tarnay of the Open Contracting Partnership have published ‘A gentle introduction to applying AI in procurement’. It is a very accessible and helpful primer on some of the most salient issues to be considered when exploring the possibility of using AI to extract insights from procurement big data.

The OCP introduction to AI in procurement provides helpful pointers in relation to task identification, method, input, and model selection. I would add that an initial exploration of the possibility to deploy AI also (and perhaps first and foremost) requires careful consideration of the level of precision and the type (and size) of errors that can be tolerated in the specific task, and ways to test and measure it.
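As a rough illustration of what testing and measuring error tolerance could look like in practice, the sketch below compares a tool's output against a small hand-checked sample and computes precision and recall. Everything in it is an assumption made for illustration: the sample data, the `precision_recall` helper and the `MIN_PRECISION` and `MIN_RECALL` thresholds are hypothetical, and the right thresholds would depend entirely on the task and on who bears the cost of each type of error.

```python
def precision_recall(predicted: list[bool], actual: list[bool]) -> tuple[float, float]:
    """Return (precision, recall) for boolean predictions against a checked sample."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(p and a for p, a in pairs)       # flagged, and reviewer agrees
    false_pos = sum(p and not a for p, a in pairs)  # flagged, but reviewer disagrees
    false_neg = sum(a and not p for p, a in pairs)  # missed by the tool
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Hypothetical sample: did the tool flag a tender notice as containing a green
# criterion (predicted), and did a human reviewer agree (actual)?
predicted = [True, True, True, False, False, True, False, False]
actual = [True, False, True, True, False, True, False, False]

# Hypothetical tolerances, to be set before looking at the results.
MIN_PRECISION = 0.9
MIN_RECALL = 0.7

precision, recall = precision_recall(predicted, actual)
acceptable = precision >= MIN_PRECISION and recall >= MIN_RECALL
print(f"precision={precision:.2f} recall={recall:.2f} acceptable={acceptable}")
```

In this made-up sample the tool wrongly flags one notice and misses another, so it clears the recall tolerance but not the precision one. Fixing tolerances in advance matters because it forces the conversation about which errors are acceptable to happen before a tool is adopted, not after.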

One of the crucial and perhaps more difficult to understand issues covered by the introduction is how AI seeks to capture ‘meaning’ in order to extract insights from big data. This is also a controversial issue that keeps coming up in procurement data analysis contexts, and one that triggered some heated debate at the Public Procurement Data Superpowers Conference last week—where, in my view, companies selling procurement insight services were peddling hyped claims (see session on ‘Transparency in public procurement - Data readability’).

In this post, I venture some thoughts on meaning, AI, and public procurement big data. As always, I am very interested in feedback and opportunities for further discussion.

Meaning

Of course, the concept of meaning is complex and open to philosophical, linguistic, and other interpretations. Here I take a relatively pedestrian and pragmatic approach and, following the Cambridge dictionary, consider two ways in which ‘meaning’ is understood in plain English: ‘the meaning of something is what it expresses or represents’, and meaning as ‘importance or value’.

To put it simply, I will argue that AI cannot capture meaning proper. It can carry out complex analysis of ‘content in context’, but we should not equate that with meaning. This will be important later on.

AI, meaning, embeddings, and ‘content in context’

The OCP introduction helpfully addresses this issue in relation to an example of ‘sentence similarity’, where the researchers are looking for phrases that are alike in tender notices and predefined green criteria, and therefore want to use AI to compare sentences and assign them a similarity score. Intuitively, ‘meaning’ would be important to the comparison.

The OCP introduction explains that:

Computers don’t understand human language. They need to operate on numbers. We can represent text and other information as numerical values with vector embeddings. A vector is a list of numbers that, in the context of AI, helps us express the meaning of information and its relationship to other information.
Text can be converted into vectors using a model. [A sentence transformer model] converts a sentence into a vector of 384 numbers. For example, the sentence “don’t panic and always carry a towel” becomes the numbers 0.425…, 0.385…, 0.072…, and so on.
These numbers represent the meaning of the sentence.
Let’s compare this sentence to another: “keep calm and never forget your towel” which has the vector (0.434…, 0.264…, 0.123…, …).
One way to determine their similarity score is to use cosine similarity to calculate the distance between the vectors of the two sentences. Put simply, the closer the vectors are, the more alike the sentences are. The result of this calculation will always be a number from -1 (the sentences have opposite meanings) to 1 (same meaning). You could also calculate this using other trigonometric measures such as Euclidean distance.
For our two sentences above, performing this mathematical operation returns a similarity score of 0.869.
Now let’s consider the sentence “do you like cheese?” which has the vector (-0.167…, -0.557…, 0.066…, …). It returns a similarity score of 0.199. Hooray! The computer is correct!
But, this method is not fool-proof. Let’s try another: “do panic and never bring a towel” (0.589…, 0.255…, 0.0884…, …). The similarity score is 0.857. The score is high, because the words are similar… but the logic is opposite!
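To make the calculation in the quoted passage more tangible, here is a minimal sketch of cosine similarity in Python. It is not the OCP's code. Because the vectors quoted above are truncated, it does not attempt to reproduce them or their scores. Instead, it assumes a deliberately crude stand-in for an embedding model (simple word counts over a shared vocabulary), and the function names `cosine_similarity` and `word_count_vectors` are hypothetical names chosen for illustration only.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (from -1 to 1)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def word_count_vectors(sentence_a, sentence_b):
    """ASSUMPTION: toy stand-in for an embedding model.

    Counts words over the two sentences' shared vocabulary. A real sentence
    transformer produces dense vectors (eg 384 numbers), not word counts.
    """
    counts_a = Counter(sentence_a.lower().split())
    counts_b = Counter(sentence_b.lower().split())
    vocabulary = sorted(set(counts_a) | set(counts_b))
    vector_a = [counts_a[word] for word in vocabulary]
    vector_b = [counts_b[word] for word in vocabulary]
    return vector_a, vector_b


original = "don't panic and always carry a towel"
opposite = "do panic and never bring a towel"
unrelated = "do you like cheese?"

score_opposite = cosine_similarity(*word_count_vectors(original, opposite))
score_unrelated = cosine_similarity(*word_count_vectors(original, unrelated))

print(f"original vs opposite logic: {score_opposite:.3f}")
print(f"original vs unrelated:      {score_unrelated:.3f}")
```

Even with this crude representation, the sentence with the opposite logic scores well above the unrelated one, simply because it shares four of its seven words with the original. The numbers will differ from those a sentence transformer would return, but the pattern is the same as in the quoted example: the score tracks overlapping content, not the logic of what is being said.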

I think there are two important observations in relation to the use of meaning here (highlighted above).

First, meaning can hardly be captured where sentences with opposite logic are considered very similar. This is because the method described above (vector embedding) does not capture meaning. It captures content (words) in context (around other words).

Second, it is not possible to fully express in numbers what text expresses or represents, or its importance or value. What the vectors capture is the representation or expression of such meaning, the representation of its value and importance through the use of those specific words in the particular order in which they are expressed. The string of numbers is thus a second-degree representation of the meaning intended by the words; it is a numerical representation of the word representation, not a numerical representation of the meaning.

Unavoidably, there is plenty of scope for loss, alteration or even inversion of meaning when it goes through multiple imperfect processes of representation. This means that the more open textured the expression in words and the less contextualised in its presentation, the more difficult it is to achieve good results.

It is important to bear in mind that the current techniques based on this or similar methods, such as those based on large language models, clearly fail on crucial aspects such as their factuality—which ultimately requires checking whether something with a given meaning is true or false.

This is a burgeoning area of technical research but it seems that even the most accurate models tend to hover around 70% accuracy, save in highly contextualised, unambiguous settings (see eg D Quelle and A Bovet, ‘The perils and promises of fact-checking with large language models’ (2024) 7 Front. Artif. Intell., Sec. Natural Language Processing). While this is an impressive feature of these tools, it can hardly be acceptable to extrapolate that these tools can be deployed for tasks that require precision and factuality.

Procurement big data and ‘content and context’

In some senses, the application of AI to extract insights from procurement big data is well suited to the fact that, by and large, existing procurement data is very precisely contextualised and increasingly concerns structured content—that is, that most of the procurement data that is (increasingly) available is captured in structured notices and tends to have a narrowly defined and highly contextual purpose.

From that perspective, there is potential to look for implementations of advanced comparisons of ‘content in context’. But this will most likely have a hard boundary where ‘meaning’ needs to be interpreted or analysed, as AI cannot perform that task. At most, it can help gather the information, but it cannot analyse it because it cannot ‘understand’ it.

Policy implications

In my view, the above shows that the possibility of using AI to extract insights from procurement big data needs to be approached with caution. For tasks where a ‘broad brush’ approach will do, these can be helpful tools. They can help mitigate the informational deficit procurement policy and practice tend to encounter. As was put at the conference last week, these tools can help get a sense of broad trends or directions, and can thus inform policy and decision-making only in that regard and to that extent. Conversely, AI cannot be used in contexts where precision is important and where errors would affect important rights or interests.

This is important, for example, in relation to the fascination that AI ‘business insights’ seems to be triggering amongst public buyers. One of the issues that kept coming up concerns why contracting authorities cannot benefit from the same advances that are touted as being offered to (private) tenderers. The case at hand was that of identifying ‘business opportunities’.

A number of companies are using AI to support searches for contract notices to highlight potentially interesting tenders to their clients. They offer services such as ‘tender summaries’, whereby the AI creates a one-line summary on the basis of a contract notice or a tender description, and this summary can be automatically translated (eg into English). They also offer search services based on ‘capturing meaning’ from a company’s website and matching it to potentially interesting tender opportunities.

All these services, however, are at bottom a sophisticated comparison of content in context, not of meaning. And these are deployed to go from more to less information (summaries), which can reduce problems with factuality and precision except in extreme cases, and in a setting where getting it wrong has only a marginal cost (ie the company will set aside the non-interesting tender and move on). This is also an area where expectations can be managed and where results well below 100% accuracy can be interesting and have value.

The same does not apply from the perspective of the public buyer. For example, a summary of a tender is unlikely to have much value as, in all likelihood, the summary will simply confirm that the tender matches the advertised object of the contract (which has no value, unlike a summary suggesting a tender matches the business activities of an economic operator). Moreover, factuality is extremely important and only 100% accuracy will do in a context where decision-making is subject to good administration guarantees.

Therefore, we need to be very careful about how we think about using AI to extract insights from procurement (big) data and, as the OCP introduction highlights, one of the most important things is to clearly define the task for which AI would be used. In my view, the suitable tasks are much more limited than what one could dream up if we let our collective imagination run high on hype.

Responsibly Buying Artificial Intelligence: A ‘Regulatory Hallucination’ -- draft paper for comment

November 24, 2023

Following yesterday’s Current Legal Problems Lecture, I have uploaded the current full draft of the paper on SSRN. I would be very grateful for any comments in the next few weeks, as I plan to do a final revision and to submit it for peer-review in early 2024. Thanks in advance to those who take the time. As always, you can reach me at a.sanchez-graells@bristol.ac.uk.

The abstract of the paper is as follows:

Here, I focus on the UK’s approach to regulating public sector procurement and use of artificial intelligence (AI) in the context of the broader ‘pro-innovation’ approach to AI regulation. Borrowing from the description of AI ‘hallucinations’ as plausible but incorrect answers given with high confidence by AI systems, I argue that UK policymaking is trapped in a ‘regulatory hallucination.’ Despite having embraced the plausible ‘pro-innovation’ regulatory approach with high confidence, that is the incorrect answer to the challenge of regulating AI procurement and use by the public sector. I conceptualise the current strategy as one of ‘regulation by contract’ and identify two of its underpinning presumptions that make its deployment in the digital context particularly challenging. I show how neither the presumption of superiority of the public buyer over the public contractor, nor the related presumption that the public buyer is the rule-maker and the public contractor is the rule-taker, necessarily hold in this context. Public buyer superiority is undermined by the two-sided gatekeeping required to simultaneously discipline the behaviour of the public sector AI user and the tech provider. The public buyer’s rule-making role is also undermined by its reliance on industry-led standards, as well as by the tech provider’s upper hand in setting contractual benchmarks and controlling the ensuing self-assessments. In view of the ineffectiveness of regulating public sector AI use by contract, I then sketch an alternative strategy to boost the effectiveness of the goals of AI regulation and the protection of individual rights and collective interests through the creation of an independent authority.

Sanchez-Graells, Albert, ‘Responsibly Buying Artificial Intelligence: A “Regulatory Hallucination”’ (November 24, 2023). Current Legal Problems 2023-24, Available at SSRN: https://ssrn.com/abstract=4643273.

Responsibly Buying Artificial Intelligence: A Regulatory Hallucination?

November 21, 2023

I look forward to delivering the lecture ‘Responsibly Buying Artificial Intelligence: A Regulatory Hallucination?’ as part of the Current Legal Problems Lecture Series 2023-24 organised by UCL Laws. The lecture will be this Thursday 23 November 2023 at 6pm GMT and you can still register to participate (either online or in person). These are the slides I will be using, in case you want to take a sneak peek. I will post a draft version of the paper after the lecture. Comments welcome!

Some thoughts on the US' Executive Order on the Safe, Secure, and Trustworthy Development and Use of AI

November 7, 2023

On 30 October 2023, President Biden adopted the Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence (the ‘AI Executive Order’, see also its Factsheet). The use of AI by the US Federal Government is an important focus of the AI Executive Order. It will be subject to a new governance regime detailed in the Draft Policy on the use of AI in the Federal Government (the ‘Draft AI in Government Policy’, see also its Factsheet), which is open for comment until 5 December 2023. Here, I reflect on these documents from the perspective of AI procurement as a major plank of this governance reform.

Procurement in the AI Executive Order

Section 2 of the AI Executive Order formulates eight guiding principles and priorities in advancing and governing the development and use of AI. Section 2(g) refers to AI risk management, and states that

It is important to manage the risks from the Federal Government’s own use of AI and increase its internal capacity to regulate, govern, and support responsible use of AI to deliver better results for Americans. These efforts start with people, our Nation’s greatest asset. My Administration will take steps to attract, retain, and develop public service-oriented AI professionals, including from underserved communities, across disciplines — including technology, policy, managerial, procurement, regulatory, ethical, governance, and legal fields — and ease AI professionals’ path into the Federal Government to help harness and govern AI. The Federal Government will work to ensure that all members of its workforce receive adequate training to understand the benefits, risks, and limitations of AI for their job functions, and to modernize Federal Government information technology infrastructure, remove bureaucratic obstacles, and ensure that safe and rights-respecting AI is adopted, deployed, and used.

Section 10 then establishes specific measures to advance Federal Government use of AI. Section 10.1(b) details a set of governance reforms to be implemented in view of the Director of the Office of Management and Budget (OMB)’s guidance to strengthen the effective and appropriate use of AI, advance AI innovation, and manage risks from AI in the Federal Government. Section 10.1(b) includes the following (emphases added):

The Director of OMB’s guidance shall specify, to the extent appropriate and consistent with applicable law:

(i) the requirement to designate at each agency within 60 days of the issuance of the guidance a Chief Artificial Intelligence Officer who shall hold primary responsibility in their agency, in coordination with other responsible officials, for coordinating their agency’s use of AI, promoting AI innovation in their agency, managing risks from their agency’s use of AI …;

(ii) the Chief Artificial Intelligence Officers’ roles, responsibilities, seniority, position, and reporting structures;

(iii) for [covered] agencies […], the creation of internal Artificial Intelligence Governance Boards, or other appropriate mechanisms, at each agency within 60 days of the issuance of the guidance to coordinate and govern AI issues through relevant senior leaders from across the agency;

(iv) required minimum risk-management practices for Government uses of AI that impact people’s rights or safety, including, where appropriate, the following practices derived from OSTP’s Blueprint for an AI Bill of Rights and the NIST AI Risk Management Framework: conducting public consultation; assessing data quality; assessing and mitigating disparate impacts and algorithmic discrimination; providing notice of the use of AI; continuously monitoring and evaluating deployed AI; and granting human consideration and remedies for adverse decisions made using AI;

(v) specific Federal Government uses of AI that are presumed by default to impact rights or safety;

(vi) recommendations to agencies to reduce barriers to the responsible use of AI, including barriers related to information technology infrastructure, data, workforce, budgetary restrictions, and cybersecurity processes;

(vii) requirements that [covered] agencies […] develop AI strategies and pursue high-impact AI use cases;

(viii) in consultation with the Secretary of Commerce, the Secretary of Homeland Security, and the heads of other appropriate agencies as determined by the Director of OMB, recommendations to agencies regarding:

(A) external testing for AI, including AI red-teaming for generative AI, to be developed in coordination with the Cybersecurity and Infrastructure Security Agency;

(B) testing and safeguards against discriminatory, misleading, inflammatory, unsafe, or deceptive outputs, as well as against producing child sexual abuse material and against producing non-consensual intimate imagery of real individuals (including intimate digital depictions of the body or body parts of an identifiable individual), for generative AI;

(D) application of the mandatory minimum risk-management practices defined under subsection 10.1(b)(iv) of this section to procured AI;

(E) independent evaluation of vendors’ claims concerning both the effectiveness and risk mitigation of their AI offerings;

(F) documentation and oversight of procured AI;

(G) maximizing the value to agencies when relying on contractors to use and enrich Federal Government data for the purposes of AI development and operation;

(H) provision of incentives for the continuous improvement of procured AI; and

(I) training on AI in accordance with the principles set out in this order and in other references related to AI listed herein; and

(ix) requirements for public reporting on compliance with this guidance.

Section 10.1(b) of the AI Executive Order establishes two sets or types of requirements.

First, there are internal governance requirements and these revolve around the appointment of Chief Artificial Intelligence Officers (CAIOs), AI Governance Boards, their roles, and support structures. This set of requirements seeks to strengthen the ability of Federal Agencies to understand AI and to provide effective safeguards in its governmental use. The crucial set of substantive protections from this internal perspective derives from the required minimum risk-management practices for Government uses of AI, which is directly placed under the responsibility of the relevant CAIO.

Second, there are external (or relational) governance requirements that revolve around the agency’s ability to control and challenge tech providers. This involves the transfer (back to back) of minimum risk-management practices to AI contractors, but also includes commercial considerations. The tone of the Executive Order indicates that this set of requirements is meant to neutralise risks of commercial capture and commercial determination by imposing oversight and external verification. From an AI procurement governance perspective, the requirements in Section 10.1(b)(viii) are particularly relevant. As some of those requirements will need further development with a view to their operationalisation, Section 10.1(d)(ii) of the AI Executive Order requires the Director of OMB to develop an initial means to ensure that agency contracts for the acquisition of AI systems and services align with its Section 10.1(b) guidance.

Procurement in the Draft AI in Government Policy

The guidance required by Section 10.1(b) of the AI Executive Order has been formulated in the Draft AI in Government Policy, which offers more detail on the relevant governance mechanisms and the requirements for AI procurement. Section 5 on managing risks from the use of AI is particularly relevant from an AI procurement perspective. While Section 5(d) refers explicitly to managing risks in AI procurement, given that the primary substantive obligations will arise from the need to comply with the required minimum risk-management practices for Government uses of AI, this specific guidance needs to be read in the broader context of AI risk-management within Section 5 of the Draft AI in Government Policy.

Scope

The Draft AI in Government Policy relies on a tiered approach to AI risk by imposing specific obligations in relation to safety-impacting and rights-impacting AI only. This is an important element of the policy because these two categories are defined (in Section 6) and in principle will cover pre-established lists of AI use, based on a set of presumptions (Section 5(b)(i) and (ii)). However, CAIOs will be able to waive the application of minimum requirements for specific AI uses where, ‘based upon a system-specific risk assessment, [it is shown] that fulfilling the requirement would increase risks to safety or rights overall or would create an unacceptable impediment to critical agency operations’ (Section 5(c)(iii)). Therefore, these are not closed lists and the specific scope of coverage of the policy will vary with such determinations. There are also some exclusions from minimum requirements where the AI is used for narrow purposes (Section 5(c)(i)), notably the ‘Evaluation of a potential vendor, commercial capability, or freely available AI capability that is not otherwise used in agency operations, solely for the purpose of making a procurement or acquisition decision’; AI evaluation in the context of regulatory enforcement, law enforcement or national security action; or research and development.

This scope of the policy may be under-inclusive, or generate risks of under-inclusiveness at the boundary, in two respects. First, the way AI is defined for the purposes of the Draft AI in Government Policy excludes ‘robotic process automation or other systems whose behavior is defined only by human-defined rules or that learn solely by repeating an observed practice exactly as it was conducted’ (Section 6). This could be under-inclusive to the extent that the minimum risk-management practices for Government uses of AI create requirements that are not otherwise applicable to Government use of (non-AI) algorithms. There is a commonality of risks (eg discrimination, data governance risks) that would be better managed if there were a joined-up approach. Moreover, developing minimum practices in relation to those means of automation would serve to develop institutional capability that could then support the adoption of AI as defined in the policy. Second, the variability in coverage stemming from consideration of ‘unacceptable impediments to critical agency operations’ opens the door to potentially problematic waivers. While these are subject to disclosure and notification to OMB, it is not entirely clear on what grounds OMB could challenge those waivers. This is thus an area where the guidance may require further development.

Extensions and waivers

In relation to covered safety-impacting or rights-impacting AI (as above), Section 5(a)(i) establishes the important principle that US Federal Government agencies have until 1 August 2024 to implement the minimum practices in Section 5(c), ‘or else stop using any AI that is not compliant with the minimum practices’. This type of sunset clause concerning the currently implicit authorisation for the use of AI is a potentially powerful mechanism. However, the Draft also establishes that such obligation to discontinue non-compliant AI use must be ‘consistent with the details and caveats in that section [5(c)]’, which includes the possibility, until 1 August 2024, for agencies to

request from OMB an extension of limited and defined duration for a particular use of AI that cannot feasibly meet the minimum requirements in this section by that date. The request must be accompanied by a detailed justification for why the agency cannot achieve compliance for the use case in question and what practices the agency has in place to mitigate the risks from noncompliance, as well as a plan for how the agency will come to implement the full set of required minimum practices from this section.

Again, the guidance does not detail on what grounds OMB would grant those extensions or how long they would be for. There is a clear interaction between the extension and waiver mechanisms. For example, an agency that saw its request for an extension declined could try to waive that particular AI use, or agencies could simply try to waive AI uses rather than applying for extensions, as the requirements for a waiver seem to be rather different from (and potentially less demanding than) those applicable to an extension. In that regard, it seems that waiver determinations are ‘all or nothing’, whereas the system could be more flexible (and protective) if waiver decisions not only needed to explain why meeting the minimum requirements would generate the heightened overall risks or pose such ‘unacceptable impediments to critical agency operations’, but also had to meet the lower burden of mitigation currently expected in extension applications, concerning detailed justification for what practices the agency has in place to mitigate the risks from noncompliance where they can be partly mitigated. In other words, it would be preferable to have a more continuous spectrum of mitigation measures in the context of waivers as well.

General minimum practices

In relation to both safety- and rights-impacting AI uses, the Draft AI in Government Policy would require agencies to engage in risk management both before and while using AI.

Preventative measures include:

completing an AI Impact Assessment documenting the intended purpose of the AI and its expected benefit, the potential risks of using AI, and an analysis of the quality and appropriateness of the relevant data;
testing the AI for performance in a real-world context, that is, testing under conditions that ‘mirror as closely as possible the conditions in which the AI will be deployed’; and
independently evaluating the AI, with the particularly important requirement that ‘The independent reviewing authority must not have been directly involved in the system’s development.’ In my view, it would also be important for the independent reviewing authority not to be involved in the future use of the AI, as its (future) operational interest could also be a source of bias in the testing process and the analysis of its results.

In-use measures include:

conducting ongoing monitoring and establishing thresholds for periodic human review, with a focus on monitoring ‘degradation to the AI’s functionality and to detect changes in the AI’s impact on rights or safety’ (‘human review, including renewed testing for performance of the AI in a real-world context, must be conducted at least annually, and after significant modifications to the AI or to the conditions or context in which the AI is used’);
mitigating emerging risks to rights and safety—crucially, ‘Where the AI’s risks to rights or safety exceed an acceptable level and where mitigation is not practicable, agencies must stop using the affected AI as soon as is practicable’. In that regard, the draft indicates that ‘Agencies are responsible for determining how to safely decommission AI that was already in use at the time of this memorandum’s release without significant disruptions to essential government functions’, but it would seem that this is also a process that would benefit from close oversight by OMB as it would otherwise jeopardise the effectiveness of the extension and waiver mechanisms discussed above—in which case additional detail in the guidance would be required;
ensuring adequate human training and assessment;
providing appropriate human consideration as part of decisions that pose a high risk to rights or safety; and
providing public notice and plain-language documentation through the AI use case inventory, although this is subject to a large number of caveats (notice must be ‘consistent with applicable law and governmentwide guidance, including those concerning protection of privacy and of sensitive law enforcement, national security, and other protected information’) and more detailed guidance on how to assess these issues would be welcome (if it exists, a cross-reference in the draft policy would be helpful).

Additional minimum practices for rights-impacting AI

In relation to rights-impacting AI only, the Draft AI in Government Policy would require agencies to take additional measures.

Preventative measures include:

taking steps to ensure that the AI will advance equity, dignity, and fairness, including proactively identifying and removing factors contributing to algorithmic discrimination or bias; assessing and mitigating disparate impacts; and using representative data; and
consulting and incorporating feedback from affected groups.

In-use measures include:

conducting ongoing monitoring and mitigation for AI-enabled discrimination;
notifying negatively affected individuals—this is an area where the draft guidance is rather woolly, as it also includes a set of complex caveats, as individual notice that ‘AI meaningfully influences the outcome of decisions specifically concerning them, such as the denial of benefits’ must only be given ‘[w]here practicable and consistent with applicable law and governmentwide guidance’. Moreover, the draft only indicates that ‘Agencies are also strongly encouraged to provide explanations for such decisions and actions’, but not required to. In my view, this tackles two of the most important implications for individuals in Government use of AI: the possibility to understand why decisions are made (reason giving duties) and the burden of challenging automated decisions, which is increased if there is a lack of transparency on the automation. Therefore, on this point, the guidance seems too tepid—especially bearing in mind that this requirement only applies to ‘AI whose output serves as a basis for decision or action that has a legal, material, or similarly significant effect on an individual’s’ civil rights, civil liberties, or privacy; equal opportunities; or access to critical resources or services. In these cases, it seems clear that notice and explainability requirements need to go further.
maintaining human consideration and remedy processes—including ‘potential remedy to the use of the AI by a fallback and escalation system in the event that an impacted individual would like to appeal or contest the AI’s negative impacts on them. In developing appropriate remedies, agencies should follow OMB guidance on calculating administrative burden and the remedy process should not place unnecessary burden on the impacted individual. When law or governmentwide guidance precludes disclosure of the use of AI or an opportunity for an individual appeal, agencies must create appropriate mechanisms for human oversight of rights-impacting AI’. This is another crucial area concerning rights not to be subjected to fully-automated decision-making where there is no meaningful remedy. This is also an area of the guidance that requires more detail, especially as to what is the adequate balance of burdens where eg the agency can automate the undoing of negative effects on individuals identified as a result of challenges by other individuals or in the context of the broader monitoring of the functioning and effects of the rights-impacting AI. In my view, this would be an opportunity to mandate automation of remediation in a meaningful way.
maintaining options to opt-out where practicable.

Procurement-related practices

In addition to the need for agencies to be able to meet the above requirements in relation to procured AI—which will in itself create the need to cascade some of the requirements down to contractors, and which will be the object of future guidance on how to ensure that AI contracts align with the requirements—the Draft AI in Government Policy also requires that agencies procuring AI manage risks by:

aligning to National Values and Law by ensuring ‘that procured AI exhibits due respect for our Nation’s values, is consistent with the Constitution, and complies with all other applicable laws, regulations, and policies, including those addressing privacy, confidentiality, copyright, human and civil rights, and civil liberties’;
taking ‘steps to ensure transparency and adequate performance for their procured AI, including by: obtaining adequate documentation of procured AI, such as through the use of model, data, and system cards; regularly evaluating AI-performance claims made by Federal contractors, including in the particular environment where the agency expects to deploy the capability; and considering contracting provisions that incentivize the continuous improvement of procured AI’;
taking ‘appropriate steps to ensure that Federal AI procurement practices promote opportunities for competition among contractors and do not improperly entrench incumbents. Such steps may include promoting interoperability and ensuring that vendors do not inappropriately favor their own products at the expense of competitors’ offering’;
maximizing the value of data for AI; and
responsibly procuring Generative AI.

These high level requirements are well targeted and compliance with them would go a long way to fostering ‘responsible AI procurement’ through adequate risk mitigation in ways that still allow the procurement mechanism to harness market forces to generate value for money.

However, operationalising these requirements will be complex and the further OMB guidance should be rather detailed and practical.

Final thoughts

In my view, the AI Executive Order and the Draft AI in Government Policy lay the foundations for a significant strengthening of the governance of AI procurement with a view to embedding safeguards in public sector AI use. A crucially important characteristic in the design of these governance mechanisms is that they impose significant duties on the agencies seeking to procure and use the AI, and that they explicitly seek to address risks of commercial capture and commercial determination. Another crucially important characteristic is that, at least in principle, use of AI is made conditional on compliance with a rather comprehensive set of preventative and in-use risk mitigation measures. The general aspects of this governance approach thus offer a very valuable blueprint for other jurisdictions considering how to boost AI procurement governance.

However, as always, the devil is in the details. One of the crucial risks in this approach to AI governance concerns a lack of independence of the entities making the relevant assessments. In the Draft AI in Government Policy, there are some risks of under-inclusion and/or excessive waivers of compliance with the relevant requirements (both explicit and implicit, through protracted processes of decommissioning of non-compliant AI), as well as a risk that ‘practical considerations’ will push compliance with the risk mitigation requirements well past the (ambitious) 1 August 2024 deadline through long or rolling extensions.

To mitigate this, the guidance should be much clearer on the role of OMB in extension, waiver and decommissioning decisions, as well as in relation to the specific criteria and limits that should form part of those decisions. Only by ensuring adequate OMB intervention can a system of governance that still does not entirely (organisationally) separate procurement, use and oversight decisions reach the levels of independent verification required to neutralise not only commercial determination, but also operational dependency and the ‘policy irresistibility’ of digital technologies.

Thoughts on the AI Safety Summit from a public sector procurement & use of AI perspective

November 3, 2023

The UK Government hosted an AI Safety Summit on 1-2 November 2023. A summary of the targeted discussions in a set of 8 roundtables has been published for Day 1, as well as a set of Chair’s statements for Day 2, including considerations around safety testing, the state of the science, and a general summary of discussions. There is also, of course, the (flagship?) Bletchley Declaration, and an introduction to the announced AI Safety Institute (UK AISI).

In this post, I collect some of my thoughts on these outputs of the AI Safety Summit from the perspective of public sector procurement and use of AI.

What was said at the AI Safety Summit?

Although the summit was narrowly targeted to discussion of ‘frontier AI’ as particularly advanced AI systems, some of the discussions seem to have involved issues also applicable to less advanced (ie currently in existence) AI systems, and even to non-AI algorithms used by the public sector. As the general summary reflects, ‘There was also substantive discussion of the impact of AI upon wider societal issues, and suggestions that such risks may themselves pose an urgent threat to democracy, human rights, and equality. Participants expressed a range of views as to which risks should be prioritised, noting that addressing frontier risks is not mutually exclusive from addressing existing AI risks and harms.’ Crucially, ‘participants across both days noted a range of current AI risks and harmful impacts, and reiterated the need for them to be tackled with the same energy, cross-disciplinary expertise, and urgency as risks at the frontier.’ Hopefully, then, some of the rather far-fetched discussions of future existential risks can be conducive to taking action on current harms and risks arising from the procurement and use of less advanced systems.

There seemed to be some recognition of the need for more State intervention through regulation, for more regulatory control of standard-setting, and for more attention to be paid to testing and evaluation in the procurement context. For example, the summary of Day 1 discussions indicates that participants agreed that

‘We should invest in basic research, including in governments’ own systems. Public procurement is an opportunity to put into practice how we will evaluate and use technology.’ (Roundtable 4)
‘Company policies are just the baseline and don’t replace the need for governments to set standards and regulate. In particular, standardised benchmarks will be required from trusted external third parties such as the recently announced UK and US AI Safety Institutes.’ (Roundtable 5)

On Day 2, in the context of safety testing, participants agreed that

Governments have a responsibility for the overall framework for AI in their countries, including in relation to standard setting. Governments recognise their increasing role for seeing that external evaluations are undertaken for frontier AI models developed within their countries in accordance with their locally applicable legal frameworks, working in collaboration with other governments with aligned interests and relevant capabilities as appropriate, and taking into account, where possible, any established international standards.
Governments plan, depending on their circumstances, to invest in public sector capability for testing and other safety research, including advancing the science of evaluating frontier AI models, and to work in partnership with the private sector and other relevant sectors, and other governments as appropriate to this end.
Governments will plan to collaborate with one another and promote consistent approaches in this effort, and to share the outcomes of these evaluations, where sharing can be done safely, securely and appropriately, with other countries where the frontier AI model will be deployed.

This could be a basis on which to build an international consensus on the need for more robust and decisive regulation of AI development and testing, as well as a consensus on the sets of considerations and constraints that should be applicable to the procurement and use of AI by the public sector in a way that is compliant with individual (human) rights and social interests. The general summary reflects that ‘Participants welcomed the exchange of ideas and evidence on current and upcoming initiatives, including individual countries’ efforts to utilise AI in public service delivery and elsewhere to improve human wellbeing. They also affirmed the need for the benefits of AI to be made widely available’.

However, some statements seem at first sight contradictory or problematic. While the excerpt above stresses that ‘Governments have a responsibility for the overall framework for AI in their countries, including in relation to standard setting’ (emphasis added), the general summary also stresses that ‘The UK and others recognised the importance of a global digital standards ecosystem which is open, transparent, multi-stakeholder and consensus-based and many standards bodies were noted, including the International Standards Organisation (ISO), International Electrotechnical Commission (IEC), Institute of Electrical and Electronics Engineers (IEEE) and relevant study groups of the International Telecommunication Union (ITU).’ Quite how State responsibility for standard setting fits with industry-led standard setting by such organisations is not only difficult to fathom, but also one of the potentially most problematic issues due to the risk of regulatory tunnelling that delegation of standard setting without a verification or certification mechanism entails.

Moreover, there seemed to be insufficient agreement around crucial issues, which are summarised as ‘a set of more ambitious policies to be returned to in future sessions’, including:

‘1. Multiple participants suggested that existing voluntary commitments would need to be put on a legal or regulatory footing in due course. There was agreement about the need to set common international standards for safety, which should be scientifically measurable.

2. It was suggested that there might be certain circumstances in which governments should apply the principle that models must be proven to be safe before they are deployed, with a presumption that they are otherwise dangerous. This principle could be applied to the current generation of models, or applied when certain capability thresholds were met. This would create certain ‘gates’ that a model had to pass through before it could be deployed.

3. It was suggested that governments should have a role in testing models not just pre- and post-deployment, but earlier in the lifecycle of the model, including early in training runs. There was a discussion about the ability of governments and companies to develop new tools to forecast the capabilities of models before they are trained.

4. The approach to safety should also consider the propensity for accidents and mistakes; governments could set standards relating to how often the machine could be allowed to fail or surprise, measured in an observable and reproducible way.

5. There was a discussion about the need for safety testing not just in the development of models, but in their deployment, since some risks would be contextual. For example, any AI used in critical infrastructure, or equivalent use cases, should have an infallible off-switch.

…

8. Finally, the participants also discussed the question of equity, and the need to make sure that the broadest spectrum was able to benefit from AI and was shielded from its harms.’

All of these are crucial considerations in relation to the regulation of AI development, (procurement) and use. A lack of consensus around these issues already indicates that there was a generic agreement that some regulation is necessary, but much more limited agreement on what regulation is necessary. This is clearly reflected in what was actually agreed at the summit.

What was agreed at the AI Safety Summit?

Despite all the discussions, little was actually agreed at the AI Safety Summit. The Bletchley Declaration includes a lengthy (but rather uncontroversial?) description of the potential benefits and actual risks of (frontier) AI, some rather generic agreement that ‘something needs to be done’ (eg welcoming ‘the recognition that the protection of human rights, transparency and explainability, fairness, accountability, regulation, safety, appropriate human oversight, ethics, bias mitigation, privacy and data protection needs to be addressed’) and very limited and unspecific commitments.

Indeed, signatories only ‘committed’ to a joint agenda, comprising:

‘identifying AI safety risks of shared concern, building a shared scientific and evidence-based understanding of these risks, and sustaining that understanding as capabilities continue to increase, in the context of a wider global approach to understanding the impact of AI in our societies.
building respective risk-based policies across our countries to ensure safety in light of such risks, collaborating as appropriate while recognising our approaches may differ based on national circumstances and applicable legal frameworks. This includes, alongside increased transparency by private actors developing frontier AI capabilities, appropriate evaluation metrics, tools for safety testing, and developing relevant public sector capability and scientific research’ (emphases added).

This does not amount to much that would not happen anyway and, given that one of the UK Government’s objectives for the Summit was to create mechanisms for global collaboration (‘a forward process for international collaboration on frontier AI safety, including how best to support national and international frameworks’), this agreement for each jurisdiction to do things as they see fit in accordance with their own circumstances and collaborate ‘as appropriate’ in view of those seems like a very poor ‘win’.

In reality, there seems to be little coming out of the Summit other than a plan to continue the conversations in 2024. Given what had been said in one of the roundtables (Roundtable 5) in relation to the need to put in place adequate safeguards (‘this work is urgent, and must be put in place in months, not years’), it looks like the ‘to be continued’ approach won’t do or, at least, cannot be claimed to have made much of a difference.

What did the UK Government promise at the AI Safety Summit?

A more specific development announced on the occasion of the Summit (and overshadowed by the earlier US announcement) is that the UK will create the AI Safety Institute (UK AISI), a ‘state-backed organisation focused on advanced AI safety for the public interest. Its mission is to minimise surprise to the UK and humanity from rapid and unexpected advances in AI. It will work towards this by developing the sociotechnical infrastructure needed to understand the risks of advanced AI and enable its governance.’

Crucially, ‘The Institute will focus on the most advanced current AI capabilities and any future developments, aiming to ensure that the UK and the world are not caught off guard by progress at the frontier of AI in a field that is highly uncertain. It will consider open-source systems as well as those deployed with various forms of access controls. Both AI safety and security are in scope’ (emphasis added). This seems to carry forward the extremely narrow focus on ‘frontier AI’ and catastrophic risks that augured a failure of the Summit. It is also in clear contrast with the much more sensible and repeated assertions (and apparent consensus) that other types of AI cause very significant risks, with Summit participants having noted ‘a range of current AI risks and harmful impacts, and reiterated the need for them to be tackled with the same energy, cross-disciplinary expertise, and urgency as risks at the frontier.’

Also crucially, UK AISI ‘is not a regulator and will not determine government regulation. It will collaborate with existing organisations within government, academia, civil society, and the private sector to avoid duplication, ensuring that activity is both informing and complementing the UK’s regulatory approach to AI as set out in the AI Regulation white paper’.

According to initial plans, UK AISI ‘will initially perform 3 core functions:

Develop and conduct evaluations on advanced AI systems, aiming to characterise safety-relevant capabilities, understand the safety and security of systems, and assess their societal impacts
Drive foundational AI safety research, including through launching a range of exploratory research projects and convening external researchers
Facilitate information exchange, including by establishing – on a voluntary basis and subject to existing privacy and data regulation – clear information-sharing channels between the Institute and other national and international actors, such as policymakers, international partners, private companies, academia, civil society, and the broader public’

It is also stated that ‘We see a key role for government in providing external evaluations independent of commercial pressures and supporting greater standardisation and promotion of best practice in evaluation more broadly.’ However, the extent to which UK AISI will be able to do that will hinge on issues that are not currently clear (or publicly disclosed), such as the membership of UK AISI or its institutional set up (as ‘state-backed organisation’ does not say much about this).

On that very point, it is somewhat problematic that the UK AISI ‘is an evolution of the UK’s Frontier AI Taskforce. The Frontier AI Taskforce was announced by the Prime Minister and Technology Secretary in April 2023’ (ahem, as ‘Foundation Model Taskforce’, so this is the second rebranding of the same initiative in half a year). It is also problematic that UK AISI ‘will continue the Taskforce’s safety research and evaluations. The other core parts of the Taskforce’s mission will remain in [the Department for Science, Innovation and Technology] as policy functions: identifying new uses for AI in the public sector; and strengthening the UK’s capabilities in AI.’ I find the retention of analysis pertaining to public sector AI use within government problematic and a clear indication of the UK Government’s unwillingness to put meaningful mechanisms in place to monitor the process of public sector digitalisation. UK AISI very much sounds like a research institute with a focus on a very narrow set of AI systems and with a remit that will hardly translate into relevant policymaking in areas in dire need of regulation. Finally, it is also very problematic that funding is not locked: ‘The Institute will be backed with a continuation of the Taskforce’s 2024 to 2025 funding as an annual amount for the rest of this decade, subject to it demonstrating the continued requirement for that level of public funds.’ In reality, this means that the Institute’s continued existence will depend on the Government’s satisfaction with its work and the direction of travel of its activities and outputs. This is not at all conducive to independence, in my view.

So, all in all, there is very little new in the announcement of the creation of the UK AISI and, while there is a (theoretical) possibility for the Institute to make a positive contribution to regulating AI procurement and use (in the public sector), this seems extremely remote and potentially undermined by the Institute’s institutional set up. This is probably in stark contrast with the US approach the UK is trying to mimic (though more on the US approach in a future entry).

European Commission wants to see more AI procurement. Ok, but priorities need reordering

November 3, 2023

The European Commission recently published its 2023 State of the Digital Decade report. One of its key takeaways is that the Commission recommends that Member States step up innovation procurement investments in the digital sector.

The Commission has identified that ‘While the roll-out of digital public services is progressing steadily, investment in public procurement of innovative digital solutions (e.g. based on AI or big data) is insufficient and would need to increase substantially from EUR 188 billion to EUR 295 billion in order to reach full speed adoption of innovative digital solutions in public services’ (para 4.2, original emphasis).

The Commission has thus recommended that ‘Member States should step up investment and regulatory measures to develop and make available secure, sovereign and interoperable digital solutions for online public and government services’; and that ‘Member States should develop action plans in support of innovation procurement and step up efforts to increase public procurement investments in developing, testing and deploying innovative digital solutions’.

Tucked away in a different part of the report (which, frankly, has a rather odd structure), the Commission also recommends that ‘Member States should foster the availability of legal and technical support to procure and implement trustworthy and sovereign AI solutions across sectors.’

To my mind, the priorities for investment of public money need to be further clarified. Without a significant investment in an ambitious plan to quickly expand the public sector’s digital skills and capabilities, there can be no hope that increased procurement expenditure in digital technologies will bring adequate public sector digitalisation or foster the public interest more broadly.

Without a sophisticated public buyer that can adequately cut through the process of technological innovation, there is no hope that ‘throwing money at the problem’ will bring meaningful change. In my view, the focus and priority should be on upskilling the public sector before anything else—including ahead of the also recommended mobilisation of ‘public policies, including innovative procurement to foster the scaling up of start-ups, to facilitate the creation of spinoffs from universities and research centres, and to monitor progress in this area’ (para 3.2.3). Perhaps a substantial fraction of the 100+ billion EUR the Commission expects Member States to put into public sector digitalisation could go to building up the required capability… too much to ask?

G7 Guiding Principles and Code of Conduct on Artificial Intelligence -- some comments from a UK perspective

October 31, 2023

On 30 October 2023, G7 leaders published the Hiroshima Process International Guiding Principles for Advanced AI system (the G7 AI Principles), a non-exhaustive list of guiding principles formulated as a living document that builds on the OECD AI Principles to take account of recent developments in advanced AI systems. The G7 stresses that these principles should apply to all AI actors, when and as applicable to cover the design, development, deployment and use of advanced AI systems.

The G7 AI Principles are supported by a voluntary Code of Conduct for Advanced AI Systems (the G7 AI Code of Conduct), which is meant to provide guidance to help seize the benefits and address the risks and challenges brought by these technologies.

The G7 AI Principles and Code of Conduct came just two days before the start of the UK’s AI Safety Summit 2023. Given that the UK is part of the G7 and has endorsed the G7 Hiroshima Process and its outcomes, the interaction between the G7’s documents, the UK Government’s March 2023 ‘pro-innovation’ approach to AI and its aspirations for the AI Safety Summit deserves some comment.

G7 AI Principles and Code of Conduct

The G7 AI Principles aim ‘to promote safe, secure, and trustworthy AI worldwide and will provide guidance for organizations developing and using the most advanced AI systems, including the most advanced foundation models and generative AI systems.’ The principles are meant to be cross-cutting, as they target ‘among others, entities from academia, civil society, the private sector, and the public sector.’ Importantly, also, the G7 AI Principles are meant to be a stop gap solution, as G7 leaders ‘call on organizations in consultation with other relevant stakeholders to follow these [principles], in line with a risk-based approach, while governments develop more enduring and/or detailed governance and regulatory approaches.’

The principles include the reminder that ‘[w]hile harnessing the opportunities of innovation, organizations should respect the rule of law, human rights, due process, diversity, fairness and non-discrimination, democracy, and human-centricity, in the design, development and deployment of advanced AI system’, as well as a reminder that organizations developing and deploying AI should not undermine democratic values, harm individuals or communities, ‘facilitate terrorism, enable criminal misuse, or pose substantial risks to safety, security, and human rights’. States (as AI users) are reminded of their ‘obligations under international human rights law to promote that human rights are fully respected and protected’ and private sector actors are called to align their activities ‘with international frameworks such as the United Nations Guiding Principles on Business and Human Rights and the OECD Guidelines for Multinational Enterprises’.

These are all very high level declarations and aspirations that do not go much beyond pre-existing commitments and (soft) law norms, if at all.

The G7 AI Principles comprise a non-exhaustive list of 11 high-level regulatory goals that organizations should abide by ‘commensurate to the risks’ (ie following the already mentioned risk-based approach), which introduces a first element of uncertainty because the document does not establish any methodology or explanation on how risks should be assessed and tiered (one of the primary, and debated, features of the proposed EU AI Act). The principles are the following, prefaced by my own labelling between square brackets:

1. [risk identification, evaluation and mitigation] Take appropriate measures throughout the development of advanced AI systems, including prior to and throughout their deployment and placement on the market, to identify, evaluate, and mitigate risks across the AI lifecycle.
2. [misuse monitoring] Identify and mitigate vulnerabilities, and, where appropriate, incidents and patterns of misuse, after deployment including placement on the market.
3. [transparency and accountability] Publicly report advanced AI systems’ capabilities, limitations and domains of appropriate and inappropriate use, to support ensuring sufficient transparency, thereby contributing to increase accountability.
4. [incident intelligence exchange] Work towards responsible information sharing and reporting of incidents among organizations developing advanced AI systems including with industry, governments, civil society, and academia.
5. [risk management governance] Develop, implement and disclose AI governance and risk management policies, grounded in a risk-based approach – including privacy policies, and mitigation measures, in particular for organizations developing advanced AI systems.
6. [(cyber) security] Invest in and implement robust security controls, including physical security, cybersecurity and insider threat safeguards across the AI lifecycle.
7. [content authentication and watermarking] Develop and deploy reliable content authentication and provenance mechanisms, where technically feasible, such as watermarking or other techniques to enable users to identify AI-generated content.
8. [risk mitigation priority] Prioritize research to mitigate societal, safety and security risks and prioritize investment in effective mitigation measures.
9. [grand challenges priority] Prioritize the development of advanced AI systems to address the world’s greatest challenges, notably but not limited to the climate crisis, global health and education.
10. [technical standardisation] Advance the development of and, where appropriate, adoption of international technical standards.
11. [personal data and IP safeguards] Implement appropriate data input measures and protections for personal data and intellectual property.

Each of the principles is accompanied by additional guidance or precision, where possible, and this is further developed in the G7 Code of Conduct.

In my view, the list is a bit of a mixed bag.

There are some very general aspirations or steers that can hardly be considered principles of AI regulation, for example principle 9 setting a grand challenges priority and, possibly, principle 8 setting a risk mitigation priority beyond the ‘requirements’ of principle 1 on risk identification, evaluation and mitigation—which thus seems to boil down to the more specific steer in the G7 Code of Conduct for (private) organisations to ‘share research and best practices on risk mitigation’.

Quite how these principles could be complied with by current major AI developers seems rather difficult to foresee, especially in relation to principle 9. Most developers of generative AI or other AI applications linked to eg social media platforms will have a hard time demonstrating their engagement with this principle, unless we accept a general justification of ‘general purpose application’ or ‘dual use application’, which to me seems quite unpalatable. What is the purpose of this principle if eg it pushes organisations away from engaging with the rest of the G7 AI Principles? Or if organisations are allowed to gloss over it in any (future) disclosures linked to an eventual mechanism of commitment, chartering, or labelling associated with the principles? It seems like the sort of purely political aspiration that may have been better left aside.

Some other principles seem to push at an open door, such as principle 10 on the development of international technical standards. Again, the only meaningful detail seems to be in the G7 Code of Conduct, which specifies that ‘In particular, organizations also are encouraged to work to develop interoperable international technical standards and frameworks to help users distinguish content generated by AI from non-AI generated content.’ However, this is closely linked to principle 7 on content authentication and watermarking, so it is not clear how much that adds. Moreover, this comes to further embed the role of industry-led technical standards as a foundational element of AI regulation, with all the potential problems that arise from it (for some discussion from the perspective of regulatory tunnelling, see here and here).

Yet other principles present as relatively soft requirements or ‘noble’ commitments issues that are, in reality, legal requirements already binding on entities and States and that, in my view, should have been placed as hard obligations and a renewed commitment from G7 States to enforce them. These include principle 11 on personal data and IP safeguards, where the G7 Code of Conduct includes as an apparent afterthought that ‘Organizations should also comply with applicable legal frameworks’. In my view, this should be the starting point.

This reduces the list of AI Principles ‘proper’. But, even then, they can be further grouped and synthesised, in my view. For example, principles 1 and 5 are both about risk management, with the (outward-looking) governance layer of principle 5 seeking to give transparency to the (inward-looking) governance layer in principle 1. Principle 2 seems to simply seek to extend the need to engage with risk-based management post-market placement, which is also closely connected to the (inward-looking) governance layer in principle 1. All of them focus on the (undefined) risk-based approach to development and deployment of AI underpinning the G7’s AI Principles and Code of Conduct.

Some aspects of the incident intelligence exchange also relate to principle 1, while some other aspects relate to (cyber) security issues encapsulated in principle 6. However, given that this principle may be a placeholder for the development of some specific mechanisms of collaboration—either based on cyber security collaboration or other approaches, such as the much touted aviation industry’s—it may be treated separately.

Perhaps, then, the ‘core’ AI Principles arising from the G7 document could be trimmed down to:

Life-cycle risk-based management and governance, inclusive of principles 1, 2, and 5.
Transparency and accountability, principle 3.
Incident intelligence exchange, principle 4.
(Cyber) security, principle 6.
Content authentication and watermarking, principle 7 (though perhaps narrowly targeted to generative AI).

Most of the value in the G7 AI Principles and Code of Conduct thus arises from the pointers for collaboration, the more detailed self-regulatory measures, and the more specific potential commitments included in the latter. For example, in relation to the potential AI risks that are identified as potential targets for the risk assessments expected of AI developers (under guidance related to principle 1), or the desirable content of AI-related disclosures (under guidance related to principle 3).

It is however unclear how these principles will evolve when adopted at the national level, and to what extent they offer a sufficient blueprint to ensure international coherence in the development of the ‘more enduring and/or detailed governance and regulatory approaches’ envisaged by G7 leaders. It seems for example striking that both the EU and the UK have supported these principles, given that they have relatively opposing approaches to AI regulation—with the EU seeking to finalise the legislative negotiations on the first ‘golden standard’ of AI regulation and the UK taking an entirely deregulatory approach. Perhaps this is in itself an indication that, even at the level of detail achieved in the G7 AI Code of Conduct, the regulatory leeway is quite broad and still necessitates significant further concretisation for it to be meaningful in operational terms—as evidenced eg by the US President’s ‘Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence’, which calls for that concretisation and provides a good example of the many areas for detailed work required to translate high level principles into actionable requirements (even if it leaves enforcement still undefined).

How do the G7 Principles compare to the UK’s ‘pro-innovation’ ones?

In March 2023, the UK Government published its white paper ‘A pro-innovation approach to AI regulation’ (the ‘UK AI White Paper’; for a critique, see here). The UK AI White Paper indicated (at para 10) that its ‘framework is underpinned by five principles to guide and inform the responsible development and use of AI in all sectors of the economy:

Safety, security and robustness
Appropriate transparency and explainability
Fairness
Accountability and governance
Contestability and redress’.

A comparison of the UK and the G7 principles can show a few things.

First, that there are some areas where there seems to be a clear correlation—in particular concerning (cyber) security as a self-standing challenge requiring a direct regulatory focus.

Second, that it is hard to decide at which level to place incommensurable aspects of AI regulation. Notably, the G7 principles do not directly refer to fairness, while the UK’s do. However, the G7 Principles do spend some time in the preamble addressing the issue of fairness and unacceptable AI use (though in a woolly manner). Whether placing this type of ‘requirement’ at one level or another makes a difference (at all) is highly debatable.

Third, that there are different ways of ‘packaging’ principles or (soft) obligations. Just like some of the G7 principles are closely connected or fold into each other (as above), so do the UK’s principles in relation to the G7’s. For example, the G7 packaged together transparency and accountability (principle 3), while the UK had them separated. While the UK explicitly mentioned the issue of AI explainability, this remains implicit in the G7 principles (also in principle 3).

Finally, in line with the considerations above, that distinct regulatory approaches only emerge or become clear once the ‘principles’ become specific (so they arguably stop being principles). For example, it seems clear that the G7 Principles aspire to higher levels of incident intelligence governance and to a specific target of generative AI watermarking than the UK’s. However, whether the G7 or the UK principles are equally or more demanding on any other dimension of AI regulation is close to impossible to establish. In my view, this further supports the need for a much more detailed AI regulatory framework—else, technical standards will entirely occupy that regulatory space.

What do the G7 AI Principles tell us about the UK’s AI Safety Summit?

The Hiroshima Process that has led to the adoption of the G7 AI Principles and Code of Conduct emerged from the Ministerial Declaration of The G7 Digital and Tech Ministers’ Meeting of 30 April 2023, which explicitly stated that:

‘Given that generative AI technologies are increasingly prominent across countries and sectors, we recognise the need to take stock in the near term of the opportunities and challenges of these technologies and to continue promoting safety and trust as these technologies develop. We plan to convene future G7 discussions on generative AI which could include topics such as governance, how to safeguard intellectual property rights including copyright, promote transparency, address disinformation, including foreign information manipulation, and how to responsibly utilise these technologies’ (at para 47).

The UK Government’s ambitions for the AI Safety Summit largely focus on those same issues, albeit within the very narrow confines of ‘frontier AI’, which it has defined as ‘highly capable general-purpose AI models that can perform a wide variety of tasks and match or exceed the capabilities present in today’s most advanced models’. While the UK Government has published specific reports to focus discussion on (1) Capabilities and risks from frontier AI and (2) Emerging Processes for Frontier AI Safety, it is unclear how the level of detail of such a narrow approach could translate into broader international commitments.

The G7 AI Principles already claim to tackle ‘the most advanced AI systems, including the most advanced foundation models and generative AI systems (henceforth "advanced AI systems")’ within their scope. It seems unlikely that such an approach is based on a lack of knowledge or understanding of the detail the UK has condensed in those reports. It rather seems that the G7 was not ready to move quickly to a level of detail beyond that included in the G7 AI Code of Conduct. Whether significant further developments can be expected beyond the G7 AI Principles and Code of Conduct just two days after they were published seems hard to fathom.

Moreover, although the UK Government is downplaying the fact that eg Chinese participation in the AI Safety Summit is unclear and potentially rather marginal, it seems that, at best, the UK AI Safety Summit will be an opportunity for a continued conversation between G7 countries and a few others. It is also unclear whether significant progress will be made in a forum that seems rather clearly tilted towards industry voice and influence.

Let’s wait and see what the outcomes are, but I am not optimistic for significant progress other than, worryingly, a risk of further displacement of regulatory decision-making towards industry and industry-led (future) standards.

AI in the public sector: can procurement promote trustworthy AI and avoid commercial capture?

July 6, 2023

The recording and slides of the public lecture on ‘AI in the public sector: can procurement promote trustworthy AI and avoid commercial capture?’ I gave at the University of Bristol Law School on 4 July 2023 are now available. As always, any further comments most warmly received at: a.sanchez-graells@bristol.ac.uk.

This lecture brought my research project to an end. I will now focus on finalising the manuscript and sending it off to the publisher, and then take a break for the rest of the summer. I will share details of the forthcoming monograph in a few months. I hope to restart blogging in September. In the meantime, I wish all HTCaN friends all the best. Albert

Two policy briefings on digital technologies and procurement

June 21, 2023

Now that my research project ‘Digital technologies and public procurement. Gatekeeping and experimentation in digital public governance’ nears its end, some outputs start to emerge. In this post, I would like to highlight two policy briefings summarising some of my top-level policy recommendations, and providing links to more detailed analysis. All materials are available in the ‘Digital Procurement Governance’ tab.

Policy Briefing 1: ‘Guaranteeing public sector adoption of trustworthy AI - a task that should not be left to procurement’

Policy Briefing 2: ‘Successful procurement digitalisation requires more data, in-house expertise, and improved governance mechanisms’

What's the rush -- some thoughts on the UK's Foundation Model Taskforce and regulation by Twitter

June 19, 2023

I have been closely following developments on AI regulation in the UK, as part of the background research for the joint submission to the public consultation closing on Wednesday (see here and here). Perhaps surprisingly, the biggest developments do not concern the regulation of AI under the devolved model described in the ‘pro-innovation’ white paper, but its displacement outside existing regulatory regimes—both in terms of funding, and practical power.

Most of the activity and investment is not channelled towards existing resource-strained regulators to support them in their task of issuing guidance on how to deal with AI risks and harms (a task that stems from the white paper), but towards digital industrial policy and R&D projects, including a new major research centre on responsible and trustworthy AI and a Foundation Model Taskforce. A first observation is that this type of investment can be worthwhile, but not at the expense of adequately resourcing regulators facing the tall order of AI regulation.

The UK’s Prime Minister is clearly making a move to use ‘world-leadership in AI safety’ as a major plank of his re-election bid in the coming Fall. I am not only sceptical about this move and its international reception, but also increasingly concerned about a tendency to ‘regulate by Twitter’ and to bullish approaches to regulatory and legal compliance that could well result in squandering a good part of the £100m set aside for the Taskforce.

In this blog, I offer some preliminary thoughts. Comments welcome!

Twitter announcements vs white paper?

During the preparation of our response to the AI public consultation, we had a moment of confusion. The Government published the white paper and an impact assessment supporting it, which primarily amount to doing nothing and maintaining the status quo (aka AI regulatory gap) in the UK. However, there were increasing reports of the Prime Minister’s change of heart after the emergence of a ‘doomer’ narrative peddled by OpenAI’s CEO and others. At some point, the PM sent out a tweet that made us wonder if the Government was changing policy and abandoning the approach of the white paper even before the end of the public consultation. This was the tweet.

We could not locate any document describing the ‘Safe strategy of AI’, so the only conclusion we could reach is that the ‘strategy’ was the short Twitter thread that followed that first tweet.

It was not only surprising that there was no detail, but also that there was no reference to the white paper or to any other official policy document. We were probably not the only ones confused about it (or so we hope!) as it is in general very confusing to have social media messaging pointing towards regulatory interventions completely outside the existing frameworks, including live public consultations by the government!

It is also confusing to see multiple different documents make reference to different things, and later documents somehow reframing what previous documents mean.

For example, the announcement of the Foundation Model Taskforce came only a few weeks after the publication of the white paper, but there was no mention of it in the white paper itself. Is it possible that the Government had put together a significant funding package and related policy in under a month? Rather than whether it is possible, the question is why do things in this way? And how mature was the thinking behind the Taskforce?

For example, the initial announcement indicated that

The investment will build the UK’s ‘sovereign’ national capabilities so our public services can benefit from the transformational impact of this type of AI. The Taskforce will focus on opportunities to establish the UK as a world leader in foundation models and their applications across the economy, and acting as a global standard bearer for AI safety.

The funding will be invested by the Foundation Model Taskforce in foundation model infrastructure and public service procurement, to create opportunities for domestic innovation. The first pilots targeting public services are expected to launch in the next six months.

Less than two months later, the announcement of the appointment of the Taskforce chair (below) indicated that

… a key focus for the Taskforce in the coming months will be taking forward cutting-edge safety research in the run up to the first global summit on AI safety to be hosted in the UK later this year.

Bringing together expertise from government, industry and academia, the Taskforce will look at the risks surrounding AI. It will carry out research on AI safety and inform broader work on the development of international guardrails, such as shared safety and security standards and infrastructure, that could be put in place to address the risks.

Is it then a Taskforce and pot of money seeking to develop sovereign capabilities and to pilot public sector AI use, or a Taskforce seeking to develop R&D in AI safety? Can it be both? Is there money for both? Also, why steer the £100m Taskforce in this direction and simultaneously spend £31m in funding an academic-led research centre on ethical and trustworthy AI? Is the latter not encompassing issues of AI safety? How will all of these investments and initiatives be coordinated to avoid duplication of effort or replication of regulatory gaps in the disparate consideration of regulatory issues?

Funding and collaboration opportunities announced via Twitter?

Things can get even more confusing or worrying (for me). Yesterday, the Government put out an official announcement and heavy Twitter-based PR to announce the appointment of the Chair of the Foundation Model Taskforce. This announcement raises a few questions. Why on Sunday? What was the rush? Also, what was the process used to select the Chair, if there was one? I have no questions on the profile and suitability of the appointed Chair (have also not looked at them in detail), but I wonder … even if legally compliant to proceed without a formal process with an open call for expressions of interest, is this appropriate? Is the Government stretching the parallelism with the Vaccines Taskforce too far?

Relatedly, there has been no (or I have been unable to locate) official call for expressions of interest from those seeking to get involved with the Taskforce. However, once more, Twitter seems to have been the (pragmatic?) medium used by the newly appointed Chair of the Taskforce. On Sunday itself, this Twitter thread went out:

I find the last bit particularly shocking. A call for expressions of interest in participating in a project capable of spending up to £100m via Google Forms! At the time of writing, the form is here and its content is as follows:

I find this approach to AI regulation rather concerning and can also see quite a few ways in which the emerging work approach can lead to breaches of procurement law and subsidies controls, or recruitment processes (depending on whether expressions of interest are corporate or individual). I also wonder what is the rush with all of this and what sort of records will be kept of all this so that there is adequate accountability for this expenditure. What is the rush?

Or rather, I know that the rush is simply politically-driven and that this is another way in which public funds are at risk for the wrong reasons. But for the entirely arbitrary deadline of the ‘world AI safety summit’ the PM wants to host in the UK in the Fall (preferably ahead of any general election, I would think), it is almost impossible to justify the change of gear between the ‘do nothing’ AI white paper and the ‘rush everything’ approach driving the Taskforce. I hope we will not end up in another set of enquiries and reports, such as those stemming from the PPE procurement scandal or the ventilator challenge, but it is hard to see how this can all be done in a legally compliant manner, and with the serenity, clarity of view and long-term thinking required of regulatory design. Even in the field of AI. Unavoidably, more to follow.

Response to the UK’s March 2023 White Paper "A pro-innovation approach to AI regulation"

June 19, 2023

Together with colleagues at the Centre for Global Law and Innovation of the University of Bristol Law School, I submitted a response to the UK Government’s public consultation on its ‘pro-innovation’ approach to AI regulation. For an earlier assessment, see here.

The full submission is available at https://ssrn.com/abstract=4477368, and this is the executive summary:

The white paper ‘A pro-innovation approach to AI regulation’ (the ‘AI WP’) claims to advance a ‘pro-innovation, proportionate, trustworthy, adaptable, clear and collaborative’ model that leverages the capabilities and skills of existing regulators to foster AI innovation. This model, we are told, would be underpinned by a set of principles providing a clear, unified, and flexible framework improving upon the current ‘complex patchwork of legal requirements’ and striking ‘the right balance between responding to risks and maximising opportunities.’

In this submission, we challenge such claims in the AI WP. We argue that:

The AI WP does not advance a balanced and proportionate approach to AI regulation, but rather, an “innovation first” approach that caters to industry and sidelines the public. The AI WP primarily serves a digital industrial policy goal ‘to make the UK one of the top places in the world to build foundational AI companies’. The public interest is downgraded and building public trust is approached instrumentally as a mechanism to promote AI uptake. Such an approach risks breaching the UK’s international obligations to create a legal framework that effectively protects fundamental rights in the face of AI risks. Additionally, in the context of public administration, poorly regulated AI could breach due process rules, putting public funds at risk.
The AI WP does not embrace an agile regulatory approach, but active deregulation. The AI WP stresses that the UK ‘must act quickly to remove existing barriers to innovation’ without explaining how any of the existing safeguards are no longer required in view of identified heightened AI risks. Coupled with the “innovation first” mandate, this deregulatory approach risks eroding regulatory independence and the effectiveness of the regulatory regimes the AI WP claims to seek to leverage. A more nuanced regulatory approach that builds on, rather than threatens, regulatory independence is required.
The AI WP builds on shaky foundations, including the absence of a mapping of current regulatory remits and powers. This makes it near impossible to assess the effectiveness and comprehensiveness of the proposed approach, although there are clear indications that regulatory gaps will remain. The AI WP also presumes continuity in the legal framework, which ignores reforms currently promoted by Government and further reforms of the overarching legal regime repeatedly floated. It seems clear that some regulatory regimes will soon see their scope or stringency limited. The AI WP does not provide clear mechanisms to address these issues, which undermine its core claim that leveraging existing regulatory regimes suffices to address potential AI harms. This is perhaps particularly evident in the context of AI use for policing, which is affected by both the existence of regulatory gaps and limitations in existing legal safeguards.
The AI WP does not describe a full, workable regulatory model. Lack of detail on the institutional design to support the central function is a crucial omission. Crucial tasks are assigned to such central function without clarifying its institutional embedding, resourcing, accountability mechanisms, etc.
The AI WP foresees a government-dominated approach that further risks eroding regulatory independence, in particular given the “innovation first” criteria to be used in assessing the effectiveness of the proposed regime.
The principles-based approach to AI regulation suggested in the AI WP is undeliverable due to lack of detail on the meaning and regulatory implications of the principles, barriers to translation into enforceable requirements, and tensions with existing regulatory frameworks. The minimalistic legislative intervention entertained in the AI WP would not equip regulators to effectively enforce the general principles. Following the AI WP would also result in regulatory fragmentation and uncertainty and not resolve the identified problem of a ‘complex patchwork of legal requirements’.
The AI WP does not provide any route towards sufficiently addressing the digital capabilities gap, or towards mitigating new risks to capabilities, such as deskilling—which create significant constraints on the likely effectiveness of the proposed approach.

Full citation: A Charlesworth, K Fotheringham, C Gavaghan, A Sanchez-Graells and C Torrible, ‘Response to the UK’s March 2023 White Paper "A pro-innovation approach to AI regulation"’ (June 19, 2023). Available at SSRN: https://ssrn.com/abstract=4477368.

"Can Procurement Be Used to Effectively Regulate AI?" [recording]

May 31, 2023

The recording and slides for yesterday’s webinar on ‘Can Procurement Be Used to Effectively Regulate AI?’ co-hosted by the University of Bristol Law School and the GW Law Government Procurement Programme are now available for catch up if you missed it.

I would like to thank once again Dean Jessica Tillipman (GW Law), Dr Aris Georgopoulos (Nottingham), Elizabeth "Liz" Chirico (Acquisition Innovation Lead at Office of the Deputy Assistant Secretary of the Army - Procurement) and Scott Simpson (Digital Transformation Lead, Department of Homeland Security Office of the Chief Procurement Officer - Procurement Innovation Lab) for a really interesting discussion, and all participants for their questions. Comments most welcome, as always.

ChatGPT in the Public Sector -- should it be banned?

May 3, 2023

In ‘ChatGPT in the Public Sector – overhyped or overlooked?’ (24 Apr 2023), the Analysis and Research Team (ART) of the General Secretariat of the Council of the European Union provides a useful and accessible explanation of how ChatGPT works, as well as an interesting analysis of the risks and pitfalls of rushing to embed generative artificial intelligence (GenAI), and large language models (LLMs) in particular, in the functioning of the public administration.

The analysis stresses the risks stemming from ‘inaccurate, biased, or nonsensical’ GenAI outputs and, in particular, that ‘the key principles of public administration such as accountability, transparency, impartiality, or reliability need to be considered thoroughly in the [GenAI] integration process’.

The paper provides a helpful introduction to how LLMs work and their technical limitations. It then maps potential uses in the public administration, assesses the potential impact of their use on the European principles of public sector administration, and then suggests some measures to mitigate the relevant risks.

This analysis is helpful but, in my view, it is already captured by the presumption that LLMs are here to stay and that what regulators can do is just try to minimise their potential negative impacts—which implies accepting that there will remain unaddressed impacts. By referring to general principles of public administration, rather than eg the right to good administration under the EU Charter of Fundamental Rights, the analysis is also unnecessarily lenient.

I find this type of discourse dangerous and troubling because it facilitates the adoption of digital technologies that cannot meet current legal requirements and guarantees of individual rights. This is clear from the paper itself, although the implications of part of the analysis are not sufficiently explored, in my view.

The paper has a final section where it explicitly recognises that, while some risks might be mitigated by technological advancements, other risks are of a more structural nature and cannot be fully corrected despite best efforts. The paper then lists a very worrying panoply of such structural issues (at 16):

‘This is the case for detecting and removing biases in training data and model outputs. Efforts to sanitize datasets can even worsen biases’.
‘Related to biases is the risk of a perpetuation of the status quo. LLMs mirror the values, habits and attitudes that are present in their training data, which does not leave much space for changing or underrepresented societal views. Relying on LLMs that have been trained with previously produced documents in a public administration severely limits the scope for improvement and innovation and risks leaving the public sector even less flexible than it is already perceived to be’.
‘The ‘black box’ issue, where AI models arrive at conclusions or decisions without revealing the process of how they were reached is also primarily structural’.
‘Regulating new technologies will remain a cat-and-mouse game. Acceleration risk (the emergence of a race to deploy new AI as quickly as possible at the expense of safety standards) is also an area of concern’.
‘Finally […] a major structural risk lies in overreliance, which may be bolstered by rapid technological advances. This could lead to a lack of critical thinking skills needed to adequately assess and oversee the model’s output, especially amongst a younger generation entering a workforce where such models are already being used’.

In my view, beyond the paper’s suggestion that the way forward is to maintain human involvement to monitor the way LLMs (mal)function in the public sector, we should be discussing the imposition of a ban on the adoption of LLMs (and other digital technologies) by the public sector unless it can be positively proven that their deployment will not affect individual rights and more diffuse public interests, and that any residual risks are adequately mitigated.

The current state of affairs is unacceptable in that the lack of regulation allows for a quickly accelerating accumulation of digital deployments that generate risks to social and individual rights and goods. The need to reverse this situation underlies my proposal to permission the adoption of digital technologies by the public sector. Unless we take a robust approach to slowing down and carefully considering the implications of public sector digitalisation, we may be undermining public governance in ways that will be very difficult or impossible to undo. It is not too late, but it may be soon.

Source: https://www.thetimes.co.uk/article/how-we-...

Free registration open for two events on procurement and artificial intelligence

April 21, 2023

Registration is now open for two free events on procurement and artificial intelligence (AI).

First, a webinar where I will be participating in discussions on the role of procurement in contributing to the public sector’s acquisition of trustworthy AI, and the associated challenges, from an EU and US perspective.

Second, a public lecture where I will present the findings of my research project on digital technologies and public procurement.

Please scroll down for details and links to registration pages. All welcome!

1. ‘Can Procurement Be Used to Effectively Regulate AI?’ | Free online webinar
30 May 2023 2pm BST / 3pm CET-SAST / 9am EST (90 mins)
Co-organised by University of Bristol Law School and George Washington University Law School.

Artificial Intelligence (“AI”) regulation and governance is a global challenge that is starting to generate different responses in the EU, US, and other jurisdictions. Such responses are, however, rather tentative and politically contested. A full regulatory system will take time to crystallise and be fully operational. In the meantime, despite this regulatory gap, the public sector is quickly adopting AI solutions for a wide range of activities and public services.

This process of accelerated AI adoption by the public sector places procurement as the (involuntary) gatekeeper, tasked with ‘AI regulation by contract’, at least for now. The procurement function is expected to design tender procedures and contracts capable of attaining goals of AI regulation (such as trustworthiness, explainability, or compliance with data protection and human and fundamental rights) that are so far eluding more general regulation.

This webinar will provide an opportunity to take a hard look at the likely effectiveness of AI regulation by contract through procurement and its implications for the commercialisation of public governance, focusing on key issues such as:

The interaction between tender design, technical standards, and negotiations.
The challenges of designing, monitoring, and enforcing contractual clauses capable of delivering effective ‘regulation by contract’ in the AI space.
The tension between the commercial value of tailored contractual design and the regulatory value of default clauses and standard terms.
The role of procurement disputes and litigation in shaping AI regulation by contract.
The alternative regulatory option of establishing mandatory prior approval by an independent regulator of projects involving AI adoption by the public sector.

This webinar will be of interest to those working on or researching the digitalisation of the public sector and AI regulation in general, as the discussion around procurement gatekeeping mirrors the main issues arising from broader trends.

I will have the great opportunity of discussing my research with Aris Georgopoulos (Nottingham), Scott Simpson (Digital Transformation Lead at U.S. Department of Homeland Security), and Liz Chirico (Acquisition Innovation Lead at Office of the Deputy Assistant Secretary of the Army). Jessica Tillipman (GW Law) will moderate the discussion and Q&A.

Registration: https://law-gwu-edu.zoom.us/webinar/register/WN_w_V9s_liSiKrLX9N-krrWQ.

2. ‘AI in the public sector: can procurement promote trustworthy AI and avoid commercial capture?’ | Free in-person public lecture
4 July 2023 2pm BST, Reception Room, Wills Memorial Building, University of Bristol
Organised by University of Bristol Law School, Centre for Global Law and Innovation

The public sector is quickly adopting artificial intelligence (AI) to manage its interactions with citizens and in the provision of public services – for example, using chatbots in official websites, automated processes and call-centres, or predictive algorithms.

There are inherent high-stakes risks to this process of public governance digitalisation, such as bias and discrimination, unethical deployment, data and privacy risks, cyber security risks, or risks of technological debt and dependency on proprietary solutions developed by (big) tech companies.

However, as part of the UK Government’s ‘light touch’ ‘pro-innovation’ approach to digital technology regulation, the adoption of AI in the public sector remains largely unregulated.

In this public lecture, I will present the findings of my research funded by the British Academy, analysing how, in this deregulatory context, the existing rules on public procurement fall short of protecting the public interest.

An alternative approach is required to create mechanisms of external independent oversight and mandatory standards to embed trustworthy AI requirements and to mitigate against commercial capture in the acquisition of AI solutions.

Registration: https://www.eventbrite.co.uk/e/can-procurement-promote-trustworthy-ai-and-avoid-commercial-capture-tickets-601212712407.