Did you use AI to write this tender? What? Just asking! -- Also, how will you use AI to deliver this contract?

The UK’s Cabinet Office has published procurement policy note 2/24 on ‘Improving Transparency of AI use in Procurement’ (the ‘AI PPN’) because ‘AI systems, tools and products are part of a rapidly growing and evolving market, and as such, there may be increased risks associated with their adoption … [and therefore] it is essential to take steps to identify and manage associated risks and opportunities, as part of the Government’s commercial activities’.

The crucial risk the AI PPN seems to be concerned with relates to generative AI ‘hallucinations’, as it includes background information highlighting that:

‘Content created with the support of Large Language Models (LLMs) may include inaccurate or misleading statements; where statements, facts or references appear plausible, but are in fact false. LLMs are trained to predict a “statistically plausible” string of text, however statistical plausibility does not necessarily mean that the statements are factually accurate. As LLMs do not have a contextual understanding of the question they are being asked, or the answer they are proposing, they are unable to identify or correct any errors they make in their response. Care must be taken both in the use of LLMs, and in assessing returns that have used LLMs, in the form of additional due diligence.’
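Purely by way of illustrating that background point, the toy sketch below shows next-token sampling in deliberately simplified form. Everything in it (the prompt, the candidate continuations and their probabilities) is invented for the example and does not come from the PPN or from any real model; it simply shows why a continuation that is statistically plausible carries no guarantee of being factually accurate.

    import random

    # Toy next-token distribution for an invented prompt. The candidate
    # continuations and their probabilities are made up for illustration; a
    # real LLM works over tens of thousands of tokens, but the principle is
    # the same: it ranks continuations by plausibility, not by truth.
    next_token_probs = {
        "2021": 0.40,    # plausible, and possibly wrong
        "2022": 0.35,    # equally plausible, equally unverified
        "2019": 0.20,
        "banana": 0.05,  # implausible continuations get low probability
    }

    def sample_next_token(probs):
        """Pick a continuation weighted by statistical plausibility only."""
        tokens = list(probs.keys())
        weights = list(probs.values())
        return random.choices(tokens, weights=weights, k=1)[0]

    prompt = "The framework agreement was awarded in"
    print(prompt, sample_next_token(next_token_probs))
    # The model has no mechanism for checking which year (if any) is correct:
    # it only 'knows' which continuation looks right given its training data.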

The PPN has the main advantage of trying to tackle the challenge of generative AI in procurement head on. It can help raise awareness in case someone was not yet talking about this and, more seriously, it includes an Annex A that brings together the several different bits of guidance issued by the UK government to date. However, the AI PPN does not elaborate on any of that guidance and is thus as limited as the Guidelines for AI procurement (see here), relatively complicated in that it points to rather different types of guidance ranging from ethics, to legal, to practical considerations, and requires significant knowledge and expertise to be operationalised (see here). Perhaps the best evidence of the complexity of the mushrooming sets of guidance is that the PPN itself includes in Annex A a reference to the January 2024 Guidance to civil servants on use of generative AI, which has been superseded by the Generative AI Framework for HMG, to which it also refers in Annex A. In other words, the AI PPN is not a ‘plug-and-play’ document setting out how to go about dealing with AI hallucinations and other risks in procurement. And given the pace of change in this area, it is also bound to be a PPN that requires multiple revisions and adaptations going forward.

A screenshot showing that the January guidance on generative AI use has been superseded (taken on 26 March 2024 10:20am).

More generally, the AI PPN is bound to be controversial and has already spurred insightful discussion on LinkedIn. I would recommend the posts by Kieran McGaughey and Ian Makgill. I offer some additional thoughts here and look forward to continuing the conversation.

In my view, one of the potential issues arising from the AI PPN is that it aims to cover quite a few different aspects of AI in procurement, as well as neglecting others. Slightly simplifying, there are three broad areas of AI-procurement interaction. First, there is the issue of buying AI-based solutions or services. Second, there is the issue of tenderers using (generative) AI to write or design their tenders. Third, there is the issue of the use of AI by contracting authorities, eg in relation to qualitative selection/exclusion, or evaluation/award decisions. The AI PPN covers aspects of the first two and, at best, only indirectly touches on the third. However, it is not clear to me that these can be treated together, as they pose significantly different policy issues. I will try to disentangle them here.

Buying and using AI

Although it mainly cross-refers to the Guidelines for AI procurement, the AI PPN includes some content relevant to the procurement and use of AI when it stresses that ‘Commercial teams should take note of existing guidance when purchasing AI services, however they should also be aware that AI and Machine Learning is becoming increasingly prevalent in the delivery of “non-AI” services. Where AI is likely to be used in the delivery of a service, commercial teams may wish to require suppliers to declare this, and provide further details. This will enable commercial teams to consider any additional due diligence or contractual amendments to manage the impact of AI as part of the service delivery.’ This is an adequate and potentially helpful warning. However, as discussed below, the PPN suggests a way to go about it that is in my view wrong and potentially very problematic.

AI-generated tenders

The AI PPN is however mostly concerned with the use of AI for tender generation. It recognises that there ‘are potential benefits to suppliers using AI to develop their bids, enabling them to bid for a greater number of public contracts. It is important to note that suppliers’ use of AI is not prohibited during the commercial process but steps should be taken to understand the risks associated with the use of AI tools in this context, as would be the case if a bid writer has been used by the bidder.’ It indicates some potential steps contracting authorities can take, such as:

  • ‘Asking suppliers to disclose their use of AI in the creation of their tender.’

  • ‘Undertaking appropriate and proportionate due diligence:

    • If suppliers use AI tools to create tender responses, additional due diligence may be required to ensure suppliers have the appropriate capacity and capability to fulfil the requirements of the contract. Such due diligence should be proportionate to any additional specific risk posed by the use of AI, and could include site visits, clarification questions or supplier presentations.

    • Additional due diligence should help to establish the accuracy, robustness and credibility of suppliers’ tenders through the use of clarifications or requesting additional supporting documentation in the same way contracting authorities would approach any uncertainty or ambiguity in tenders.’

  • ‘Potentially allowing more time in the procurement to allow for due diligence and an increase in volumes of responses.’

  • ‘Closer alignment with internal customers and delivery teams to bring greater expertise on the implications and benefits of AI, relative to the subject matter of the contract.’

In my view, there are a few problematic aspects here. While the AI PPN seems to try not to single out the use of generative AI as potentially problematic by equating it to the possible use of (human) bid writers, this is unconvincing. First, because there is (to my knowledge) no guidance whatsoever on assessing whether bid writers have been used, and because the AI PPN itself does not require disclosure of the engagement of bid writers (or give any thought to the fact that third-party bid writers may have used AI without this being known to the hiring tenderer, which would then require an extension of the disclosure of AI use further down the tender generation chain). Second, because the approach taken in the AI PPN seems to point at potential problems with the use of (external, third-party) bid writers, whereas it does not seem to object to the use of (in-house) bid writers, potentially by much larger economic operators, which is presumptively treated as unproblematic. Third, and most importantly, because it shows that perhaps not enough has been done so far to tackle potential deceit or the provision of misleading information in tenders, if contracting authorities must now start thinking about how to obtain expert-based analysis of tenders, or develop fact-checking mechanisms to ensure bids are truthful. You would have thought that, regardless of the origin of a tender, contracting authorities should already be able to check its content to an adequate level of due diligence.

In any case, the biggest issue with the AI PPN is how it suggests contracting authorities should deal with this issue, as discussed below.

AI-based assessments

The AI PPN also suggests that contracting authorities should be ‘Planning for a general increase in activity as suppliers may use AI to streamline or automate their processes and improve their bid writing capability and capacity leading to an increase in clarification questions and tender responses.’ One of the possibilities could be for contracting authorities to ‘fight fire with fire’ and also deploy generative AI (eg to make summaries, to scan for errors, etc). Interestingly, though, the AI PPN does not directly refer to the potential use of (generative) AI by contracting authorities.
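To make that ‘fight fire with fire’ possibility concrete, the sketch below shows what authority-side use of an off-the-shelf generative AI service to summarise a tender and flag claims for human verification might look like. It is a hypothetical illustration only, assuming an OpenAI-style API; the model name, prompt and helper function are invented for the example, and nothing in the AI PPN or the Generative AI Framework endorses (or, as discussed below, necessarily permits) this kind of workflow.

    from openai import OpenAI

    client = OpenAI()  # assumes an API key is configured in the environment

    def summarise_tender(tender_text: str) -> str:
        """Hypothetical helper: ask a general-purpose LLM to summarise a tender
        response and list factual claims that a human evaluator should verify.
        Illustrative only; not an endorsed or compliance-assessed workflow."""
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[
                {
                    "role": "system",
                    "content": (
                        "Summarise the following tender response in plain English "
                        "and list every factual claim that should be independently "
                        "verified by the evaluation team."
                    ),
                },
                {"role": "user", "content": tender_text},
            ],
        )
        return response.choices[0].message.content

Even as a sketch, this makes the tension visible: the summary itself would be generated by a model subject to the very accuracy and explainability limitations the government’s own guidance warns about, which is why the PPN’s silence on authority-side use is notable.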

While it includes a reference in Annex A to the Generative AI framework for HM Government, that document does not specifically address the use of generative AI to manage procurement processes (and what it says about buying generative AI is redundant given the other guidance in the Annex). In my view, the generative AI framework pushes strongly against the use of AI in procurement when it identifies a series of use cases to avoid (page 18) that include contexts where high-accuracy and high-explainability are required. If this is the government’s (justified) view, then the AI PPN has been a missed opportunity to say this more clearly and directly.

The broader issue of confidential, classified or proprietary information

Both in relation to the procurement and use of AI, and to the use of AI for tender generation, the AI PPN stresses the potential need for:

  • ‘Putting in place proportionate controls to ensure bidders do not use confidential contracting authority information, or information not already in the public domain as training data for AI systems e.g. using confidential Government tender documents to train AI or Large Language Models to create future tender responses.’; and notes that

  • ‘In certain procurements where there are national security concerns in relation to use of AI by suppliers, there may be additional considerations and risk mitigations that are required. In such instances, commercial teams should engage with their Information Assurance and Security colleagues, before launching the procurement, to ensure proportionate risk mitigations are implemented.’

These are issues that can easily exceed the technical capabilities of most contracting authorities. It is very hard to know what data has been used to train a model, and economic operators using ‘off-the-shelf’ generative AI solutions will hardly be in a position to assess this themselves, or to provide any meaningful information to contracting authorities. While there can be contractual constraints on the use of information and data generated under a given contract, it is much more challenging to assess whether information and data have been inappropriately used at a different link of increasingly complex digital supply chains. And, in any case, this is not only an issue for future contracts. Data and information generated under contracts already in place may not be subject to adequate data governance frameworks. It would seem that a more muscular approach to auditing data governance issues may be required, and that this should not be devolved to the procurement function.

How to deal with it? — or where the PPN goes wrong

The biggest weakness in the AI PPN is in how it suggests contracting authorities should deal with the issue of generative AI. In my view, it gets it wrong in two different ways. First, by asking for too much information that is not to be scored and that contracting authorities are unlikely to be able to act on without breaching procurement and good administration principles. Second, by taking too soft an approach to the use of AI in contract delivery and presenting as ‘for information only’ details that contracting authorities are under a duty to assess and score.

Too much information

The AI PPN includes two potential (alternative) disclosure questions in relation to the use of generative AI in tender writing (see below Q1 and Q2).

I think these questions miss the mark and expose contracting authorities to risks of challenge on grounds of a potential breach of the principle of equal treatment and the duty of good administration. The potential breach of the duty of good administration could be on grounds that the contracting authority is taking irrelevant information into account in the assessment of the relevant tender. The potential breach of equal treatment could come if tenders with some AI-generated elements were subjected to significantly more scrutiny than tenders where no AI was used. Contracting authorities should subject all tenders to the same level of due diligence and scrutiny because, at the bottom of it, there is no reason to ‘take a tenderer at its word’ when no AI is used. That is the entire logic of the exclusion, qualitative selection and evaluation processes.

Crucially, though, what the questions seem to really seek to ascertain is that the tenderer has checked for and confirms the accuracy of the content of the tender and thus makes the content its own and takes responsibility for it. This could be checked generally by asking all tenderers to confirm that the content of their tenders is correct and a true reflection of their capabilities and intended contractual delivery, reminding them that contracting authorities have tools to sanction economic operators that have ‘negligently provided misleading information that may have a material influence on decisions concerning exclusion, selection or award’ (reg.57(8)(i)(ii) PCR2015 and sch.7 13(2)(b) PA2023). And then enforcing them!

Checking the ‘authenticity’ of tenders when contracting authorities are in fact meant to check their truthfulness, accuracy and deliverability would be a false substitution of the relevant duties. It would also potentially undermine the incentives to disclose the use of AI in tender generation (unless contracting authorities find a reliable way of identifying it themselves and start applying the exclusion grounds above)—as thoroughly discussed in the LinkedIn posts referred to above.

Too little information

Conversely, the PPN takes too soft and potentially confusing an approach to the use of AI to deliver the contract. The proposed disclosure question (Q3) is very problematic. It presents as ‘for information only’ a request for information on the use of AI or machine learning in the context of the actual delivery of the contract. This is information that will relate to the technical specifications, award criteria or performance clauses (or all of them), and there is no meaningful way in which AI could be used to deliver the contract without this having an impact on the assessment and evaluation of the tender. The question is potentially misleading not only because of the indication that the information would not be scored, but also because it suggests that the use of AI in the delivery of a service or product is within the discretion of the tenderers. In my view, this would only be possible if the technical specifications were rather loosely written in performance terms, which would then require a very thorough description and assessment of how that performance is to be achieved. Moreover, the use of AI would probably require a set of organisational arrangements that should also not go unnoticed or unchecked in the procurement process. Further, one of the main challenges may not be in the use of AI in new contracts (where tenderers are likely to highlight it to stress the advantages, or to justify that their tenders are not abnormally low in comparison with delivery through ‘manual’ solutions), but in relation to pre-existing contracts. A broader policy, recommendation and audit of the use of generative AI for the delivery of existing contracts, and of its treatment as a (permissible?) contract modification, would also seem to be needed.

Final thought

The AI PPN is an interesting development and will help crystallise many discussions that were somehow hovering in the background. However, a significant rethink is needed and, in my view, much more detailed guidance is required in relation to the different dimensions of the interaction between AI and procurement. Important questions remain unaddressed, and one of the most pressing concerns the balance between general regulation and the use of procurement to regulate AI use. While the UK government remains committed to its ‘pro-innovation’ approach and no general regulation of AI use is put in place, in particular in relation to public sector AI use, procurement will continue to struggle and fail to act as a regulator of the technology.

What's the rush -- some thoughts on the UK's Foundation Model Taskforce and regulation by Twitter

I have been closely following developments on AI regulation in the UK, as part of the background research for the joint submission to the public consultation closing on Wednesday (see here and here). Perhaps surprisingly, the biggest developments do not concern the regulation of AI under the devolved model described in the ‘pro-innovation’ white paper, but its displacement outside existing regulatory regimes—both in terms of funding, and practical power.

Most of the activity and investment is not channelled towards existing resource-strained regulators to support them in their task of issuing guidance on how to deal with AI risks and harms—which stems from the white paper—but towards digital industrial policy and R&D projects, including a new major research centre on responsible and trustworthy AI and a Foundation Model Taskforce. A first observation is that this type of investment can be worthwhile, but not at the expense of adequately resourcing regulators facing the tall order of AI regulation.

The UK’s Prime Minister is clearly making a move to use ‘world-leadership in AI safety’ as a major plank of his re-election bid in the coming Fall. I am not only sceptical about this move and its international reception, but also increasingly concerned about a tendency to ‘regulate by Twitter’ and about bullish approaches to regulatory and legal compliance that could well result in squandering a good part of the £100m set aside for the Taskforce.

In this blog, I offer some preliminary thoughts. Comments welcome!

Twitter announcements vs white paper?

During the preparation of our response to the AI public consultation, we had a moment of confusion. The Government published the white paper and an impact assessment supporting it, which primarily amount to doing nothing and maintaining the status quo (aka the AI regulatory gap) in the UK. However, there were increasing reports of the Prime Minister’s change of heart after the emergence of a ‘doomer’ narrative peddled by OpenAI’s CEO and others. At some point, the PM sent out a tweet that made us wonder if the Government was changing policy and abandoning the approach of the white paper even before the end of the public consultation. This was the tweet.

We could not locate any document describing the ‘Safe strategy of AI’, so the only conclusion we could reach was that the ‘strategy’ was the short Twitter thread that followed that first tweet.

It was not only surprising that there was no detail, but also that there was no reference to the white paper or to any other official policy document. We were probably not the only ones confused about it (or so we hope!), as it is in general very confusing to have social media messaging pointing towards regulatory interventions completely outside the existing frameworks—including live public consultations by the government!

It is also confusing to see multiple documents refer to different things, with later documents somehow reframing what earlier documents meant.

For example, the announcement of the Foundation Model Taskforce came only a few weeks after the publication of the white paper, but there was no mention of it in the white paper itself. Is it possible that the Government had put together a significant funding package and related policy in under a month? The question is not so much whether that is possible, but why do things in this way. And how mature was the thinking behind the Taskforce?

For example, the initial announcement indicated that

The investment will build the UK’s ‘sovereign’ national capabilities so our public services can benefit from the transformational impact of this type of AI. The Taskforce will focus on opportunities to establish the UK as a world leader in foundation models and their applications across the economy, and acting as a global standard bearer for AI safety.

The funding will be invested by the Foundation Model Taskforce in foundation model infrastructure and public service procurement, to create opportunities for domestic innovation. The first pilots targeting public services are expected to launch in the next six months.

Less than two months later, the announcement of the appointment of the Taskforce chair (below) indicated that

… a key focus for the Taskforce in the coming months will be taking forward cutting-edge safety research in the run up to the first global summit on AI safety to be hosted in the UK later this year.

Bringing together expertise from government, industry and academia, the Taskforce will look at the risks surrounding AI. It will carry out research on AI safety and inform broader work on the development of international guardrails, such as shared safety and security standards and infrastructure, that could be put in place to address the risks.

Is it then a Taskforce and pot of money seeking to develop sovereign capabilities and to pilot public sector AI use, or a Taskforce seeking to develop R&D in AI safety? Can it be both? Is there money for both? Also, why steer the £100m Taskforce in this direction and simultaneously spend £31m on funding an academic-led research centre on ethical and trustworthy AI? Does the latter not encompass issues of AI safety? How will all of these investments and initiatives be coordinated to avoid duplication of effort, or the replication of regulatory gaps in the disparate consideration of regulatory issues?

Funding and collaboration opportunities announced via Twitter?

Things can get even more confusing or worrying (for me). Yesterday, the Government put out an official announcement, accompanied by heavy Twitter-based PR, of the appointment of the Chair of the Foundation Model Taskforce. This announcement raises a few questions. Why on a Sunday? What was the rush? Also, what was the process used to select the Chair, if there was one? I have no questions on the profile and suitability of the appointed Chair (I have also not looked at them in detail), but I wonder: even if it is legally compliant to proceed without a formal process and an open call for expressions of interest, is this appropriate? Is the Government stretching the parallelism with the Vaccines Taskforce too far?

Relatedly, there has been no (or I have been unable to locate) official call for expressions of interest from those seeking to get involved with the Taskforce. However, once more, Twitter seems to have been the (pragmatic?) medium used by the newly appointed Chair of the Taskforce. On Sunday itself, this Twitter thread went out:

I find the last bit particularly shocking. A call for expressions of interest in participating in a project capable of spending up to £100m, via Google Forms! At the time of writing, the form is here and its content is as follows:

I find this approach to AI regulation rather concerning and can also see quite a few ways in which the emerging approach to the work could lead to breaches of procurement law, subsidy control, or recruitment rules (depending on whether expressions of interest are corporate or individual). I also wonder what sort of record-keeping there will be of all this, so that there is adequate accountability for this expenditure. What is the rush?

Or rather, I know that the rush is simply politically driven and that this is another way in which public funds are put at risk for the wrong reasons. But for the entirely arbitrary deadline of the ‘world AI safety summit’ the PM wants to host in the UK in the Fall — preferably ahead of any general election, I would think — it is almost impossible to justify the change of gear between the ‘do nothing’ AI white paper and the ‘rush everything’ approach driving the Taskforce. I hope we will not end up with another set of enquiries and reports, such as those stemming from the PPE procurement scandal or the ventilator challenge, but it is hard to see how this can all be done in a legally compliant manner, and with the serenity, clarity of view and long-term thinking required of regulatory design. Even in the field of AI. Unavoidably, more to follow.

ChatGPT in the Public Sector -- should it be banned?

In ‘ChatGPT in the Public Sector – overhyped or overlooked?’ (24 Apr 2023), the Analysis and Research Team (ART) of the General Secretariat of the Council of the European Union provides a useful and accessible explanation of how ChatGPT works, as well as interesting analysis of the risks and pitfalls of rushing to embed generative artificial intelligence (GenAI), and large language models (LLMs) in particular, in the functioning of the public administration.

The analysis stresses the risks stemming from ‘inaccurate, biased, or nonsensical’ GenAI outputs and, in particular, that ‘the key principles of public administration such as accountability, transparency, impartiality, or reliability need to be considered thoroughly in the [GenAI] integration process’.

The paper provides a helpful introduction to how LLMs work and their technical limitations. It then maps potential uses in the public administration, assesses the potential impact of their use on the European principles of public sector administration, and then suggests some measures to mitigate the relevant risks.

This analysis is helpful but, in my view, it is already captured by the presumption that LLMs are here to stay and that what regulators can do is just try to minimise their potential negative impacts—which implies accepting that there will remain unaddressed impacts. By referring to general principles of public administration, rather than eg the right to good administration under the EU Charter of Fundamental Rights, the analysis is also unnecessarily lenient.

I find this type of discourse dangerous and troubling because it facilitates the adoption of digital technologies that cannot meet current legal requirements and guarantees of individual rights. This is clear from the paper itself, although the implications of part of the analysis are not sufficiently explored, in my view.

The paper has a final section where it explicitly recognises that, while some risks might be mitigated by technological advancements, other risks are of a more structural nature and cannot be fully corrected despite best efforts. The paper then lists a very worrying panoply of such structural issues (at 16):

  • ‘This is the case for detecting and removing biases in training data and model outputs. Efforts to sanitize datasets can even worsen biases’.

  • ‘Related to biases is the risk of a perpetuation of the status quo. LLMs mirror the values, habits and attitudes that are present in their training data, which does not leave much space for changing or underrepresented societal views. Relying on LLMs that have been trained with previously produced documents in a public administration severely limits the scope for improvement and innovation and risks leaving the public sector even less flexible than it is already perceived to be’.

  • ‘The ‘black box’ issue, where AI models arrive at conclusions or decisions without revealing the process of how they were reached is also primarily structural’.

  • ‘Regulating new technologies will remain a cat-and-mouse game. Acceleration risk (the emergence of a race to deploy new AI as quickly as possible at the expense of safety standards) is also an area of concern’.

  • ‘Finally […] a major structural risk lies in overreliance, which may be bolstered by rapid technological advances. This could lead to a lack of critical thinking skills needed to adequately assess and oversee the model’s output, especially amongst a younger generation entering a workforce where such models are already being used’.

In my view, beyond the paper’s suggestion that the way forward is to maintain human involvement to monitor the way LLMs (mal)function in the public sector, we should be discussing the imposition of a ban on the adoption of LLMs (and other digital technologies) by the public sector unless it can be positively proven that their deployment will not affect individual rights and more diffuse public interests, and that any residual risks are adequately mitigated.

The current state of affairs is unacceptable in that the lack of regulation allows for a quickly accelerating accumulation of digital deployments that generate risks to social and individual rights and goods. The need to reverse this situation underlies my proposal to permission the adoption of digital technologies by the public sector. Unless we take a robust approach to slowing down and carefully considering the implications of public sector digitalisation, we may be undermining public governance in ways that will be very difficult or impossible to undo. It is not too late, but it may be soon.
