Thoughts on the AI Safety Summit from a public sector procurement & use of AI perspective

The UK Government hosted an AI Safety Summit on 1-2 November 2023. A summary of the targeted discussions in a set of 8 roundtables has been published for Day 1, as well as a set of Chair’s statements for Day 2, including considerations around safety testing, the state of the science, and a general summary of discussions. There is also, of course, the (flagship?) Bletchley Declaration, and an introduction to the announced AI Safety Institute (UK AISI).

In this post, I collect some of my thoughts on these outputs of the AI Safety Summit from the perspective of public sector procurement and use of AI.

What was said at the AI safety Summit?

Although the summit was narrowly targeted to discussion of ‘frontier AI’ as particularly advanced AI systems, some of the discussions seem to have involved issues also applicable to less advanced (ie currently in existence) AI systems, and even to non-AI algorithms used by the public sector. As the general summary reflects, ‘There was also substantive discussion of the impact of AI upon wider societal issues, and suggestions that such risks may themselves pose an urgent threat to democracy, human rights, and equality. Participants expressed a range of views as to which risks should be prioritised, noting that addressing frontier risks is not mutually exclusive from addressing existing AI risks and harms.’ Crucially, ‘participants across both days noted a range of current AI risks and harmful impacts, and reiterated the need for them to be tackled with the same energy, cross-disciplinary expertise, and urgency as risks at the frontier.’ Hopefully, then, some of the rather far-fetched discussions of future existential risks can be conducive to taking action on current harms and risks arising from the procurement and use of less advanced systems.

There seemed to be some recognition of the need for more State intervention through regulation, for more regulatory control of standard-setting, and for more attention to be paid to testing and evaluation in the procurement context. For example, the summary of Day 1 discussions indicates that participants agreed that

  • ‘We should invest in basic research, including in governments’ own systems. Public procurement is an opportunity to put into practice how we will evaluate and use technology.’ (Roundtable 4)

  • ‘Company policies are just the baseline and don’t replace the need for governments to set standards and regulate. In particular, standardised benchmarks will be required from trusted external third parties such as the recently announced UK and US AI Safety Institutes.’ (Roundtable 5)

In Day 2, in the context of safety testing, participants agreed that

  • Governments have a responsibility for the overall framework for AI in their countries, including in relation to standard setting. Governments recognise their increasing role for seeing that external evaluations are undertaken for frontier AI models developed within their countries in accordance with their locally applicable legal frameworks, working in collaboration with other governments with aligned interests and relevant capabilities as appropriate, and taking into account, where possible, any established international standards.

  • Governments plan, depending on their circumstances, to invest in public sector capability for testing and other safety research, including advancing the science of evaluating frontier AI models, and to work in partnership with the private sector and other relevant sectors, and other governments as appropriate to this end.

  • Governments will plan to collaborate with one another and promote consistent approaches in this effort, and to share the outcomes of these evaluations, where sharing can be done safely, securely and appropriately, with other countries where the frontier AI model will be deployed.

This could be a basis on which to build an international consensus on the need for more robust and decisive regulation of AI development and testing, as well as a consensus of the sets of considerations and constraints that should be applicable to the procurement and use of AI by the public sector in a way that is compliant with individual (human) rights and social interests. The general summary reflects that ‘Participants welcomed the exchange of ideas and evidence on current and upcoming initiatives, including individual countries’ efforts to utilise AI in public service delivery and elsewhere to improve human wellbeing. They also affirmed the need for the benefits of AI to be made widely available’.

However, some statements seem at first sight contradictory or problematic. While the excerpt above stresses that ‘Governments have a responsibility for the overall framework for AI in their countries, including in relation to standard setting’ (emphasis added), the general summary also stresses that ‘The UK and others recognised the importance of a global digital standards ecosystem which is open, transparent, multi-stakeholder and consensus-based and many standards bodies were noted, including the International Standards Organisation (ISO), International Electrotechnical Commission (IEC), Institute of Electrical and Electronics Engineers (IEEE) and relevant study groups of the International Telecommunication Union (ITU).’ Quite how State responsibility for standard setting fits with industry-led standard setting by such organisations is not only difficult to fathom, but also one of the potentially most problematic issues due to the risk of regulatory tunnelling that delegation of standard setting without a verification or certification mechanism entails.

Moreover, there seemed to be insufficient agreement around crucial issues, which are summarised as ‘a set of more ambitious policies to be returned to in future sessions’, including:

‘1. Multiple participants suggested that existing voluntary commitments would need to be put on a legal or regulatory footing in due course. There was agreement about the need to set common international standards for safety, which should be scientifically measurable.

2. It was suggested that there might be certain circumstances in which governments should apply the principle that models must be proven to be safe before they are deployed, with a presumption that they are otherwise dangerous. This principle could be applied to the current generation of models, or applied when certain capability thresholds were met. This would create certain ‘gates’ that a model had to pass through before it could be deployed.

3. It was suggested that governments should have a role in testing models not just pre- and post-deployment, but earlier in the lifecycle of the model, including early in training runs. There was a discussion about the ability of governments and companies to develop new tools to forecast the capabilities of models before they are trained.

4. The approach to safety should also consider the propensity for accidents and mistakes; governments could set standards relating to how often the machine could be allowed to fail or surprise, measured in an observable and reproducible way.

5. There was a discussion about the need for safety testing not just in the development of models, but in their deployment, since some risks would be contextual. For example, any AI used in critical infrastructure, or equivalent use cases, should have an infallible off-switch.

8. Finally, the participants also discussed the question of equity, and the need to make sure that the broadest spectrum was able to benefit from AI and was shielded from its harms.’

All of these are crucial considerations in relation to the regulation of AI development, (procurement) and use. A lack of consensus around these issues already indicates that there was a generic agreement that some regulation is necessary, but much more limited agreement on what regulation is necessary. This is clearly reflected in what was actually agreed at the summit.

What was agreed at the AI Safety Summit?

Despite all the discussions, little was actually agreed at the AI Safety Summit. The Blethcley Declaration includes a lengthy (but rather uncontroversial?) description of the potential benefits and actual risks of (frontier) AI, some rather generic agreement that ‘something needs to be done’ (eg welcoming ‘the recognition that the protection of human rights, transparency and explainability, fairness, accountability, regulation, safety, appropriate human oversight, ethics, bias mitigation, privacy and data protection needs to be addressed’) and very limited and unspecific commitments.

Indeed, signatories only ‘committed’ to a joint agenda, comprising:

  • ‘identifying AI safety risks of shared concern, building a shared scientific and evidence-based understanding of these risks, and sustaining that understanding as capabilities continue to increase, in the context of a wider global approach to understanding the impact of AI in our societies.

  • building respective risk-based policies across our countries to ensure safety in light of such risks, collaborating as appropriate while recognising our approaches may differ based on national circumstances and applicable legal frameworks. This includes, alongside increased transparency by private actors developing frontier AI capabilities, appropriate evaluation metrics, tools for safety testing, and developing relevant public sector capability and scientific research’ (emphases added).

This does not amount to much that would not happen anyway and, given that one of the UK Government’s objectives for the Summit was to create mechanisms for global collaboration (‘a forward process for international collaboration on frontier AI safety, including how best to support national and international frameworks’), this agreement for each jurisdiction to do things as they see fit in accordance to their own circumstances and collaborate ‘as appropriate’ in view of those seems like a very poor ‘win’.

In reality, there seems to be little coming out of the Summit other than a plan to continue the conversations in 2024. Given what had been said in one of the roundtables (num 5) in relation to the need to put in place adequate safeguards: ‘this work is urgent, and must be put in place in months, not years’; it looks like the ‘to be continued’ approach won’t do or, at least, cannot be claimed to have made much of a difference.

What did the UK Government promise in the AI Summit?

A more specific development announced with the occasion of the Summit (and overshadowed by the earlier US announcement) is that the UK will create the AI Safety Institute (UK AISI), a ‘state-backed organisation focused on advanced AI safety for the public interest. Its mission is to minimise surprise to the UK and humanity from rapid and unexpected advances in AI. It will work towards this by developing the sociotechnical infrastructure needed to understand the risks of advanced AI and enable its governance.’

Crucially, ‘The Institute will focus on the most advanced current AI capabilities and any future developments, aiming to ensure that the UK and the world are not caught off guard by progress at the frontier of AI in a field that is highly uncertain. It will consider open-source systems as well as those deployed with various forms of access controls. Both AI safety and security are in scope’ (emphasis added). This seems to carry forward the extremely narrow focus on ‘frontier AI’ and catastrophic risks that augured a failure of the Summit. It is also in clear contrast with the much more sensible and repeated assertions/consensus in that other types of AI cause very significant risks and that there is ‘a range of current AI risks and harmful impacts, and reiterated the need for them to be tackled with the same energy, cross-disciplinary expertise, and urgency as risks at the frontier.’

Also crucially, UK AISI ‘is not a regulator and will not determine government regulation. It will collaborate with existing organisations within government, academia, civil society, and the private sector to avoid duplication, ensuring that activity is both informing and complementing the UK’s regulatory approach to AI as set out in the AI Regulation white paper’.

According to initial plans, UK AISI ‘will initially perform 3 core functions:

  • Develop and conduct evaluations on advanced AI systems, aiming to characterise safety-relevant capabilities, understand the safety and security of systems, and assess their societal impacts

  • Drive foundational AI safety research, including through launching a range of exploratory research projects and convening external researchers

  • Facilitate information exchange, including by establishing – on a voluntary basis and subject to existing privacy and data regulation – clear information-sharing channels between the Institute and other national and international actors, such as policymakers, international partners, private companies, academia, civil society, and the broader public’

It is also stated that ‘We see a key role for government in providing external evaluations independent of commercial pressures and supporting greater standardisation and promotion of best practice in evaluation more broadly.’ However, the extent to which UK AISI will be able to do that will hinge on issues that are not currently clear (or publicly disclosed), such as the membership of UK AISI or its institutional set up (as ‘state-backed organisation’ does not say much about this).

On that very point, it is somewhat problematic that the UK AISI ‘is an evolution of the UK’s Frontier AI Taskforce. The Frontier AI Taskforce was announced by the Prime Minister and Technology Secretary in April 2023’ (ahem, as ‘Foundation Model Taskforce’—so this is the second rebranding of the same initiative in half a year). As is problematic that UK AISI ‘will continue the Taskforce’s safety research and evaluations. The other core parts of the Taskforce’s mission will remain in [the Department for Science, Innovation and Technology] as policy functions: identifying new uses for AI in the public sector; and strengthening the UK’s capabilities in AI.’ I find the retention of analysis pertaining to public sector AI use within government problematic and a clear indication of the UK’s Government unwillingness to put meaningful mechanisms in place to monitor the process of public sector digitalisation. UK AISI very much sounds like a research institute with a focus on a very narrow set of AI systems and with a remit that will hardly translate into relevant policymaking in areas in dire need of regulation. Finally, it is also very problematic that funding is not locked: ‘The Institute will be backed with a continuation of the Taskforce’s 2024 to 2025 funding as an annual amount for the rest of this decade, subject to it demonstrating the continued requirement for that level of public funds.’ In reality, this means that the Institute’s continued existence will depend on the Government’s satisfaction with its work and the direction of travel of its activities and outputs. This is not at all conducive to independence, in my view.

So, all in all, there is very little new in the announcement of the creation of the UK AISI and, while there is a (theoretical) possibility for the Institute to make a positive contribution to regulating AI procurement and use (in the public sector), this seems extremely remote and potentially undermined by the Institute’s institutional set up. This is probably in stark contrast with the US approach the UK is trying to mimic (though more on the US approach in a future entry).