AI safety tests inadequate, says Ada Lovelace Institute
Evaluations may be useful, but a new report suggests they're not enough on their own and need to be used alongside other governance tools
The safety tests used for AI foundation models are falling short, according to a report from the independent research body the Ada Lovelace Institute.
To examine how the capabilities, risks, performance, behavior, and social impact of foundation models are evaluated, researchers at the Institute carried out a literature review and spoke to experts from foundation model providers, third-party evaluation companies, academic labs, and civil society organizations.
And, they found, while evaluations can serve a useful purpose, they're not enough on their own to determine the safety of products and services built using these AI models, and need to be used alongside other governance tools.
Existing methods such as 'red teaming' and benchmarking have technical and practical limitations, and could be manipulated or 'gamed' by developers, says the Institute, while evaluations of models in 'lab' settings can be useful but don't provide the full story.
"The field of AI evaluation is new and developing. Current methods can provide useful insights, but our research shows that there are significant challenges and limitations, including a lack of robust standards and practices," said Andrew Strait, associate director at the Ada Lovelace Institute.
"Governing AI is about more than just evaluating AI. Policymakers and regulators are right to see a role for evaluation methods, but should be wary about basing policy decisions on them alone."
The report explores how regulators and policymakers should think about these methods and sets out a series of recommendations. They should develop context-specific evaluations that respond to the needs of specific regulators, such as healthcare-specific evaluations for healthcare regulators.
They should invest in the science of evaluations to develop more robust methods, including a better understanding of how models operate, and support an ecosystem of third-party evaluation, for example through certification schemes.
And they should establish regulation that incentivizes companies to take evaluation seriously, including giving regulators the power to scrutinize models and their applications, along with the ability to block releases that appear unsafe.
"Evaluations are worth investing in, but they should be part of a broader AI governance toolkit," said Strait. "These emerging methods need to be used in tandem with other governance mechanisms, and crucially, supported by comprehensive legislation and robust enforcement."
The EU's AI Act requires developers of these models to evaluate them for 'systemic risks', while the US has both established new AI safety institutes and secured voluntary commitments from AI companies to allow evaluation of their models.
The UK, too, has created an AI safety institute, taking a 'pro-innovation' approach that many have criticized as offering inadequate controls.
However, the new Labour government has said it plans to bring in specific AI regulation and establish a Regulatory Innovation Office for AI, although its election manifesto spelled out no details and it remains to be seen how stringent any rules will be.
Emma Woollacott is a freelance journalist writing for publications including the BBC, Private Eye, Forbes, Raconteur and specialist technology titles.