Cisco is jailbreaking AI models so you don’t have to worry about it
Cisco’s latest AI Defense platform aims to help businesses deploy generative AI with confidence
![LLM/AI jailbreaking concept image showing digitized human brain on mustard yellow background with pixelated parts of the brain flowing away.](https://cdn.mos.cms.futurecdn.net/2WZnzTLX55czeRqpkNi363-1200-80.jpg)
Cisco has launched AI Defense, a new security solution it says covers the full range of potential LLM security threats, helping businesses implement generative AI across their organizations with confidence.
As firms rush to deploy generative AI tools, be that through internally developed models, customized APIs, or external applications, they significantly increase their attack surface – and Cisco is targeting new demand for holistic LLM security.
Cisco’s new AI Defense platform uses a combination of its existing cloud-based security expertise alongside the latest research to protect organizations against the misuse of AI tools, data leakage, and new LLM jailbreaking techniques.
To tackle the dangers of shadow AI, AI Defense leverages Cisco’s Secure Access product to prevent employees from accessing unsanctioned AI tools or misusing authorized systems, and to protect them from accidentally feeding confidential company information into external models.
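For illustration only, the kind of policy check such an AI access gateway might apply to outbound prompts could look like the sketch below. The allow-list, patterns, and function names here are hypothetical assumptions, not Cisco Secure Access code.

```python
# Illustrative sketch of an outbound prompt policy check for an AI access
# gateway; hypothetical, not Cisco Secure Access code.
import re

SANCTIONED_HOSTS = {"api.openai.com", "internal-llm.example.com"}  # example allow-list
SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),           # US SSN-style identifiers
    re.compile(r"\b(?:\d[ -]?){13,16}\b"),          # payment-card-like digit runs
    re.compile(r"(?i)confidential|internal only"),  # document classification markers
]

def check_outbound_prompt(host: str, prompt: str) -> tuple[bool, str]:
    """Return (allowed, reason) for a prompt headed to an external AI service."""
    if host not in SANCTIONED_HOSTS:
        return False, f"blocked: {host} is not a sanctioned AI tool"
    for pattern in SENSITIVE_PATTERNS:
        if pattern.search(prompt):
            return False, "blocked: prompt appears to contain confidential data"
    return True, "allowed"
```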
But the challenge of securing internal AI systems from external threats is equally pressing for businesses looking to leverage generative AI.
New LLM jailbreaking techniques such as Deceptive Delight, Skeleton Key, Low-Resource Languages, and context window overload have all exposed relatively simple ways in which attackers can bypass the security guardrails of popular LLMs.
Speaking to ITPro, DJ Sampath, VP of product, AI software and platform at Cisco, said the problem is even worse once companies start to customize the models for their organization, meaning IT teams need to continually reassess the integrity of the model and overlay additional guardrails.
“Each time you want your models to behave a little bit better with the data that you have, you start doing some fine-tuning, you’re changing the state of the model. You don’t know with that fine-tuning if you are knocking off some of the internal guardrails or not,” he explained.
“The only way you know that is by doing model validation over and over again. That’s one of the core pieces of Cisco AI Defense, which ensures that even after a fine-tune the model is not changing and then, if it has changed, you have the ability to protect it using guardrails.”
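In practice, this kind of repeated model validation can be as simple as re-running a fixed battery of adversarial prompts after every fine-tune and comparing refusal rates. The sketch below illustrates the general idea only; the `generate` callables, refusal markers, and threshold are placeholder assumptions, not Cisco AI Defense internals.

```python
# Minimal sketch of post-fine-tune guardrail regression testing (illustrative,
# not Cisco's implementation). The generate callables and prompt set are placeholders.

REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "unable to help")

def refusal_rate(generate, adversarial_prompts):
    """Fraction of adversarial prompts the model refuses to answer."""
    refusals = 0
    for prompt in adversarial_prompts:
        reply = generate(prompt).lower()
        if any(marker in reply for marker in REFUSAL_MARKERS):
            refusals += 1
    return refusals / len(adversarial_prompts)

def validate_fine_tune(generate_base, generate_tuned, adversarial_prompts, max_drop=0.05):
    """Flag the fine-tuned model if its refusal rate drops noticeably."""
    before = refusal_rate(generate_base, adversarial_prompts)
    after = refusal_rate(generate_tuned, adversarial_prompts)
    return {"before": before, "after": after, "regressed": before - after > max_drop}
```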
Cisco looks to establish itself as a knowledge leader in LLM security
Sampath said Cisco is able to protect models from the latest jailbreaking methods by leveraging research from an acquisition it made last year.
Cisco acquired security startup Robust Intelligence in August 2024, and the startup soon afterwards published a paper on its pioneering approach to LLM jailbreaking, called Tree of Attacks with Pruning (TAP).
TAP is essentially an automated method for generating jailbreaks against LLMs without requiring access to their internal mechanisms, Sampath explained, noting that once it identifies a weakness it drills down to uncover every possible way in which an attacker could compromise the model using that vulnerability.
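As described in the Robust Intelligence paper, TAP works as a black-box search: an attacker LLM repeatedly branches candidate prompts, an evaluator prunes off-topic or weak branches, and surviving prompts are scored against the target until a jailbreak is found or the search budget runs out. The Python sketch below follows that published loop at a high level; the helper functions stand in for LLM calls and are assumptions, not the paper’s code.

```python
# Rough sketch of the TAP (Tree of Attacks with Pruning) loop; illustrative only.
# attacker_refine, is_on_topic, target_respond and score_jailbreak stand in for LLM calls.

def tap_attack(goal, seed_prompt, attacker_refine, is_on_topic,
               target_respond, score_jailbreak,
               branching=4, width=10, depth=10, success=10):
    frontier = [seed_prompt]
    for _ in range(depth):
        # Branch: the attacker LLM proposes refinements of each live prompt.
        candidates = [p for prompt in frontier
                      for p in attacker_refine(goal, prompt, n=branching)]
        # Prune 1: drop candidates that drifted off the attack goal,
        # so no black-box queries are wasted on them.
        candidates = [p for p in candidates if is_on_topic(goal, p)]
        # Query the target and score each response for a successful jailbreak.
        scored = [(score_jailbreak(goal, target_respond(p)), p) for p in candidates]
        for score, prompt in scored:
            if score >= success:
                return prompt          # working jailbreak found
        # Prune 2: keep only the most promising branches for the next round.
        scored.sort(reverse=True, key=lambda sp: sp[0])
        frontier = [p for _, p in scored[:width]]
    return None                        # no jailbreak within the search budget
```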
He added that the team published another paper on potential techniques that could be used to jailbreak a chain of thought or reasoning model, which he said requires a fundamentally different approach.
Sampath said the quality of this unit’s research is such that it is collaborating with bodies including NIST, MITRE, and OWASP to develop standards defining how the industry should approach AI security.
He argued this gives Cisco a privileged position to establish itself as a knowledge leader on LLM security, and help the model makers themselves shore up their products.
“We, Cisco, now have a seat at the table with these standards organizations. Not just that we’re influencing it but we’re able to publish some of this bleeding edge research and that research feeds into making these models better.”
Cisco AI Defense is currently in private beta and will be available in March 2025.
Solomon Klappholz is a Staff Writer at ITPro. He has experience writing about the technologies that facilitate industrial manufacturing, which has led him to develop a particular interest in IT regulation, industrial infrastructure applications, and machine learning.