Anthropic just released Claude 2.1, and it vastly outperforms GPT-4 on token capacity

(Image credit: Getty Images)

published 22 November 2023

Contributions from
Rory Bathgate

Anthropic has announced an update to its Claude chatbot in a move that offers far greater performance and intuitive features for enterprise users.

Claude 2.1 delivers a raft of advancements and new capabilities, Anthropic said, including the ability to process up to 200,000 tokens at once for users in its Pro tier.

This, the startup said, is equivalent to over 500 pages of material at once, and offers the ability to analyze entire codebases, lengthy financial statements, or even “long literary works” such as The Iliad.

“By being able to talk to large bodies of content or data, Claude can summarize, perform Q&A, forecast trends, compare and contrast multiple documents, and much more,” the firm said in a blog post.

For context, this means Claude 2.1 offers double the capacity of its previous iteration, Claude 2.0. The update also gives the chatbot greater capacity than the top paid version of OpenAI’s GPT-4 model.

At present, GPT-4 only offers a 32,000 token ceiling for enterprise users.

Anthropic revealed the Claude 2.1 update has been in direct response to user feedback on model capabilities in recent months.

Customer canvassing showed users required “larger context windows” and more accurate outputs when working with long documents and large datasets.

Anthropic: Claude 2.1 has halved ‘hallucinations’

Claude 2.1 will also half the number of hallucinations as before, Anthropic revealed.

The company said this will enable enterprises to build “high-performing AI applications that solve concrete business problems” with a greater degree of trust and reliability than before.

(Image credit: Getty Images)

Anthropic could be the champion AWS and Google need AWS invests $4 billion in Anthropic to improve Bedrock experience Amazon Olympus could be the LLM to rival OpenAI and Google

“We tested Claude 2.1’s honesty by curating a large set of complex, factual questions that probe known weaknesses in current models,” the firm said.

“Using a rubric that distinguishes incorrect claims (“The fifth most populous city in Bolivia is Montero”) from admissions of uncertainty (“I’m not sure what the fifth most populous city in Bolivia is”), Claude 2.1 was significantly more likely to demur rather than provide incorrect information.

Anthropic was keen to emphasize that significant improvements have been made to the model’s comprehension and summarization abilities, particularly for “long, complex documents that demand a high degree of accuracy”.

The new model demonstrated a 30% reduction in incorrect answers, as well as a 3-4x lower rate of incorrectly concluding that documents support specific claims.

Claude 2.1 goes toe-to-toe with ChatGPT

Additional updates to Claude bring the chatbot more in line with what users of ChatGPT can expect. The new beta tool use feature allows users to integrate the chatbot with existing APIs and private knowledge bases, and retrieve information from web sources.

“By popular demand, we’ve also added tool use, a new beta feature that allows Claude to integrate with users' existing processes, products, and APIs,” the firm said.

“This expanded interoperability aims to make Claude more useful across our users’ day-to-day operations.

“Claude can now orchestrate across developer-defined functions or APIs, search over web sources, and retrieve information from private knowledge bases. Users can define a set of tools for Claude to use and specify a request.

RELATED RESOURCE

A whitepaper from MaxContact on how to improve operational efficiency and customer experience

Get tips that will help you invest in the right IT for your business.

DOWNLOAD NOW

“The model will then decide which tool is required to achieve the task and execute an action on their behalf.”

This includes the ability to translate natural language requests into structured API calls, Anthropic added, or use a calculator for complex numerical reasoning.

An updated developer console will also include a test window for developers to experiment with new prompts. This “playground-style experience” will allow devs to customize Claude’s behavior and responses to natural language prompts.

“We’re also introducing system prompts, which allow users to provide custom instructions to Claude in order to improve performance,” Anthropic said.

“System prompts set helpful context that enhances Claude’s ability to take on specified personalities and roles or structure responses in a more customizable, consistent way aligned with user needs.”

Anthropic emerging as a strong ChatGPT competitor

Widening the context windows for Claude 2.1 is in line with Anthropic’s goal of producing a more grounded chatbot experience, with fewer hallucinations than the competition.

Benchmarks for the new model remain to be seen, so direct comparison with GPT-4 in terms of mathematical or logical reasoning cannot be completed. However, this is another strong move to compete with the likes of ChatGPT.

The context window of 32k that those who pay for ChatGPT enjoy was already small in comparison to Claude’s 100k context window, but with 200k Claude 2.1 users will quickly establish the benefits of added prompt context.

Rory Bathgate

Rory Bathgate is Features & Multimedia Editor at ITPro, leading our in-depth content and case studies. He can also be found co-hosting the ITPro Podcast with Jane McCallion, swapping a keyboard for a microphone to discuss the latest learnings with thought leaders from across the tech sector.

Businesses with reams of text such as a codebase might benefit most directly from the change, but the API additions will also be a strong lure for business users.

As ChatGPT adds developer-centric features such as its GPT Store, competition is increasingly coming down to individual technical differences between the two firms. Claude 2.1 is a strong stake on the enterprise space from Anthropic, and IT leaders who have not yet put all their eggs in one basket could be wise to wait and see which of the two models comes out on top.

AWS and Google have championed Anthropic in recent months, with AWS forking out $4 billion for the firm in September and Google having agreed to a total of $1.5 billion further investment on top of its existing commitment of $500 million.

Both companies have AI products of their own, but Anthropic’s value lies in its constitutional approach to AI training.

The firm aims to improve the usability of its LLM by grounding it in extensive rules for engagement with internal and external individuals. The constitution that Claude uses at present was written by Anthropic staff.

Of the two investors, AWS certainly stands to gain a lot through the inclusion of Claude on its generative AI platform Amazon Bedrock. If constitutional AI proves an effective antidote to the problem of hallucination, AWS stands ready to supply the model to any number of firms to meet growing enterprise AI demand.

Ross Kelly is ITPro's News & Analysis Editor, responsible for leading the brand's news output and in-depth reporting on the latest stories from across the business technology landscape. Ross was previously a Staff Writer, during which time he developed a keen interest in cyber security, business leadership, and emerging technologies.

He graduated from Edinburgh Napier University in 2016 with a BA (Hons) in Journalism, and joined ITPro in 2022 after four years working in technology conference research.

For news pitches, you can contact Ross at ross.kelly@futurenet.com, or on Twitter and LinkedIn.

With contributions from