Three open source large language models you can use today
Enterprises are flocking to open source large language models, many of which have become highly popular - here are three you might want to try out


Ever since OpenAI fired the starting gun in the generative AI race in November 2022 with the release of ChatGPT, large language models (LLMs) have soared in popularity.
These systems can offer capabilities like text generation, code completion, translation, article summaries, and more, and businesses across the board are finding new use cases for generative AI.
But what if you want to leverage the power of LLMs without the hefty licensing fees or the restrictions of using a closed model?
Well, in that case, it might be worth considering an open source large language model - and there’s certainly no shortage of options now.
What is an open source large language model?
An open source LLM differs from a proprietary - or closed source - model in that it's publicly available for organizations to use, although licenses can still carry limitations, so it pays to check the terms.
There are a number of benefits to open source LLMs beyond the fact that they're publicly available to use right now. Cost savings, in particular, are frequently touted by enterprises as a key factor in the decision to adopt these models.
Similarly, flexibility is another advantage here. Rather than being tied to a single provider, enterprise users of open source models can take advantage of multiple providers.
Maintenance, updates, and development can also be handled by in-house teams, providing you have the internal capabilities to do so.
This article will delve into three of the most notable choices for open source LLMs on the market right now, spanning models launched by major providers such as Meta and Mistral, and a relative newcomer to the scene.
Llama 2 7B (Meta)
Meta’s first foray into publicly available AI models came in February 2023 with LLaMA (Large Language Model Meta AI), an LLM released in sizes of up to 65 billion parameters.
In July 2023, in partnership with Microsoft, Meta announced the second iteration of its flagship open source model, Llama 2, with three model sizes of 7, 13, and 70 billion parameters.
This included both the underlying foundation model as well as fine-tuned versions of each model tailored for conversational agent interfaces billed as Llama 2 Chat.
Each version of Llama 2 was trained between January and July 2023 using an offline dataset.
Llama 2 was billed as the first free ChatGPT competitor and is available via an application programming interface (API), allowing organizations to quickly integrate the model into their IT infrastructure.
Llama 2 is also available through platforms such as AWS and Hugging Face, meeting a wider range of use cases in the open source community.
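If you want to try the model through Hugging Face, the minimal sketch below shows one way to prompt the chat variant with the transformers library - it assumes you have been granted access to the gated meta-llama/Llama-2-7b-chat-hf repository and are authenticated with a Hugging Face token, and the prompt wording is just an example:

```python
# Minimal sketch: prompting Llama 2 7B Chat via the Hugging Face transformers
# library. Assumes Meta has approved access to the gated repo and that you
# are logged in (e.g. via `huggingface-cli login`).
from transformers import pipeline

chat = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-7b-chat-hf",
    device_map="auto",  # spread the model across available GPUs/CPU
)

# Llama 2 Chat expects instructions wrapped in [INST] ... [/INST] tags
prompt = "[INST] Give me three enterprise use cases for an open source LLM. [/INST]"
print(chat(prompt, max_new_tokens=150)[0]["generated_text"])
```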
Some experts queried the open source credentials of Meta’s model, however, as organizations with products or services that have over 700 million monthly active users need to approach Meta directly for permission to use it.
Open source data science and machine learning platform Hugging Face stated the Llama 2 7B Chat model outperformed open source chat models on most of the benchmarks they tested.
Mixtral 8x7B (Mistral AI)
French artificial intelligence startup Mistral AI has been a hot property in the AI space since it was founded in April 2023 by former Google DeepMind researcher Arthur Mensch.
Mistral AI closed a major Series A funding round in December 2023, securing $415 million and pushing the company’s value to nearly $2 billion just eight months after its inception.
The company released Mistral 7B, a 7-billion parameter large language model, in September 2023 via Hugging Face.
The model was released under the Apache 2.0 license, a permissive software license that places only minimal restrictions on how third parties can use the software.
The release of Mistral 7B was accompanied by some rather bold claims from Mistral AI, which said the model could outperform the 13-billion parameter version of Meta’s Llama 2 on all benchmarks.
Mistral 7B uses grouped-query attention (GQA) for faster inference, as well as sliding window attention (SWA) to handle longer sequences at lower cost. With GQA, several query heads share a single key/value head, shrinking the memory footprint of attention, while SWA restricts each token to attending over a fixed window of recent tokens.
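To make those two mechanisms concrete, here is an illustrative PyTorch sketch of grouped-query attention combined with a sliding window mask - the head counts, sequence length, and window size are invented for demonstration and don't reflect Mistral 7B's real configuration:

```python
import torch
import torch.nn.functional as F

# Toy dimensions for illustration only (not Mistral 7B's actual config)
batch, seq, n_q_heads, n_kv_heads, head_dim, window = 1, 16, 8, 2, 64, 4
group = n_q_heads // n_kv_heads  # 4 query heads share each key/value head

q = torch.randn(batch, n_q_heads, seq, head_dim)
k = torch.randn(batch, n_kv_heads, seq, head_dim)  # fewer KV heads = smaller KV cache
v = torch.randn(batch, n_kv_heads, seq, head_dim)

# GQA: broadcast each key/value head to its group of query heads
k = k.repeat_interleave(group, dim=1)
v = v.repeat_interleave(group, dim=1)

scores = q @ k.transpose(-2, -1) / head_dim**0.5

# SWA: each position may only attend to itself and the previous (window - 1) tokens
pos = torch.arange(seq)
blocked = (pos[None, :] > pos[:, None]) | (pos[:, None] - pos[None, :] >= window)
scores = scores.masked_fill(blocked, float("-inf"))

out = F.softmax(scores, dim=-1) @ v
print(out.shape)  # torch.Size([1, 8, 16, 64])
```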
In a head-to-head comparison with Meta’s 13-billion parameter Llama 2 model, platform-as-a-service (PaaS) company E2E Cloud said Mistral 7B’s comparable performance against a larger model indicates better memory efficiency and improved throughput from the French model.
Soon after the release of Mistral 7B, the French AI specialists announced their second model, Mixtral 8x7B.
Mixtral 8x7B uses a sparse mixture-of-experts (MoE) architecture composed of eight expert networks per layer, with a router selecting which experts process each token.
The advantage of this approach is that MoE models can be pre-trained much faster than dense counterparts built on standard feed-forward network (FFN) layers.
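As a rough sketch of how sparse routing works, the toy PyTorch layer below routes each token to its top-two experts out of eight - the sizes and gating scheme are simplified stand-ins, not Mixtral's actual implementation:

```python
import torch
import torch.nn as nn

class SparseMoE(nn.Module):
    """Toy sparse MoE layer: a router picks the top-2 of 8 expert MLPs per token."""
    def __init__(self, dim=32, hidden=64, n_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
            for _ in range(n_experts)
        ])
        self.router = nn.Linear(dim, n_experts)
        self.top_k = top_k

    def forward(self, x):  # x: (tokens, dim)
        gates = self.router(x).softmax(dim=-1)
        weights, picks = gates.topk(self.top_k, dim=-1)        # top-2 experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize gate weights
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                hit = picks[:, slot] == e  # tokens routed to expert e in this slot
                if hit.any():
                    out[hit] += weights[hit, slot, None] * expert(x[hit])
        return out

print(SparseMoE()(torch.randn(5, 32)).shape)  # torch.Size([5, 32])
```

Because only two experts run per token, the compute per token scales with a fraction of the total parameter count, even though every expert's weights still have to sit in memory.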
Mixtral 8x7B’s performance is a step up from its predecessor in terms of inference too, offering faster inference than a dense model with the same total number of parameters, since only two of the eight experts are active for any given token.
This architecture does suffer from a few drawbacks, however: all eight expert networks need to be loaded into memory, meaning Mixtral 8x7B requires a large amount of VRAM to operate.
Fine-tuning models with an MoE approach can also be difficult, as the architecture tends toward overfitting, meaning it can struggle to generalize beyond its training data when presented with new inputs.
Despite this, there is cause for optimism, as a fine-tuning method known as instruction tuning could address some of these concerns.
Smaug-72B (Abacus AI)
US startup Abacus.AI released its 72-billion parameter model, Smaug-72B, in February 2024, with its impressive benchmark performance prompting much excitement across the machine learning community.
A fine-tuned version of the Qwen-72B model, Smaug-72B was the first and only open source model to post an average score of more than 80 across the major LLM evaluations.
Smaug-72B outperformed a host of the most powerful proprietary models in Hugging Face’s tests for massive multitask language understanding (MMLU), mathematical reasoning, and common sense reasoning.
This included OpenAI’s GPT-3.5 and Mistral’s closed-source Medium model.
As it stands, Smaug-72B sits in the top spot of Hugging Face’s open LLM leaderboard.
Interestingly, Smaug-72B outperforms the model it was based on, Qwen-72B. According to Abacus, this is due to a new fine-tuning technique, DPO-Positive, which addresses a number of weaknesses seen in previous LLM fine-tuning approaches.
In a research paper published in February 2024, engineers at Abacus.AI outlined how they designed a new loss function and training procedure that avoids a common failure mode associated with the Direct Preference Optimization (DPO) training method.
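Based on the formulation described in that paper, the sketch below shows the general shape of the DPO-Positive loss - the extra penalty term discourages the policy from lowering the probability of the preferred completion relative to the reference model. The beta and lambda values here are arbitrary illustrative choices, not the paper's tuned hyperparameters:

```python
import torch
import torch.nn.functional as F

def dpop_loss(pol_chosen, pol_rejected, ref_chosen, ref_rejected,
              beta=0.1, lam=5.0):
    """Illustrative DPO-Positive loss.

    Each argument is the summed log-probability of the chosen (preferred)
    or rejected completion under the policy or the frozen reference model.
    """
    # Standard DPO margin: how much more the policy prefers the chosen
    # completion over the rejected one, relative to the reference model
    margin = (pol_chosen - ref_chosen) - (pol_rejected - ref_rejected)
    # DPOP penalty: nonzero only when the policy assigns the chosen
    # completion a lower log-probability than the reference model does
    penalty = torch.clamp(ref_chosen - pol_chosen, min=0.0)
    return -F.logsigmoid(beta * (margin - lam * penalty)).mean()

# Toy values: policy prefers the chosen completion and hasn't degraded it
print(dpop_loss(torch.tensor([-10.0]), torch.tensor([-12.0]),
                torch.tensor([-10.5]), torch.tensor([-11.0])))
```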

Solomon Klappholz is a former staff writer for ITPro and ChannelPro. He has experience writing about the technologies that facilitate industrial manufacturing, which led to him developing a particular interest in cybersecurity, IT regulation, industrial infrastructure applications, and machine learning.