What do you really need for your AI project?
From GPUs and NPUs to the right amount of server storage, there are many hardware considerations for training AI in-house

Generative AI and its many intriguing use cases have captured our imagination like nothing before it. Around the world, businesses are scrambling to adopt and deploy the technology, though much of the discussion around generative AI has focused on the software the user interacts with, rather than the hardware that’s needed to power such large computations.
Like many modern forms of artificial intelligence, generative AI uses large language models (LLMs), which, as the name suggests, are huge and contain massive amounts of data. Running such models in-house requires substantial computational resources, with particular emphasis on processing power and storage.
The dramatic advancement of processor technology over the last few years has largely been driven by the demands of AI. As such, these small pieces of silicon are the biggest factors in determining how fast you can train and run your machine learning models and how expensive it will be in the short and long term.
Similarly, high-performance servers and storage systems are needed for the data that will power your language models, and that data must be held locally or somewhere it can be accessed easily. Organizations will often opt for a third-party service to handle these requirements, but for some, in-house AI training is preferable.
Hardware considerations
While the software side of generative AI takes most of the plaudits, its success is uniquely tied to advancements in hardware. Upgrades to chipsets and increased storage capabilities have helped to improve speed, capacity, and connectivity, which has ultimately led to advances in both the training and inference of generative AI models.
Running generative AI effectively means executing machine learning algorithms and generating new images, audio, and text, which involves moving a considerable amount of data. Higher data transmission rates and lower power consumption, which allow more efficient processing at low latencies, are also key elements for generative AI.
Similarly, high-bandwidth memory (HBM) has helped to improve AI accelerators' computational speeds and energy efficiency through successive increases in memory bandwidth.
Storage capacity will be a major concern for businesses that want to train their own AI models in-house. As some organizations have found in recent years with standard cloud storage, using a third-party provider can prove expensive in the long run, especially when large amounts of data are involved. The same can be true of third-party AI training services, and largely for the same reason: a rapidly growing dataset that was already large to begin with will cost more and more to host. Another parallel is egress fees; once the data is in the third-party provider's infrastructure, it can prove unexpectedly difficult and pricey to get it back out again.
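To see how those costs compound, consider a back-of-the-envelope sketch. The figures below are purely illustrative assumptions, not real pricing, but they show how a growing dataset hosted with a third party accrues storage charges month after month, plus a one-off egress bill if you later move it out.

```python
# Illustrative sketch only: all figures below are assumptions, not real prices.
STARTING_TB = 200            # assumed initial training dataset size, in TB
MONTHLY_GROWTH = 0.05        # assumed 5% data growth per month
PRICE_PER_TB_MONTH = 20.0    # assumed storage cost in $/TB/month
EGRESS_PER_TB = 90.0         # assumed one-off transfer-out cost in $/TB

size_tb, storage_bill = STARTING_TB, 0.0
for month in range(36):                        # three years of hosting
    storage_bill += size_tb * PRICE_PER_TB_MONTH
    size_tb *= 1 + MONTHLY_GROWTH              # dataset keeps growing

print(f"Cumulative storage over 3 years: ${storage_bill:,.0f}")
print(f"One-off egress fee to leave: ${size_tb * EGRESS_PER_TB:,.0f}")
```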
For some organizations, therefore, it makes more financial sense in the long term to invest in building their own infrastructure to create LLMs and train generative AI applications in-house.
CPUs, GPUs, and NPUs
While storage is clearly a top consideration when deciding to build dedicated AI infrastructure, the importance of chips can't be overlooked.
There are a few different kinds of processors being used across industries for generative AI. These include central processing units (CPUs), accelerated processing units (APUs), and network processors (NPUs). All three perform different kinds of work that enable generative AI tasks, and they are often combined into specialized systems to deliver the computational speeds generative AI requires.
Within this heterogeneous architecture, the components complement one another to provide computational performance for an AI application. The CPU is typically used for general-purpose tasks, while an AI accelerator boosts tensor operations, which speeds up neural network training and inference. The network processor binds the system together by ferrying data between the other components.
However, the rapid expansion of generative AI has driven massive demand for high-performance graphics processing units (GPUs). The way GPUs parallelize matrix operations is what powers generative AI: the architecture allows LLMs to process vast amounts of data at once, speeding up training times. That computational power makes it possible to train complex language models with billions of parameters efficiently.
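To make that parallelism concrete, here is a minimal sketch, assuming a Python environment with PyTorch and a CUDA- or ROCm-capable GPU, that runs the same batched matrix multiply (the core operation in transformer training) on the CPU and on the GPU so the speed-up can be measured directly.

```python
# Minimal sketch: time the same batched matrix multiply on CPU and GPU.
# Assumes PyTorch with a CUDA or ROCm backend; adjust sizes to fit your memory.
import time
import torch

def timed_matmul(device: str) -> float:
    a = torch.randn(64, 1024, 1024, device=device)   # a batch of 64 matrices
    b = torch.randn(64, 1024, 1024, device=device)
    start = time.perf_counter()
    torch.bmm(a, b)                                   # batched matrix multiply
    if device != "cpu":
        torch.cuda.synchronize()                      # wait for GPU work to finish
    return time.perf_counter() - start

print(f"CPU: {timed_matmul('cpu'):.3f}s")
if torch.cuda.is_available():                         # also True on ROCm builds
    print(f"GPU: {timed_matmul('cuda'):.3f}s")
```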
The AMD Instinct MI300A accelerator is a great example of a modern high-powered APU. Widely seen as an industry-leading chip, this data center APU combines the Instinct accelerator line with AMD's EPYC processors, enabling the convergence of AI with high-performance computing.
The next generation of processors
A total of 13 chiplets are used inside the MI300A accelerator to create a chip that fuses 24 Zen 4 CPU cores with a CDNA 3 graphics engine and eight stacks of HBM3, totaling 128GB of unified memory. This kind of architecture is known as 3D stacking, which is seen as an essential route to fitting more transistors into the same space while continuing to drive Moore's Law forward.
With its Instinct MI300 family of accelerators, AMD has potentially captured an advantage, particularly in the generative AI market, with its new approach to accelerator architecture. The MI300X, for example, will appeal to businesses looking to train generative AI models in-house, as it offers considerably more memory than the MI300A: 192GB of high-bandwidth memory (HBM3).
The CDNA 3 architecture on the MI300X is designed to handle the varied data formats and sparse matrix operations that AI workloads rely on. That ability to process vast datasets is a key factor in advancing AI and machine learning.
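As a rough, vendor-neutral sketch (assuming PyTorch is installed), the snippet below shows why those data formats and sparse representations matter: lower precision and sparsity shrink the amount of data a model has to store and move.

```python
# Illustrative sketch (not AMD-specific): lower-precision formats and sparse
# matrices reduce the memory an AI workload has to store and move around.
import torch

dense = torch.randn(4096, 4096)                      # fp32 weights: ~64 MB
half = dense.to(torch.bfloat16)                      # bf16: half the footprint
print(dense.element_size() * dense.nelement() / 1e6, "MB in fp32")
print(half.element_size() * half.nelement() / 1e6, "MB in bf16")

# A sparse layout stores only the non-zero values plus their positions.
mask = torch.rand(4096, 4096) > 0.9                  # keep roughly 10% of values
sparse = (dense * mask).to_sparse_csr()
print(sparse.values().numel(), "non-zero values kept out of", dense.nelement())
```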
Alongside the MI300 Series hardware, businesses should also adopt AMD's ROCm software stack, which is open source and designed for GPU computation. It is the primary platform for developing, testing, and deploying GPU software on AMD hardware, and it is underpinned by AMD's Heterogeneous-compute Interface for Portability (HIP).
ROCm can be used across several domains, including general-purpose computing on GPUs (GPGPU), high-performance computing (HPC), and heterogeneous computing. It also supports multiple programming models, such as GPU-kernel-based programming and OpenCL, which can be used to write programs that run on heterogeneous platforms.
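As a minimal sketch, assuming a ROCm build of PyTorch running on an AMD Instinct GPU: ROCm's HIP backend is surfaced through PyTorch's familiar torch.cuda namespace, so existing CUDA-style code typically runs unchanged on AMD hardware.

```python
# Minimal sketch: checking for and using an AMD GPU from a ROCm build of PyTorch.
# On ROCm builds, the HIP backend is exposed through the torch.cuda namespace.
import torch

if torch.cuda.is_available():                  # True when the ROCm/HIP runtime is present
    print("HIP version:", torch.version.hip)   # set on ROCm builds, None on CUDA builds
    print("Device:", torch.cuda.get_device_name(0))
    x = torch.randn(8192, 8192, device="cuda") # allocated on the AMD GPU via HIP
    y = x @ x                                  # matrix multiply dispatched to the GPU
    print("Result lives on:", y.device)
else:
    print("No ROCm-visible GPU found; falling back to CPU.")
```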
While there are considerations beyond hardware when moving to in-house AI training, it's a good place to start. There is no time like the present to begin an AI project, and there is no APU like the AMD MI300A to start it with.