The DeepSeek bombshell has been a wakeup call for US tech giants
All it took to flip tech on its head was one plucky Chinese AI startup
Every now and again, life deals you a curveball. In the case of US tech, it was DeepSeek, a Chinese AI startup that caused a meltdown the likes of which we’ve never seen before.
News of DeepSeek has ruled the airwaves over the last couple of days following the release of powerful new AI models that appear to represent a paradigm shift in the global AI space.
The firm’s new V3 and R1 AI models rival anything developed by US companies in recent years, despite having been trained at a fraction of the cost: around $5.5 million, according to reports.
Market turbulence in the wake of this announcement has been nothing short of remarkable. Broadcom shares finished down 17% on Monday, while Microsoft and Google also slid.
Notably, shares in Nvidia, which has been flying high on the wave of AI hype in recent years, plunged 17% on Monday, wiping $593 billion from the chipmaker’s market value – a dip that represents a record one-day loss for any company.
So is it curtains for Nvidia? Not at all. It’s still outperforming key competitors in the market and big tech will still swoon over its hardware. But it is food for thought given the background here.
Nvidia’s graphics processing units (GPUs) have been the backbone of the generative AI race so far, powering companies the world over to build increasingly large AI models.
But they come at a staggering cost, and DeepSeek’s success despite contending with older-gen hardware and US sanctions appears to have spooked investors and raised some serious questions.
DeepSeek is believed to have around 10,000 A100 chips at its disposal. These are older Nvidia GPUs that were purchased before US export controls were introduced in an effort to curtail Chinese efforts in the AI race.
The firm is also thought to have trained its V3 model on Nvidia H800 chips, which are designed to comply with said export controls.
Regardless, the scale of its infrastructure has raised eyebrows among US big tech bigwigs. The Colossus computing cluster, owned by xAI and located in Tennessee, boasts an array of 100,000 Nvidia H100 GPUs, for example. This makes DeepSeek's toolkit look tiny in comparison.
This further highlights the impressive results DeepSeek has delivered on what is a shoestring budget compared to the mind-boggling spending of US-based AI companies - and the government itself. The Trump administration recently announced a $500 billion war chest to ramp up AI infrastructure across the country.
DeepSeek has essentially been working with one arm tied behind its back, and it’s still delivered a killer model.
The macro-level discussions are only one aspect of the conundrum here for US firms, though. On an individual user level, this changes the game. DeepSeek’s R1 model, which is designed specifically to compete in areas such as math, logic problems, and coding capabilities, is also compact enough to run locally on a laptop.
It is now a leading challenger to OpenAI’s o1 “reasoning” model, and draws on the processing power of a conventional CPU rather than requiring access to GPUs housed in a data center.
It’s safe to say there may have been a few headaches at OpenAI headquarters on Monday. Chief executive Sam Altman described it as an “impressive model” in a social media post, perhaps through gritted teeth.
Throwing money at a problem
There are a number of key takeaways from the DeepSeek bombshell. Industry stakeholders told ITPro this week the story showcases the growing potential of open source AI, but more than anything it puts into context the utterly ludicrous spending on the part of US firms over the last two years.
Tens of billions of dollars have been poured into developing AI models by companies such as OpenAI, which is still grappling with how to truly maximize value from its growing array of models.
Capital expenditure among big tech companies has skyrocketed off the back of the generative AI race, with industry big hitters like Microsoft touting plans to spend $80 billion on AI infrastructure this year alone.
CEO Satya Nadella revealed the company plans to allocate a significant portion of this funding toward building data centers, training AI models, and deploying AI applications at scale for customers.
Elsewhere, Meta CEO Mark Zuckerberg recently announced plans to spend up to $65 billion on AI-related projects in the year ahead, including investment in new data center infrastructure and aggressive hiring for AI talent.
Set against the combined weight of past spending and planned investment, the fact that a relatively unknown startup has caused so much turbulence is a serious cause for concern. It is also, perhaps, a glimpse at the precarious foundations upon which this entire industry obsession has been built.
Heightened competition
But there’s perhaps a silver lining here. DeepSeek might have dented Silicon Valley’s prestige and ruffled a few feathers, but it has undoubtedly lit a fire under leading players in the space.
Praising the DeepSeek models, Sam Altman said it’s been “invigorating” to have a new competitor on the scene. Other industry execs seem to have a similar outlook on things.
Speaking at the World Economic Forum in Davos last week, Microsoft CEO Satya Nadella appeared to welcome the challenge of a dynamic newcomer in the industry.
“To see the DeepSeek model, it’s super impressive in terms of both how they have really effectively done an open source model that does this inference-time compute, and is supercompute efficient,” he said.
“We should take the developments out of China very, very seriously.”
A variety of impressive models have been released by Chinese companies in recent months, such as Tencent’s Hunyuan text-to-video model and Alibaba’s open source AI reasoning model, QwQ. These have largely flown under the radar while eyes were transfixed on US tech activities.
It’s understandable why the industry would predominantly focus on what’s going on in Silicon Valley. It’s been framed as the beating heart of the global tech industry, and rightly so.
This incident may prompt a period of reflection and reinvigorate an industry that’s been hellbent on one-upmanship in the race to spend more money and build bigger models over the last two years.
Ross Kelly is ITPro's News & Analysis Editor, responsible for leading the brand's news output and in-depth reporting on the latest stories from across the business technology landscape. Ross was previously a Staff Writer, during which time he developed a keen interest in cyber security, business leadership, and emerging technologies.
He graduated from Edinburgh Napier University in 2016 with a BA (Hons) in Journalism, and joined ITPro in 2022 after four years working in technology conference research.
For news pitches, you can contact Ross at ross.kelly@futurenet.com, or on Twitter and LinkedIn.