DeepSeek: Did a little-known Chinese startup cause a 'Sputnik moment' for AI?

By John Ruwitch

Published January 28, 2025 at 2:12 PM EST

A DeepSeek artificial intelligence logo on a mobile phone, arranged in Riga, Latvia, on Monday, Jan. 27, 2025. (Andrey Rudakov / Bloomberg via Getty Images)

Did AI just have a "Sputnik moment"?

That's what some investors are asking, after the little-known Chinese startup DeepSeek released a chatbot that experts say holds its own against industry leaders like OpenAI and Google, despite being made with less money and computing power.

Buzz around DeepSeek built into a wave of concern that hammered tech stocks on Monday, wiping almost $600 billion from chipmaker Nvidia's market value.

Not iterative or evolutionary, but pathbreaking

"This is, I think, something that has really shown to some degree how much the U.S. was living in a bubble," said Antonia Hmaidi, a senior analyst at the Mercator Institute for China Studies in Berlin.

"OpenAI and companies like OpenAI had really bet on scaling being sort of infinite, and needing to buy more and more and more chips for performance to improve."

What DeepSeek showed, she said, is that there are different paths.

The company says it used a little more than 2,000 Nvidia H800 GPUs to train the bot, and it did so in a matter of weeks for $5.6 million. Others have reportedly deployed 10,000 or more GPUs and spent upwards of $100 million to get similar results.

Marina Zhang, a scholar with the University of Technology Sydney, said DeepSeek has also demonstrated a new kind of innovation for China – not iterative or evolutionary, but pathbreaking.

"They're not really following existing models," she said. "It's basically based on algorithm optimization, using software to break through the constraints of not enough computational power."

Have the U.S. chip export controls failed?

Those constraints were imposed on China by the United States. In 2022, the Biden administration banned the export of cutting-edge microchips to China, arguing that they could be used to enhance the Chinese military.

Zhang said DeepSeek has shown that the chip blockade has not been successful so far. Beijing has been doubling down on a self-reliance drive in tech for several years, pouring money into chip development and other sectors, including AI.

Others argue it's too early to say the chip export controls have failed.

Gregory Allen, director of the Wadhwani AI Center at the Center for Strategic and International Studies in Washington, said DeepSeek could have acquired all its chips before the effect of the controls started to be felt.

A screenshot of a CCTV broadcast shows Liang Wenfeng, right, founder of DeepSeek, at a conference in Beijing on Jan. 20, 2025.

In a widely reported 2023 interview, DeepSeek founder Liang Wenfeng said the company had stockpiled some 10,000 Nvidia A100 GPUs – a variety that was put on the U.S. export control list. Experts think those may have been deployed in earlier versions of DeepSeek's model.

After the chip blockade started, Nvidia developed a workaround, creating the slightly less powerful H800 GPU, which was legal to sell to China for a time.

"We are currently living through the era of the lagging impact of the Biden administration's misfire in that first batch of AI export controls," said Allen.

DeepSeek had a window in which it was able to buy H800s – before the administration eventually banned the sale of them to China, too.

"DeepSeek has discovered some architectural innovations, some algorithmic innovations that sort of increase the number of IQ points, the amount of intelligence, that a given AI model can get from a given quantity of computational resources," he said.

But AI development requires computing power, and the number of advanced GPUs that DeepSeek, or any other Chinese company, can access is limited by the export controls, he said. That will eventually bite.

Allen says it means the U.S. has an edge: access to advanced chips without restrictions.

"We can copy China's advantages. They cannot copy our advantages. At least not any time soon," he said.

As for the hype around DeepSeek developing its near-cutting-edge model on the cheap, Allen said the true cost was undoubtedly far north of the reported $5.6 million. He likened it to the development of a drug.

"The cost of developing a new medication is not just the cost of the clinical trial that worked," he said. "It's the cost of all the clinical trials that didn't work. And it's the same with this AI model training run. DeepSeek has published how much it cost them for that final successful training run."

It's not known how much the company spent to get to that point, he said.

Hmaidi says DeepSeek is a "very legitimate triumph of Chinese engineering." But she says it's not yet the threat that many are making it out to be.

"I currently don't see how you get a significantly better model with their current pipeline – without more compute," she said.

"Personally, I don't think it's a threat to America's AI prowess at this point."