US’ AI Hardware Restrictions on China Have Backfired

“The risk of an asteroid hitting the Earth or a pandemic also exists. But the risk of China destroying our system is significantly larger in my opinion,” VC Vinod Khosla said.

Chinese AI research firm DeepSeek on Thursday unveiled DeepSeek-V3, arguably the strongest open-source model available. While Chinese models have been catching up with frontier models from the West over the past few months, DeepSeek paints a different picture this time. 

The company was able to train the model for just around $5.5 million, significantly less than many other models in this segment. 

Over the last few years, the United States has imposed a series of export controls restricting the sale of NVIDIA’s most advanced GPUs to China. 

Given DeepSeek-V3’s performance results and cost efficiency, these sanctions appear to have backfired, pushing Chinese engineers to build models with unprecedented efficiency from the limited resources available to them. 

Deeply Seeking Efficiency 

DeepSeek-V3 is a large, 671-billion-parameter model trained in 2.788 million NVIDIA H800 GPU hours. The model outperforms Meta’s 405-billion-parameter Llama 3.1 on most benchmarks, and even the closed-source Claude 3.5 Sonnet and GPT-4o in several tests. 

This cost DeepSeek a total of $5.576 million, which includes pre-training, context extension, and post-training. 

Earlier this year, research institute Epoch AI released a technical paper revealing the staggering costs of training frontier models. “We find that the most expensive publicly-announced training runs to date are OpenAI’s GPT-4 at $40 million and Google’s Gemini Ultra at $30 million,” read the report. 

DeepSeek is also an incredibly cost-effective model for API usage. 

It is currently priced at $0.14 per million input tokens and $0.28 per million output tokens until February 8, 2025. After that, it will cost $0.27 per million input tokens and $1.10 per million output tokens. For comparison, OpenAI’s GPT-4o costs $2.50 per million input tokens and $10.00 per million output tokens. 
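Using the post-promotional list prices quoted above, a rough sketch of the cost gap on a hypothetical workload (the workload size here is illustrative, not from the article):

```python
# Per-million-token list prices in USD, as quoted in the article.
PRICES = {
    "deepseek-v3": {"input": 0.27, "output": 1.10},  # post-promotional rates
    "gpt-4o": {"input": 2.50, "output": 10.00},
}

def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a given token workload at the list prices above."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Hypothetical workload: 10M input tokens and 2M output tokens.
ds = workload_cost("deepseek-v3", 10_000_000, 2_000_000)  # -> $4.90
oa = workload_cost("gpt-4o", 10_000_000, 2_000_000)       # -> $45.00
print(f"DeepSeek-V3: ${ds:.2f}, GPT-4o: ${oa:.2f}, ratio ~{oa / ds:.1f}x")
```

At these rates the same workload costs roughly nine times more on GPT-4o than on DeepSeek-V3.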


“To run DeepSeek v3 24/7 at 60 tokens per second (5x human reading speed) is $2 a day,” said Emad Mostaque, founder of Stability AI, comparing it to the price of a latte. 
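Mostaque’s ballpark checks out against the promotional output rate quoted earlier (treating all tokens as output, which is an assumption for simplicity):

```python
# Sanity-check the "$2 a day" figure at the promotional output rate.
TOKENS_PER_SECOND = 60
SECONDS_PER_DAY = 86_400
PROMO_OUTPUT_PRICE = 0.28  # USD per million output tokens (until Feb 8, 2025)

tokens_per_day = TOKENS_PER_SECOND * SECONDS_PER_DAY  # 5,184,000 tokens
cost_per_day = tokens_per_day / 1_000_000 * PROMO_OUTPUT_PRICE
print(f"{tokens_per_day:,} tokens/day -> ${cost_per_day:.2f}/day")
# ~$1.45/day for output tokens alone, in line with the quoted "$2 a day"
```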

DeepSeek-V3’s technical paper details the engineering behind these numbers. Techniques such as FP8 mixed-precision training, optimisations in the infrastructure algorithms, and a carefully co-designed training framework are what let the model achieve all this, along with the fact that it is open source. The model is available on the web for free and also supports real-time information through web search. 
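FP8 formats such as E4M3 keep only a 3-bit mantissa, trading numerical precision for memory and bandwidth savings. A minimal Python sketch of what that rounding does to values (an illustration of the precision loss, not DeepSeek’s actual kernel code, and ignoring FP8’s exponent range limits):

```python
import math

def round_to_fp8_mantissa(x: float, mantissa_bits: int = 3) -> float:
    """Round x to a value representable with `mantissa_bits` mantissa bits.

    Ignores FP8's exponent clipping -- purely an illustration of mantissa
    rounding, not a real FP8 cast.
    """
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)              # x = m * 2**e, with 0.5 <= |m| < 1
    scale = 2 ** (mantissa_bits + 1)  # 16 representable mantissa steps
    return round(m * scale) / scale * (2.0 ** e)

print(round_to_fp8_mantissa(0.1))  # 0.1015625 -- visible rounding error
print(round_to_fp8_mantissa(1.0))  # 1.0 -- powers of two stay exact
```

Training at this precision requires careful scaling so the accumulated rounding error does not destabilise gradients, which is part of what the DeepSeek paper describes.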

In a recent interview, Elon Musk, CEO of xAI, said that training the Grok 2 model took about 20,000 NVIDIA H100 GPUs. He added that training the Grok 3 models will require 1 lakh (100,000) NVIDIA H100 GPUs. 

Meta also revealed that it is using more than 1 lakh NVIDIA H100 GPUs to train the upcoming Llama 4 models. “[This is] bigger than anything that I’ve seen reported for what others are doing,” said Meta chief Mark Zuckerberg in the company’s earnings report released in October. 

In contrast, DeepSeek-V3 was trained on 2,048 NVIDIA H800 GPUs. The H800 is a GPU NVIDIA designed for the Chinese market to comply with export restrictions imposed by US President Joe Biden’s administration, with its data transfer rate slashed by 50%. 

The H100 offers a transfer rate of 600 gigabytes per second, compared to the H800’s 300 gigabytes per second. 

This does raise questions about whether frontier model makers are underutilising their compute. “For reference, this level of capability is supposed to require clusters of closer to 16K GPUs, the ones being brought up today are more around 100K GPUs,” Andrej Karpathy, former OpenAI researcher, said in a post on X. 

“You have to ensure that you’re not wasteful with what you have, and this (DeepSeek-V3) looks like a nice demonstration that there’s still a lot to get through with both data and algorithms,” he added. 

‘Regulators Never Considered Second Order Effects’

Soon after, the US also banned the export of NVIDIA’s H800 to China, preventing the company from selling chips even with a reduced transfer rate. While there is no official disclosure of how many H800 GPUs were exported to China, an investigation suggested an underground network of around 70 sellers who claim to receive dozens of GPUs every month. 

Another report also revealed that NVIDIA’s chips are reaching China as part of server products from Dell, Supermicro, and others. Recently, the US Department of Commerce asked NVIDIA to investigate how its products have reached China.

While it isn’t clear whether DeepSeek purchased NVIDIA’s H800s while they could still be legally exported, their work is everything that the US government did not wish to see. The difficulty of purchasing powerful hardware has led China to prioritise optimisations at the model architecture level. 

Amjad Masad, CEO of AI-enabled coding platform Replit, said on X, “The Chinese [have] innovated a way to train large models for cheap. Regulators never consider second-order effects.”

Most of the techniques outlined in the paper address problems that LLMs face under resource constraints. Bojan Tunguz, a former engineer at NVIDIA, said on X, “All the export bans on high-end semiconductors might have actually been counterproductive in the ‘worst’ way imaginable.”

Several social media users also speculated about what would have happened had the restrictions never been imposed. “If not for the chip embargo, China would have built AGI in months,” said a user on X. 

DeepSeek doesn’t wish to stop here, either. “We will consistently study and refine our model architectures, aiming to further improve both the training and inference efficiency, striving to approach efficient support for infinite context length,” the researchers said in the report.

“Additionally, we will try to break through the architectural limitations of a transformer, thereby pushing the boundaries of its modelling capabilities,” they added. 

That said, fears of China getting the best of this technology stem from concerns about how China might use it for military purposes. The US government has stated that China will use “advanced computing chips” to produce weapons of mass destruction. 

“The PRC has poured resources into developing supercomputing capabilities and seeks to become a world leader in artificial intelligence by 2030. It is using these capabilities to monitor, track, and surveil its own citizens and fuel its military modernisation,” said Thea D Rozman Kendler, assistant secretary of commerce for export administration. 

This sentiment is also echoed by leaders in the private sector. Vinod Khosla, a venture capitalist who has actively backed OpenAI, said in an essay titled ‘AI: Utopia or Dystopia’, “China is the fastest way [to making] the doomers’ nightmares come true.”

“We may have to worry about sentient AI destroying humanity, but the risk of an asteroid hitting the Earth or a pandemic also exists. But the risk of China destroying our system is significantly larger in my opinion,” Khosla said, referring to China as a “bad actor”.

While DeepSeek’s recent development may give the US government sleepless nights, the reality may not be as fearsome as it is made out to be. 

The US government may well be dressing up China’s economic threat as a security threat, as elaborated in a report published by the Carnegie Endowment for International Peace titled ‘US-China Relations for the 2030s: Toward a Realistic Scenario for Coexistence’.

“It [US] is uncomfortable with the possibility of a true peer competitor rising and views this as a threat. China, which has been rising for decades, reached some key landmarks recently; it became the world’s top manufacturing and trading nation, as well as the world’s second-most capable military power,” read the report. 

The post US’ AI Hardware Restrictions on China Have Backfired appeared first on Analytics India Magazine.
