DeepSeek

Chinese artificial intelligence firm DeepSeek has disclosed that it spent $294,000 to train its R1 model — a figure far below the billions often cited by U.S. rivals. The revelation, published in the journal Nature on Wednesday, has once again intensified the global discussion over Beijing’s role in the rapidly escalating AI race.

This marks the first time the Hangzhou-based company has shared cost estimates for developing its flagship reasoning-focused model.

Training Costs Far Below U.S. Counterparts

According to the peer-reviewed article co-authored by founder Liang Wenfeng, DeepSeek trained the R1 model on a cluster of 512 Nvidia H800 chips, with the training run lasting 80 hours.

For comparison, OpenAI’s Sam Altman has previously stated that training foundational AI models had cost his company “much more” than $100 million, though OpenAI has never provided precise figures.

DeepSeek’s declaration therefore highlights the stark contrast in expenses, raising questions about how the company has managed to build powerful systems at a fraction of the cost of American competitors.

Use of Restricted Chips

The U.S. government barred exports of Nvidia’s high-end H100 and A100 chips to China in October 2022. Nvidia later produced the H800 specifically for the Chinese market.

While DeepSeek claimed it used H800 chips for R1, it acknowledged for the first time in supplementary material to Nature that it also possesses A100 GPUs. These were reportedly used in preparatory phases with smaller models before moving to the H800 cluster for full-scale training.

This admission adds weight to reports suggesting DeepSeek had one of the few A100-based supercomputing clusters in China, which helped attract top domestic AI talent.

Distillation Debate

DeepSeek also indirectly addressed earlier allegations that it had “distilled” OpenAI’s models to build its own. Distillation, a method where one AI system learns from another, allows newer models to gain knowledge at lower cost and energy demand.
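The idea behind distillation can be illustrated with a toy sketch. The snippet below is not DeepSeek's or OpenAI's actual method — it is a minimal, generic example of the standard distillation objective, in which a "student" model is trained to match the softened output distribution of a "teacher" model; all function names and values are illustrative.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits to probabilities, optionally softened by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the teacher's softened distribution and the student's.

    Minimizing this loss pushes the student's predicted distribution toward
    the teacher's "soft targets" — the core mechanism of knowledge distillation.
    """
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# A student that matches the teacher incurs zero loss;
# a mismatched student incurs a positive loss.
teacher = [3.0, 1.0, 0.2]
aligned_loss = distillation_loss(teacher, [3.0, 1.0, 0.2])
mismatch_loss = distillation_loss(teacher, [0.2, 1.0, 3.0])
```

In practice this loss would be computed over a model's full vocabulary and backpropagated through the student, but the principle — learning from another model's outputs rather than from raw data alone — is what makes the approach cheaper than training from scratch.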

The company has long defended the practice, arguing it leads to more efficient models that can run on fewer resources, thus making AI more widely accessible.

DeepSeek confirmed that its V3 model training data included web pages containing a “significant number” of OpenAI-generated responses. However, the company stressed this was incidental and not a deliberate attempt to copy rival technology.

Global Repercussions

When DeepSeek unveiled its lower-cost models earlier this year, international markets reacted sharply, with investors dumping tech stocks amid fears that its advances could challenge dominant players like Nvidia.

Now, with the cost revelations, pressure may grow on U.S. companies and regulators, as DeepSeek demonstrates that advanced AI can be developed at far lower cost than assumed.

For Washington, the development could heighten concerns about export controls and the effectiveness of measures designed to slow China’s progress in AI.

Meanwhile, OpenAI, Meta, and other U.S. firms have not yet commented on DeepSeek’s latest disclosures.
