
ZDNET's key takeaways
- DeepSeek reveals how much its R1 model cost to build.
- R1's capabilities made investors question exorbitant AI spending.
- Nvidia declined to say if it ever plans to use Intel's factories.
DeepSeek, the Chinese AI lab that shook up the market with its impressive open-source R1 model in January, has finally revealed the secret so many were wondering about: how it trained R1 more cheaply than the companies behind other, primarily American, frontier models.
Also: Worried about AI's soaring power needs? Avoiding chatbots won't help - but 3 things could
The company wrote in a paper published Wednesday that building R1 only cost it $249,000 -- a ridiculously low amount in the high-spending world of AI. For context, DeepSeek said in an earlier research paper that its V3 model, which is similar to a standard chatbot model family like Claude, cost $5.6 million to train.
That number has been disputed, with some experts questioning whether it includes all development costs (including infrastructure, R&D, data, and more) or singles out its final training run. Regardless, it's still a fraction of what companies like OpenAI have spent building models (Sam Altman himself has estimated that GPT-4 cost north of $100 million).
That difference is also reflected in what DeepSeek charges users for R1: $0.14 for a million tokens (about 750,000 words analyzed) -- compared to the $7.50 OpenAI charges for the equivalent tier.
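The gap those rates imply is easy to see with some back-of-the-envelope arithmetic. A minimal sketch using the per-million-token prices quoted above (the `cost_usd` helper and the ten-million-token workload are illustrative assumptions, not figures from either company):

```python
# Per-million-token API rates quoted in the article, in USD.
R1_PRICE_PER_M = 0.14      # DeepSeek R1
OPENAI_PRICE_PER_M = 7.50  # OpenAI's equivalent tier

def cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost of processing `tokens` tokens at a per-million-token rate."""
    return tokens / 1_000_000 * price_per_million

# Hypothetical workload: ten million tokens of text analyzed.
tokens = 10_000_000
print(f"DeepSeek R1: ${cost_usd(tokens, R1_PRICE_PER_M):.2f}")       # $1.40
print(f"OpenAI:      ${cost_usd(tokens, OPENAI_PRICE_PER_M):.2f}")   # $75.00
print(f"Price ratio: {OPENAI_PRICE_PER_M / R1_PRICE_PER_M:.0f}x")
```

At these list prices, the same workload costs roughly 54 times more on the OpenAI tier than on R1.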
AI models take tons of resources to build -- between data, GPUs, energy and water usage for data centers, staff costs, and more, it can be an expensive undertaking, especially for more advanced or capable models that have bigger training data sets. For Chinese labs, there's the added roadblock of accessing US-made chips due to export bans intended to curb competition. DeepSeek's reported ability to create successful models by strategically optimizing older chips also gave it a competitive edge. DeepSeek noted in the paper that it used 512 Nvidia H800 chips, a less powerful, China-specific product, to build R1.
Also: Google claims Gemma 3 reaches 98% of DeepSeek's accuracy - using only 1 GPU
The paper is the most significant information drop from DeepSeek since January. Earlier this month, reports teased a new DeepSeek product coming soon.
DeepSeek's potential threat
In January, DeepSeek's release rocked the AI industry because of its perceived potential to pop the AI investment bubble. The efficiency of R1 put AI costs in context for investors backing companies like OpenAI, which is currently trying to raise another $40 billion despite still not being profitable.
(Disclosure: Ziff Davis, ZDNET's parent company, filed an April 2025 lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)
Also: What Nvidia's stunning $5 billion Intel stake means for enterprise AI and next-gen laptops
Considering AI spending is projected to hit $1.5 trillion by the end of this year, however, that bubble doesn't appear to be bursting anytime soon.