Posted on 01/27/2025 5:44:32 PM PST by SeekAndFind
Nvidia called DeepSeek’s R1 model “an excellent AI advancement,” despite the Chinese startup’s emergence causing the chip maker’s stock price to plunge 17% on Monday.
“DeepSeek is an excellent AI advancement and a perfect example of Test Time Scaling,” an Nvidia spokesperson told CNBC on Monday. “DeepSeek’s work illustrates how new models can be created using that technique, leveraging widely-available models and compute that is fully export control compliant.”
The comments come after DeepSeek last week released R1, an open-source reasoning model that reportedly outperformed the best models from U.S. companies such as OpenAI. R1's self-reported training cost was less than $6 million, a fraction of the billions that Silicon Valley companies are spending to build their artificial-intelligence models.
Nvidia’s statement indicates that it sees DeepSeek’s breakthrough as creating more work for the American chip maker’s graphics processing units, or GPUs.
“Inference requires significant numbers of NVIDIA GPUs and high-performance networking,” the spokesperson added. “We now have three scaling laws: pre-training and post-training, which continue, and new test-time scaling.”
Nvidia also said that the GPUs DeepSeek used were fully export compliant. That counters Scale AI CEO Alexandr Wang's comments on CNBC last week that he believed DeepSeek used Nvidia GPU models that are banned in mainland China. DeepSeek says it used special versions of Nvidia's GPUs intended for the Chinese market.
Analysts are now asking if multibillion-dollar capital investments from companies like Microsoft, Google and Meta for Nvidia-based AI infrastructure are being wasted when the same results can be achieved more cheaply.
Earlier this month, Microsoft said it is spending $80 billion on AI infrastructure in 2025 alone, while Meta CEO Mark Zuckerberg last week said the social media company planned to invest between $60 billion and $65 billion in capital expenditures.
(Excerpt) Read more at cnbc.com ...
I expect that there will be a lot more focus on optimizing the model building algorithms, which will be great.
Stargate’s $500 billion will produce $25 trillion or more in AI processing.
About to lose $1 TRILLION in value due to it.
Is this the economic COVID attack? Waited for Trump to get in office to release it.
Will Stargate get renamed to Skynet?
If one believes they didn't use smuggled H500s...
I don't know what the surprise is. Before this, I could run AI models on my server that competed favorably with OpenAI models after training.
It's surprising that small trained models could do almost anything. I am actually going to download and run their 32GB model to have local reasoning. But I already fine-tune models for art, writing, and coding. The massive expenditure on 100,000-GPU farms never really made any sense.
The only reason for these massive GPU farms is to create superintelligence, and that is the scary part. I've seen Terminator.
Actually, from what I have read on all of this, there doesn't need to be a $500 billion investment; it can be done for a fraction of that amount.
That Chinese startup only spent about $6 million to develop its AI platform, which makes whatever is going on here already obsolete.
IMO...as this is just an efficiency increase via software, all it does is make nVidia processors more useful for even more advanced applications.
Jevons Paradox, named after English economist William Stanley Jevons, states that when the efficiency of resource use improves, it often leads to increased consumption of that resource rather than decreased use. More use cases mean more people using AI for everything.
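A toy calculation makes the point concrete. The numbers below are hypothetical, chosen only to illustrate the mechanism: if efficiency makes each AI query 10x cheaper and usage grows faster than the cost falls, total spending on compute goes up, not down.

```python
# Toy illustration of Jevons Paradox with hypothetical numbers.
# Assumption: a software efficiency gain cuts cost per query 10x,
# and demand is elastic (usage grows more than 10x in response).

cost_before = 1.00        # hypothetical dollars per query
cost_after = 0.10         # 10x efficiency gain
queries_before = 1_000_000

# Assume usage jumps 30x once queries are 10x cheaper.
queries_after = queries_before * 30

spend_before = cost_before * queries_before   # total compute spend before
spend_after = cost_after * queries_after      # total compute spend after

# Per-query cost fell 10x, yet total spend tripled.
print(f"before: ${spend_before:,.0f}, after: ${spend_after:,.0f}")
```

Whether real-world demand for AI is elastic enough for this to hold is exactly what the analysts in the article are debating.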
DeepSeek R1 does well on the benchmarks because the Chinese could use OpenAI's model to train it. The Nvidia hardware chips are export-controlled, but finished frontier models like OpenAI's are not. Why buy the cow when you can get the milk for free?
One day we’re all going to be sitting in the dark with our bank accounts drained and personal records destroyed wondering why we were so gung ho on this technology.
My childhood years in the 70s and teen years in the 80s were the best of my life.
I wasn't distracted by 200 gadgets... and the government didn't know every move I made.
I wonder how this would run on my game box... a 4090 and a 24-core Intel CPU.
Dave Plummer says he can run it on his box with a Threadripper and it works fine.
https://www.youtube.com/watch?v=r3TpcHebtxM
Nvidia’s defensiveness on the export controls makes me believe even more firmly that China easily imported all the banned graphics cards they needed to train deepseek. Wang is correct. They used H100s.
Jevons paradox.
Now that it’s cheaper, demand will expand.
This whole “run locally” thing confuses me. Is that saying that DeepSeek can operate without making calls outside the device? That it will run effectively on a machine that has no connection to the outside world?