
re: Markets may be turbulent tomorrow due to contraction in the AI sector

Posted by LSURussian
Member since Feb 2005
133637 posts
Posted on 1/27/25 at 12:11 pm to
You might want to consider selling what you bought back this morning.

I'm about to buy some more which usually triggers a sell-off.
Posted by Deacon
Miami, FL
Member since Dec 2009
250 posts
Posted on 1/27/25 at 12:23 pm to
quote:

Now that is just my opinion, and maybe I am wrong there.. I personally am skeptical if true machine learning is taking place at this point.. In my limited experience (and to be fair, it is very minimal) the coding was the actual brain and therefore, there is a lot of bias that goes into every model that is built.

But once a new discovery is made, especially if it is on an open source platform, that discovery will be integrated into every future model.. and rather quickly, which will evolve into something new, and that is what appears to be happening here..


You're more right than you think you are! Under the hood, LLMs are just really smart auto-complete - predicting the next word based on the words that came before, plus the context of the previous conversation. This 'intelligence' is achieved through training runs assigning weights to words - and this is where bias comes in. If you type "LSU" in a prompt and I trained the model, it'll heavily weight "SUCKS" as the next word, thus injecting my bias into the model results.
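A toy sketch of that next-word weighting (hypothetical vocabulary and hand-picked weights, nowhere near a real model's scale):

```python
# Toy next-token table: each context word maps to candidate next words
# with hand-assigned weights. A real LLM learns billions of such weights
# from data; the bias comes from whoever (or whatever data) sets them.
next_token_weights = {
    "LSU": {"SUCKS": 0.80, "Tigers": 0.15, "football": 0.05},
    "the": {"cat": 0.5, "dog": 0.3, "market": 0.2},
}

def predict_next(context_word):
    """Return the most heavily weighted next word, or None if unseen."""
    candidates = next_token_weights.get(context_word, {})
    return max(candidates, key=candidates.get) if candidates else None

print(predict_next("LSU"))  # the biased weights make "SUCKS" win
```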

You're also right - this isn't true 'intelligence' - but we're moving towards an approximation of AGI by sharing the techniques and improving iteration by iteration. Deepseek has combined a few of the techniques (MOE, MTP, caching) and really raised the bar on achieving efficiency.
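A bare-bones illustration of the MoE (Mixture-of-Experts) part - toy numbers, not DeepSeek's actual architecture. Only the top-k scoring experts run for each token, so most of the model's parameters sit idle on any given step, which is where the compute savings come from:

```python
# Hypothetical MoE routing sketch: 8 toy "experts", only the top 2 run.
N_EXPERTS, TOP_K = 8, 2

def route(gate_scores):
    """Return the indices of the top-k experts for one token."""
    ranked = sorted(range(N_EXPERTS), key=lambda i: gate_scores[i], reverse=True)
    return ranked[:TOP_K]

def moe_output(token_value, gate_scores, experts):
    """Combine only the chosen experts' outputs, weighted by gate score."""
    chosen = route(gate_scores)
    total = sum(gate_scores[i] for i in chosen)
    return sum(gate_scores[i] / total * experts[i](token_value) for i in chosen)

# Each toy expert just scales its input differently.
experts = [lambda x, s=s: s * x for s in range(1, N_EXPERTS + 1)]
scores = [0.1, 0.9, 0.2, 0.1, 0.05, 0.8, 0.1, 0.05]
print(moe_output(1.0, scores, experts))  # only experts 1 and 5 did any work
```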

edit: On bias, you could also hardcode in a response based on a list of words in a prompt. But in an open source model, it would take any half-competent developer 10 min to spot and remove.
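For illustration, that kind of hardcoded override might look like this (hypothetical trigger words and function names, not anyone's actual code) - which is exactly why it's so easy to spot and delete in an open source release:

```python
# Hypothetical hardcoded bias: a canned response forced by a word list
# in the prompt. In open-source code a block like this is trivial to
# grep for and remove.
TRIGGER_WORDS = {"forbidden", "blocked"}

def respond(prompt, model_fn):
    # The hardcoded part: short-circuit before the model ever runs.
    if any(word in prompt.lower() for word in TRIGGER_WORDS):
        return "I can't talk about that."
    return model_fn(prompt)  # normal path: let the model answer

print(respond("this is forbidden text", lambda p: "model output"))
print(respond("hello", lambda p: "model output"))
```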
This post was edited on 1/27/25 at 12:27 pm
Posted by NC_Tigah
Make Orwell Fiction Again
Member since Sep 2003
135663 posts
Posted on 1/27/25 at 12:26 pm to
quote:

You might want to consider selling what you bought back this morning.

I'm about to buy some more which usually triggers a sell-off.
An inverse indicator
Posted by Tesla
the Laurentian Abyss
Member since Dec 2011
9117 posts
Posted on 1/27/25 at 12:34 pm to
Wait…you’re Jim Cramer?
Posted by LSURussian
Member since Feb 2005
133637 posts
Posted on 1/27/25 at 12:36 pm to
quote:

Wait…you’re Jim Cramer?
Posted by theballguy
Member since Oct 2011
31230 posts
Posted on 1/27/25 at 12:44 pm to
quote:

Jim Cramer


... should have his arse publicly kicked.

Daily.
Posted by Decatur
Member since Mar 2007
31759 posts
Posted on 1/27/25 at 1:31 pm to
quote:

Under the hood, LLMs are just really smart auto-complete - predicting the next word based on the words that came before plus the context of previous conversation.


This is why I laugh when I see people acting terrified that AI is going to take over the world.

A trillion-dollar spend for fancy auto-complete? Hmmmm
This post was edited on 1/27/25 at 1:34 pm
Posted by LSURussian
Member since Feb 2005
133637 posts
Posted on 1/27/25 at 1:35 pm to
I don't understand your motives in this thread.

Are you gloating over the possibility that China's AI technology might be better than the USA's AI technology?

What's in it for you, Comrade Snitchching?
Posted by Decatur
Member since Mar 2007
31759 posts
Posted on 1/27/25 at 1:38 pm to
quote:

The key implications of these breakthroughs — and the part you need to understand — only became apparent with V3, which added a new approach to load balancing (further reducing communications overhead) and multi-token prediction in training (further densifying each training step, again reducing overhead): V3 was shockingly cheap to train. DeepSeek claimed the model training took 2,788 thousand H800 GPU hours, which, at a cost of $2/GPU hour, comes out to a mere $5.576 million.

That seems impossibly low.

DeepSeek is clear that these costs are only for the final training run, and exclude all other expenses; from the V3 paper:

Lastly, we emphasize again the economical training costs of DeepSeek-V3, summarized in Table 1, achieved through our optimized co-design of algorithms, frameworks, and hardware. During the pre-training stage, training DeepSeek-V3 on each trillion tokens requires only 180K H800 GPU hours, i.e., 3.7 days on our cluster with 2048 H800 GPUs. Consequently, our pre-training stage is completed in less than two months and costs 2664K GPU hours. Combined with 119K GPU hours for the context length extension and 5K GPU hours for post-training, DeepSeek-V3 costs only 2.788M GPU hours for its full training. Assuming the rental price of the H800 GPU is $2 per GPU hour, our total training costs amount to only $5.576M. Note that the aforementioned costs include only the official training of DeepSeek-V3, excluding the costs associated with prior research and ablation experiments on architectures, algorithms, or data.

So no, you can’t replicate DeepSeek the company for $5.576 million.


LINK
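The arithmetic in the quoted paper checks out (the $2/GPU-hour rate is the paper's own rental-price assumption):

```python
# Verify the quoted DeepSeek-V3 training-cost arithmetic.
pretrain_hours = 2_664_000     # pre-training (2664K GPU hours)
context_ext_hours = 119_000    # context length extension
posttrain_hours = 5_000        # post-training

total_hours = pretrain_hours + context_ext_hours + posttrain_hours
cost = total_hours * 2         # assumed $2 per H800 GPU hour

print(f"{total_hours:,} GPU hours")  # 2,788,000 GPU hours
print(f"${cost / 1e6:.3f}M")         # $5.576M - final training run only
```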
Posted by LSURussian
Member since Feb 2005
133637 posts
Posted on 1/27/25 at 3:06 pm to
quote:

Which has nothing to do with me warning about a crash over a month ago. In that thread people like you said I was fear mongering and didn't know what I was talking about.
Do you usually claim victory for being correct about predicting a market crash when the Dow Jones finishes UP almost 300 points?
Posted by BCreed1
Alabama
Member since Jan 2024
6428 posts
Posted on 1/27/25 at 3:08 pm to
quote:

Are you gloating over the possibility that China's AI technology might be better than the USA's AI technology?



Since right before the election, he started on this path. Not sure what's up.


I do know this, he's 100% wrong on this.
Posted by BCreed1
Alabama
Member since Jan 2024
6428 posts
Posted on 1/27/25 at 3:15 pm to
What's really funny is John and others touting this have absolutely no clue about any of it.

1- They do not know how much money was spent.

2- They don't even know how many chips they used. Was it 10K, 50K, 100K?

The reason they don't know ANY of those numbers is because China will not release them. There is a reason for that.
Posted by John Barron
The Mar-a-Lago Club
Member since Sep 2024
17101 posts
Posted on 1/27/25 at 3:30 pm to
quote:

What's really funny is John and others touting this have 100% no clue to any of it.


I am not touting anything. I am posting the opinions of experts on the situation, like the Perplexity AI CEO, who is based in San Francisco. I don't pretend to know more than they do on this situation like some people in this thread do.


quote:

because China will not release them.


Deepseek did release those numbers and other experts explained to you how they accomplished this. They also made the product open source so it's fully transparent. I suggest you go read this thread if you want to understand how they accomplished this.

[Embedded X/Twitter thread]
Posted by BCreed1
Alabama
Member since Jan 2024
6428 posts
Posted on 1/27/25 at 3:49 pm to
quote:

Deepseek did release those numbers and other experts explained to you how they accomplished this. They also made the product open source so it's fully transparent. I suggest you go read this thread if you want to understand how they accomplished this.



They did NOT release all of the numbers. They released the cost of the final training run. Period.

quote:

Training cost: $5M


Sir, that was the cost of the final training run, NOT the overall cost. Here you go:

DeepSeek is clear that these costs - the $5.576 million - are only for the final training run, and exclude all other expenses; that's from the V3 paper itself.


That is straight from their release! That's it. That's the only cost listed.

- How many GPUs/chips did they buy? They don't say.
- How much to build it? They do not say.
- How much to totally train it? They don't say.


You see, people bit on the words of people like the guy you referred to rather than the actual facts.


From another article on the "creator" himself:

quote:

Liang, who had previously focused on applying AI to investing, had bought a "stockpile of Nvidia A100 chips," a type of tech that is now banned from export to China. Those chips became the basis of DeepSeek, the MIT publication reported.

LINK





Do you know how many were in that stockpile? No. They are not telling you.

Do you know the cost of that stockpile? No, because you don't know how many are in the stockpile.

Hey, but you can buy 1... ONE for $25K USED on Ebay!

Posted by John Barron
The Mar-a-Lago Club
Member since Sep 2024
17101 posts
Posted on 1/27/25 at 4:23 pm to
Since you didn't read the Morgan Brown thread, I will post the part where it explains how they accomplished more efficiency without the advanced hardware. You keep acting like it's some miracle when it's really not. They show you WHY they don't need the advanced chips compared to OpenAI. Now, if they were using the same system as OpenAI and claiming they only spent $6 million, I could understand questioning that. But that's not the case; they show you how they made the software more efficient.


[Embedded X/Twitter thread]
Posted by Gideon Swashbuckler
Member since Sep 2019
8840 posts
Posted on 1/27/25 at 4:26 pm to
200?

NASDAQ futures closed 21900+ on Friday.
Closed today at 21300.

It was down almost 200 when it opened yesterday.
Posted by BCreed1
Alabama
Member since Jan 2024
6428 posts
Posted on 1/27/25 at 4:50 pm to
quote:

Since you didn't read the Morgan Brown thread I will post the part where it explains how they accomplished more efficiency without the advanced hardware.



LOL! Man, are you trying not to grasp this for the sake of arguing? They either used A100s, as suspected, or, as they claim, H800s.

Which of those do you consider not advanced?

Cost of an A100 on eBay

Cost of an H800 on eBay

You see, it's not me claiming they used those. THEY claim it. Your guy is wrong and bought hook, line, and sinker that they did this for $6 million. He failed to read what the company themselves stated. That $6 million was for the final training run.



Do you know they have to cool all that too?

Nvidia A100s and H800s each require host hardware to attach to, etc.

Posted by John Barron
The Mar-a-Lago Club
Member since Sep 2024
17101 posts
Posted on 1/27/25 at 5:17 pm to
quote:

They either used A100s, as suspected, or what they claim, H800.


That's not what's suspected. That's what the people coping keep wishcasting. They used 2,000 H800s vs the 100,000 A100s that OpenAI uses. The bottom line is that DeepSeek uses fewer resources to get results faster. Imagine running an entire AI system on a fraction of the hardware - less cost, same power.

Posted by Deacon
Miami, FL
Member since Dec 2009
250 posts
Posted on 1/27/25 at 5:39 pm to
A pretty good thread that walks through some of the techniques

[Embedded X/Twitter thread]


A lot more detail is in the paper they published with the release -
Deepseek Paper Here

This post was edited on 1/27/25 at 5:44 pm