
re: Markets may be turbulent tomorrow due to contraction in the AI sector

Posted by LSURussian
Member since Feb 2005
133637 posts
Posted on 1/27/25 at 12:11 pm to
You might want to consider selling what you bought back this morning.

I'm about to buy some more which usually triggers a sell-off.
Posted by Deacon
Miami, FL
Member since Dec 2009
250 posts
Posted on 1/27/25 at 12:23 pm to
quote:

Now that is just my opinion, and maybe I am wrong there.. I personally am skeptical if true machine learning is taking place at this point.. In my limited experience (and to be fair, it is very minimal) the coding was the actual brain and therefore, there is a lot of bias that goes into every model that is built.

But once a new discovery is made, especially if it is on an open source platform, that discovery will be integrated into every future model.. and rather quickly, which will evolve into something new, and that is what appears to be happening here..


You're more right than you think you are! Under the hood, LLMs are just really smart auto-complete - predicting the next word based on the words that came before, plus the context of the previous conversation. This 'intelligence' is achieved through training runs assigning weights to words - and this is where bias comes in. If you type "LSU" in a prompt and I trained the model, it'll heavily weight "SUCKS" as the next word, thus injecting my bias into the model results.
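A toy sketch of that next-word weighting (hypothetical vocabulary and hand-picked weights, nowhere near a real model's scale):

```python
# Toy next-token table: each context word maps to candidate next words
# with hand-assigned weights. A real LLM learns billions of such weights
# from data; the bias comes from whoever (or whatever data) sets them.
next_token_weights = {
    "LSU": {"SUCKS": 0.80, "Tigers": 0.15, "football": 0.05},
    "the": {"cat": 0.5, "dog": 0.3, "market": 0.2},
}

def predict_next(context_word):
    """Return the most heavily weighted next word, or None if unseen."""
    candidates = next_token_weights.get(context_word, {})
    return max(candidates, key=candidates.get) if candidates else None

print(predict_next("LSU"))  # the biased weights make "SUCKS" win
```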

You're also right - this isn't true 'intelligence' - but we're moving towards an approximation of AGI by sharing the techniques and improving iteration by iteration. Deepseek has combined a few of the techniques (MOE, MTP, caching) and really raised the bar on achieving efficiency.
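A bare-bones illustration of the MoE (Mixture-of-Experts) part - toy numbers, not DeepSeek's actual architecture. Only the top-k scoring experts run for each token, so most of the model's parameters sit idle on any given step, which is where the compute savings come from:

```python
# Hypothetical MoE routing sketch: 8 toy "experts", only the top 2 run.
N_EXPERTS, TOP_K = 8, 2

def route(gate_scores):
    """Return the indices of the top-k experts for one token."""
    ranked = sorted(range(N_EXPERTS), key=lambda i: gate_scores[i], reverse=True)
    return ranked[:TOP_K]

def moe_output(token_value, gate_scores, experts):
    """Combine only the chosen experts' outputs, weighted by gate score."""
    chosen = route(gate_scores)
    total = sum(gate_scores[i] for i in chosen)
    return sum(gate_scores[i] / total * experts[i](token_value) for i in chosen)

# Each toy expert just scales its input differently.
experts = [lambda x, s=s: s * x for s in range(1, N_EXPERTS + 1)]
scores = [0.1, 0.9, 0.2, 0.1, 0.05, 0.8, 0.1, 0.05]
print(moe_output(1.0, scores, experts))  # only experts 1 and 5 did any work
```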

edit: On bias, you could also hardcode in a response based on a list of words in a prompt. But in an open source model, it would take any half-competent developer 10 min to spot and remove.
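For illustration, that kind of hardcoded override might look like this (hypothetical trigger words and function names, not anyone's actual code) - which is exactly why it's so easy to spot and delete in an open source release:

```python
# Hypothetical hardcoded bias: a canned response forced by a word list
# in the prompt. In open-source code a block like this is trivial to
# grep for and remove.
TRIGGER_WORDS = {"forbidden", "blocked"}

def respond(prompt, model_fn):
    # The hardcoded part: short-circuit before the model ever runs.
    if any(word in prompt.lower() for word in TRIGGER_WORDS):
        return "I can't talk about that."
    return model_fn(prompt)  # normal path: let the model answer

print(respond("this is forbidden text", lambda p: "model output"))
print(respond("hello", lambda p: "model output"))
```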
This post was edited on 1/27/25 at 12:27 pm
Posted by NC_Tigah
Make Orwell Fiction Again
Member since Sep 2003
135663 posts
Posted on 1/27/25 at 12:26 pm to
quote:

You might want to consider selling what you bought back this morning.

I'm about to buy some more which usually triggers a sell-off.
An inverse indicator
Posted by Tesla
the Laurentian Abyss
Member since Dec 2011
9117 posts
Posted on 1/27/25 at 12:34 pm to
Wait…you’re Jim Cramer?
Posted by LSURussian
Member since Feb 2005
133637 posts
Posted on 1/27/25 at 12:36 pm to
quote:

Wait…you’re Jim Cramer?
Posted by theballguy
Member since Oct 2011
31230 posts
Posted on 1/27/25 at 12:44 pm to
quote:

Jim Cramer


... should have his arse publicly kicked.

Daily.
Posted by Decatur
Member since Mar 2007
31759 posts
Posted on 1/27/25 at 1:31 pm to
quote:

Under the hood, LLMs are just really smart auto-complete - predicting the next word based on the words that came before plus the context of previous conversation.


This is why I laugh when I see people acting terrified that AI is going to take over the world.

A trillion-dollar spend for fancy auto-complete? Hmmmm
This post was edited on 1/27/25 at 1:34 pm
Posted by LSURussian
Member since Feb 2005
133637 posts
Posted on 1/27/25 at 1:35 pm to
I don't understand your motives in this thread.

Are you gloating over the possibility that China's AI technology might be better than the USA's AI technology?

What's in it for you, Comrade Snitchching?
Posted by Decatur
Member since Mar 2007
31759 posts
Posted on 1/27/25 at 1:38 pm to
quote:

The key implications of these breakthroughs — and the part you need to understand — only became apparent with V3, which added a new approach to load balancing (further reducing communications overhead) and multi-token prediction in training (further densifying each training step, again reducing overhead): V3 was shockingly cheap to train. DeepSeek claimed the model training took 2,788 thousand H800 GPU hours, which, at a cost of $2/GPU hour, comes out to a mere $5.576 million.

That seems impossibly low.

DeepSeek is clear that these costs are only for the final training run, and exclude all other expenses; from the V3 paper:

Lastly, we emphasize again the economical training costs of DeepSeek-V3, summarized in Table 1, achieved through our optimized co-design of algorithms, frameworks, and hardware. During the pre-training stage, training DeepSeek-V3 on each trillion tokens requires only 180K H800 GPU hours, i.e., 3.7 days on our cluster with 2048 H800 GPUs. Consequently, our pre-training stage is completed in less than two months and costs 2664K GPU hours. Combined with 119K GPU hours for the context length extension and 5K GPU hours for post-training, DeepSeek-V3 costs only 2.788M GPU hours for its full training. Assuming the rental price of the H800 GPU is $2 per GPU hour, our total training costs amount to only $5.576M. Note that the aforementioned costs include only the official training of DeepSeek-V3, excluding the costs associated with prior research and ablation experiments on architectures, algorithms, or data.

So no, you can’t replicate DeepSeek the company for $5.576 million.


LINK
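The arithmetic in the quoted paper checks out (the $2/GPU-hour rate is the paper's own rental-price assumption):

```python
# Verify the quoted DeepSeek-V3 training-cost arithmetic.
pretrain_hours = 2_664_000     # pre-training (2664K GPU hours)
context_ext_hours = 119_000    # context length extension
posttrain_hours = 5_000        # post-training

total_hours = pretrain_hours + context_ext_hours + posttrain_hours
cost = total_hours * 2         # assumed $2 per H800 GPU hour

print(f"{total_hours:,} GPU hours")  # 2,788,000 GPU hours
print(f"${cost / 1e6:.3f}M")         # $5.576M - final training run only
```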
Posted by LSURussian
Member since Feb 2005
133637 posts
Posted on 1/27/25 at 3:06 pm to
quote:

Which has nothing to do with me warning about a crash over a month ago. In that thread people like you said I was fear mongering and didn't know what I was talking about.
Do you usually claim victory for being correct about predicting a market crash when the Dow Jones finishes UP almost 300 points?
Posted by BCreed1
Alabama
Member since Jan 2024
6428 posts
Posted on 1/27/25 at 3:08 pm to
quote:

Are you gloating over the possibility that China's AI technology might be better than the USA's AI technology?



Since right before the election, he started on this path. Not sure what's up.


I do know this, he's 100% wrong on this.
Posted by BCreed1
Alabama
Member since Jan 2024
6428 posts
Posted on 1/27/25 at 3:15 pm to
What's really funny is John and others touting this have absolutely no clue about any of it.

1- They do not know how much money was spent.

2- They don't even know how many chips they used. Was it 10K, 50K, 100K?

The reason they don't know ANY of those numbers is because China will not release them. There is a reason for that.
Posted by John Barron
The Mar-a-Lago Club
Member since Sep 2024
17101 posts
Posted on 1/27/25 at 3:30 pm to
quote:

What's really funny is John and others touting this have 100% no clue to any of it.


I am not touting anything. I am posting the opinions of experts on the situation, like the Perplexity AI CEO, who is based in San Francisco. I don't pretend to know more than they do on this situation like some people in this thread do.


quote:

because China will not release them.


Deepseek did release those numbers and other experts explained to you how they accomplished this. They also made the product open source so it's fully transparent. I suggest you go read this thread if you want to understand how they accomplished this.

[Embedded X/Twitter thread]
Posted by BCreed1
Alabama
Member since Jan 2024
6428 posts
Posted on 1/27/25 at 3:49 pm to
quote:

Deepseek did release those numbers and other experts explained to you how they accomplished this. They also made the product open source so it's fully transparent. I suggest you go read this thread if you want to understand how they accomplished this.



They did NOT release all of the numbers. They released the cost of the final training run. Period.

quote:

Training cost: $5M


Sir, that was the cost of the final training run, NOT the overall cost. Here you go:

DeepSeek is clear that these costs - the $5.576 million - are only for the final training run, and exclude all other expenses; that's from the V3 paper itself.


That is straight from their release! That's it. That's the only cost listed.

- How many GPUs/chips did they buy? They don't say.
- How much to build it? They do not say.
- How much to totally train it? They don't say.


You see, people bit on the words of people like the guy you referred to rather than the actual facts.


From another article on the "creator" himself:

quote:

Liang, who had previously focused on applying AI to investing, had bought a "stockpile of Nvidia A100 chips," a type of tech that is now banned from export to China. Those chips became the basis of DeepSeek, the MIT publication reported.

LINK





Do you know how many were in that stockpile? No. They are not telling you.

Do you know the cost of that stockpile? No, because you don't know how many are in the stockpile.

Hey, but you can buy 1... ONE for $25K USED on Ebay!

Posted by John Barron
The Mar-a-Lago Club
Member since Sep 2024
17101 posts
Posted on 1/27/25 at 4:23 pm to
Since you didn't read the Morgan Brown thread, I will post the part where it explains how they accomplished more efficiency without the advanced hardware. You keep acting like it's some miracle when it's really not. They show you WHY they don't need the advanced chips compared to OpenAI. Now, if they were using the same system as OpenAI and claiming they only spent $6 million, I could understand questioning that. But that's not the case; they show you how they made the software more efficient.


[Embedded X/Twitter thread]
Posted by Gideon Swashbuckler
Member since Sep 2019
8840 posts
Posted on 1/27/25 at 4:26 pm to
200?

NASDAQ futures closed 21900+ on Friday.
Closed today at 21300.

It was down almost 200 when it opened yesterday.
Posted by BCreed1
Alabama
Member since Jan 2024
6428 posts
Posted on 1/27/25 at 4:50 pm to
quote:

Since you didn't read the Morgan Brown thread I will post the part where it explains how they accomplished more efficiency without the advanced hardware.



LOL! Man, are you trying not to grasp this for the sake of arguing? They either used A100s, as suspected, or, as they claim, H800s.

Which of those do you consider not advanced?

Cost of an A100 on eBay

Cost of an H800 on eBay

You see, it's not me claiming they used those. THEY claim it. Your guy is wrong and bought hook, line, and sinker that they did this for $6 million. He failed to read what the company themselves stated. That $6 million was for the final training run.



Do you know they have to cool all that too?

Nvidia A100s and H800s each require host hardware to attach to, etc.

Posted by John Barron
The Mar-a-Lago Club
Member since Sep 2024
17101 posts
Posted on 1/27/25 at 5:17 pm to
quote:

They either used A100s, as suspected, or what they claim, H800.


That's not what's suspected. That's what the people coping keep wishcasting. They used 2,000 H800s vs the 100,000 A100s that OpenAI uses. The bottom line is that DeepSeek uses fewer resources to get results faster. Imagine running an entire AI system on a fraction of the hardware - less cost, same power.

Posted by Deacon
Miami, FL
Member since Dec 2009
250 posts
Posted on 1/27/25 at 5:39 pm to
A pretty good thread that walks through some of the techniques

[Embedded X/Twitter thread]


A lot more detail is in the paper they published with the release -
Deepseek Paper Here

This post was edited on 1/27/25 at 5:44 pm