DeepSeek's main run cost $6M. qwen3-30b-a3b, which is ranked 13th, would probably cost a few hundred thousand dollars.
The GPU cost of the final training run isn't the biggest chunk of the cost, and you could probably replicate the results of models like Llama 3 very cheaply. It's the cost of experiments, researchers, and data collection that brings the overall cost 1 or 2 orders of magnitude higher.
It wasn't a lie, it was a misrepresentation of the total cost. It's not hard to calculate the cost of the training run, though: it takes roughly 6 * active parameters * tokens FLOPs[1]. To get the number of seconds, divide by FLOP/s * MFU, where MFU is around 45% on H100s for large enough models[2].
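As a rough back-of-the-envelope in Python (the parameter and token counts below are placeholder assumptions, not any lab's reported numbers; H100 dense BF16 peak is ~989 TFLOP/s):

    # Rough training-cost estimate: FLOPs ~= 6 * active_params * tokens
    active_params = 37e9    # assumption: ~37B active parameters (MoE)
    tokens        = 14e12   # assumption: ~14T training tokens
    peak_flops    = 989e12  # H100 dense BF16 peak, FLOP/s
    mfu           = 0.45    # model FLOPs utilization

    total_flops = 6 * active_params * tokens
    gpu_hours   = total_flops / (peak_flops * mfu) / 3600

    print(f"total FLOPs: {total_flops:.2e}")
    print(f"H100-hours at 45% MFU: {gpu_hours:,.0f}")
    print(f"cost at $2/GPU-hour: ${2 * gpu_hours:,.0f}")  # lands in the few-$M range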
That paper's 5 years old at this point, dating back to when Amodei was still an OpenAI employee. Has any newer work superseded it, or are those assumptions still considered solid?
Those assumptions are still the same, although context lengths have now increased enough that the n^2 attention part is non-negligible. See the repo for the full FLOP calculation[1].
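To see why the n^2 term starts to matter, here's a sketch using Kaplan-et-al-style accounting, where attention adds roughly 6 * n_layers * ctx_len * d_model FLOPs per trained token (the model config below is a placeholder assumption, not a real model's):

    # Per-token FLOPs with the attention term included
    # (backward pass counted as ~2x forward, hence the factors of 6)
    def training_flops(params, tokens, n_layers, d_model, ctx_len):
        per_token = 6 * params + 6 * n_layers * ctx_len * d_model
        return per_token * tokens

    # placeholder 7B-dense-style config, 2T tokens
    base    = 6 * 7e9 * 2e12
    at_4k   = training_flops(7e9, 2e12, n_layers=32, d_model=4096, ctx_len=4096)
    at_128k = training_flops(7e9, 2e12, n_layers=32, d_model=4096, ctx_len=128_000)
    print(f"attention overhead at 4k ctx:   {at_4k / base - 1:.0%}")   # ~8%
    print(f"attention overhead at 128k ctx: {at_128k / base - 1:.0%}") # ~240%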
The big difference with Stephen Hawking is that he was not born disabled, he became disabled during graduate school. Even in 2025, a human who is born blind and severely paralyzed (so they cannot speak or sign) will probably never learn calculus, regardless of innate ability. Perhaps in the medium term technology will improve.
That said, another major difference is psychology. Switching animals, it seems plausible to me that chimpanzees are theoretically capable of doing basic calculus as a matter of pattern-matching. But you can't force them to study it! Basic calculus is too tedious and high-effort to learn for a mere banana; you need something truly valuable like "guaranteed admission to the flagship state university" to get human children to do it. But we don't have an equivalent offer for chimps. (Likewise, an Isaac Newton-level dog might still find calculus exceptionally boring compared to chasing squirrels.)
That's helpful if they live in the same country, you can figure out who the 4chan poster was, the police are interested (or you're willing to risk paying a lawyer), you're willing to sink the time into pursuing such action (and, if criminal, risk an adversarial LEO interaction), and you're satisfied knowing hundreds of others may be doing the same and won't be deterred. Of course, friends and co-workers are too close to you to post publicly when they generate it. Thankfully, the Taylor Swift laws in the US have stopped the generation of nonconsensual imagery and video of their namesake (they haven't).
My daughter's school posted pictures of her online without an opt-out, but she's also on Facebook via family members, and it's just kind of... well beyond the point of trying to suppress. Probably just best to accept that people can imagine you naked, at any age, doing anything. What's your neighbor doing with the images saved from his Ring camera pointed at the sidewalk? :shrug:
That's a bad analogy. Most people including me do expect that their "public" data is used for AI training. Based on the ads everyone gets, most people know perfectly well that anything they post online will be used for AI.
Are you trying to argue that 10 years ago, when I uploaded my resume to LinkedIn, I should have known it'd be used for AI training?
Or that a teenager who signed up for Facebook should have known that the embarrassing things they were posting would go on to train AI and were, as you called it, "public"?
What about the blog I started 25 years ago and then took down, but which lives on in the GeoCities archive? Was I supposed to know it'd go to an AI overlord corporation when I was in middle school writing about dragon photos I found on Google?
And we're not even getting into data breaches, or something that was uploaded as private and then sold when the corporation changed their privacy policy decades after it was uploaded.
It's not a bad analogy when you don't give all the graces to corporations and none to the exploited.
> Most people including me do expect that their "public" data is used for AI training.
Based on what ordinary people have been saying, I don't think this is true. Or, maybe it's true now that the cat is out of the bag, but I don't think most people expected this before.
Most tech-oriented people did, of course, but we're a small minority. And even amongst our subculture, a lot of people didn't see this abuse coming. I didn't, or I would have removed all of my websites from the public web years earlier than I did.
In fact, it's the opposite. People who aren't into tech think Instagram is listening to them 24/7 to pick their feed and ads. There was even a hoax in my area among elderly groups that WhatsApp was using profile photos for illegal activity, and for a while many people removed their photos.
> I didn't, or I would have removed all of my websites from the public web years earlier than I did.
Your comment is public information. In fact, posting anything on HN is a surefire way to hand your content over for AI training.
> People who aren't into tech think Instagram is listening to them 24/7 to pick their feed and ads
True, but that's worlds away from thinking that your data will be used to train genAI.
> In fact, posting anything on HN is a surefire way to hand your content over for AI training.
Indeed so, but HN seems to be a bad habit I just can't kick. However, my comments here are the entirety of what I put up on the open web and I intentionally keep them relatively shallow. I no longer do long-form blogging or make any of my code available on the open web.
However, you're right. Leaving HN is something that I need to do.
I'm not sure what you mean here. In context I suspect you mean "because ads are chosen based on knowledge about you"? But that's really the opposite of my experience (UK).
Ads now go hard on brainwashing: the same advert over and over, almost never anything I want to buy.
YouTube suggestions are pretty much in line with my previous viewing, though.
My ISP has a list of every domain I connect to, my streaming providers know every video we watch, and the supermarkets and credit card companies know every item we buy at the shops, but still the brainwashing attempts continue for things we'd simply never buy.
No, the average person has no idea what "AI training" even is. Should the average person have an above-average IQ? Yes. Could they? No. Don't be average yourself.
> 39% in Gaza supported the attacks by Hamas into Israel in October 2023 that triggered the conflict, 32 percentage points lower than six months earlier[1].
If 71% of civilians support some group, then it is not a terrorist group but a government, and using "Gazans" isn't an overreach.
No, it is not included. However, there must be quite a lot of pictures on the internet for most cities. GeoGuessr's data is the same as Google's Street View data, and it probably contains billions of 360-degree photos.
I just saw a video on Reddit where a woman still managed to take a selfie while being literally face to face with a black bear. There’s definitely way too much video training data out there for everything.
> I just saw a video on Reddit where a woman still managed to take a selfie while being literally face to face with a black bear.
This is not uncommon. Bears aren't always tearing people apart, that's a movie trope with little connection to reality. Black bears in particular are smart and social enough to befriend their food sources.
But a hungry bear, or a bear with cubs, that's a different story. Even then bears may surprise you. Once in Alaska, a mama bear got me to babysit her cubs while she went fishing: https://arachnoid.com/alaska2018/bears.html
I'm pretty sure they are saying that GeoGuessr just pulls directly from Google Street View. There isn't a separate GeoGuessr dataset; it just pulls from Google's API (at least that's what Wikipedia says).
> Coding activities should be performed mostly with:
> * Gemini 2.5 PRO
> * Claude Opus 4
I think trying out all the LLMs for each task is highly underappreciated. There is no Pareto-optimal LLM for all skills. I give each prompt to 8 different LLMs using a Mac app. In my experience, while Gemini is consistently in the top 3 of 8, the gap between the best output and Gemini Pro's can be huge.
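For anyone who wants to try this without a dedicated app, the fan-out itself is trivial; a sketch in Python (query_model is a hypothetical stub here, wire it to whatever provider clients you actually use, and the model names are just examples from this thread):

    # Fan one prompt out to several models in parallel, then compare by eye.
    from concurrent.futures import ThreadPoolExecutor

    MODELS = ["gemini-2.5-pro", "claude-opus-4"]  # list all 8 of yours here

    def query_model(model: str, prompt: str) -> str:
        # hypothetical stub -- replace with the provider's real API call
        return f"[{model}] answer to: {prompt}"

    def fan_out(prompt: str) -> dict[str, str]:
        with ThreadPoolExecutor(max_workers=len(MODELS)) as pool:
            futures = {m: pool.submit(query_model, m, prompt) for m in MODELS}
            return {m: f.result() for m, f in futures.items()}

    for model, answer in fan_out("explain MFU in one line").items():
        print(model, "->", answer)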
I think that's a good decision. They know their market: it's intended for small projects and demos, mostly from non-tech people. And they did not build a half-baked editor that people would have further complaints about. AI-assisted coding is a whole different thing, and there are many players.