DeepSeek's main run cost $6M. qwen3-30b-a3b, which is ranked 13th, would probably cost a few hundred thousand dollars.
The GPU cost of the final training run isn't the biggest chunk of the cost, and you could probably replicate the results of models like Llama 3 very cheaply. It's the cost of experiments, researchers, and data collection that brings the overall cost 1 or 2 orders of magnitude higher.
It wasn't a lie, it was a misrepresentation of the total cost. It's not hard to calculate the cost of the training run, though: it takes roughly 6 * active parameters * tokens FLOPs[1]. To get the number of seconds, divide by FLOP/s * MFU, where MFU is around 45% on H100s for large enough models[2].
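As a rough back-of-the-envelope in Python (the parameter and token counts below are placeholder assumptions, not any lab's reported numbers; H100 dense BF16 peak is ~989 TFLOP/s):

    # Rough training-cost estimate: FLOPs ~= 6 * active_params * tokens
    active_params = 37e9    # assumption: ~37B active parameters (MoE)
    tokens        = 14e12   # assumption: ~14T training tokens
    peak_flops    = 989e12  # H100 dense BF16 peak, FLOP/s
    mfu           = 0.45    # model FLOPs utilization

    total_flops = 6 * active_params * tokens
    gpu_hours   = total_flops / (peak_flops * mfu) / 3600

    print(f"total FLOPs: {total_flops:.2e}")
    print(f"H100-hours at 45% MFU: {gpu_hours:,.0f}")
    print(f"cost at $2/GPU-hour: ${2 * gpu_hours:,.0f}")  # lands in the few-$M range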
That paper's 5 years old at this point, dating back to when Amodei was still an OpenAI employee. Has any newer work superseded it, or are those assumptions still considered solid?
Those assumptions are still the same, although context lengths have now increased enough that the n^2 attention part is non-negligible. See the repo for the full FLOP calculation[1].
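To see why the n^2 term starts to matter, here's a sketch using Kaplan-et-al-style accounting, where attention adds roughly 6 * n_layers * ctx_len * d_model FLOPs per trained token (the model config below is a placeholder assumption, not a real model's):

    # Per-token FLOPs with the attention term included
    # (backward pass counted as ~2x forward, hence the factors of 6)
    def training_flops(params, tokens, n_layers, d_model, ctx_len):
        per_token = 6 * params + 6 * n_layers * ctx_len * d_model
        return per_token * tokens

    # placeholder 7B-dense-style config, 2T tokens
    base    = 6 * 7e9 * 2e12
    at_4k   = training_flops(7e9, 2e12, n_layers=32, d_model=4096, ctx_len=4096)
    at_128k = training_flops(7e9, 2e12, n_layers=32, d_model=4096, ctx_len=128_000)
    print(f"attention overhead at 4k ctx:   {at_4k / base - 1:.0%}")   # ~8%
    print(f"attention overhead at 128k ctx: {at_128k / base - 1:.0%}") # ~240%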
The big difference with Stephen Hawking is that he was not born disabled, he became disabled during graduate school. Even in 2025, a human who is born blind and severely paralyzed (so they cannot speak or sign) will probably never learn calculus, regardless of innate ability. Perhaps in the medium term technology will improve.
That said, another major difference is psychology. Switching animals, it seems plausible to me that chimpanzees are theoretically capable of doing basic calculus as a matter of pattern-matching. But you can't force them to study it! Basic calculus is too tedious and high-effort to learn for a mere banana; you need something truly valuable like "guaranteed admission to the flagship state university" to get human children to do it. But we don't have an equivalent offer for chimps. (Likewise, an Isaac Newton-level dog might still find calculus exceptionally boring compared to chasing squirrels.)
That's helpful if they live in the same country, you can figure out who the 4chan poster was, the police are interested (or you're willing to risk paying a lawyer), you're willing to sink the time into pursuing such action (and, if criminal, risk an adversarial LEO interaction), and you're satisfied knowing hundreds of others may be doing the same and won't be deterred. Of course, friends and co-workers are too close to you to post publicly when they generate it. Thankfully, the Taylor Swift laws in the US have stopped the generation of nonconsensual imagery and video of their namesake (they haven't).
My daughter's school posted pictures of her online without an opt-out, but she's also on Facebook via family members, and it's just kind of... well beyond the point of trying to suppress. Probably just best to accept that people can imagine you naked, at any age, doing anything. What's your neighbor doing with the images saved from his Ring camera pointed at the sidewalk? :shrug:
That's a bad analogy. Most people including me do expect that their "public" data is used for AI training. Based on the ads everyone gets, most people know perfectly well that anything they post online will be used for AI.
Are you trying to argue that 10 years ago, when I uploaded my resume to LinkedIn, I should have known it'd be used for AI training?
Or that a teenager who signed up for Facebook should have known that the embarrassing things they were posting would go on to train AI and were, as you called it, "public"?
What about the blog I started 25 years ago and then took down, but which lives on in the GeoCities archive? Was I supposed to know it'd go to an AI overlord corporation when I was in middle school writing about dragon photos I found on Google?
And we're not even getting into data breaches, or something that was uploaded as private and then sold when the corporation changed their privacy policy decades after it was uploaded.
It's not a bad analogy when you don't give all the graces to corporations and none to the exploited.
> Most people including me do expect that their "public" data is used for AI training.
Based on what ordinary people have been saying, I don't think this is true. Or, maybe it's true now that the cat is out of the bag, but I don't think most people expected this before.
Most tech-oriented people did, of course, but we're a small minority. And even amongst our subculture, a lot of people didn't see this abuse coming. I didn't, or I would have removed all of my websites from the public web years earlier than I did.
In fact, it's the opposite. People who aren't into tech think Instagram is listening to them 24/7 to pick their feed and ads. There was even a hoax in my area among elderly groups that WhatsApp was using profile photos for illegal activity, and for a while many people removed their photos.
> I didn't, or I would have removed all of my websites from the public web years earlier than I did.
Your comment is public information. In fact, posting anything on HN is a surefire way to hand your content over for AI training.
> People who aren't into tech think Instagram is listening to them 24/7 to pick their feed and ads
True, but that's worlds away from thinking that your data will be used to train genAI.
> In fact, posting anything on HN is a surefire way to hand your content over for AI training.
Indeed so, but HN seems to be a bad habit I just can't kick. However, my comments here are the entirety of what I put up on the open web and I intentionally keep them relatively shallow. I no longer do long-form blogging or make any of my code available on the open web.
However, you're right. Leaving HN is something that I need to do.
I'm not sure what you mean here. In context I suspect you mean "because ads are chosen based on knowledge about you"? But that's really the opposite of my experience (UK).
Ads now go hard on brainwashing: the same advert over and over, almost never anything I want to buy.
YouTube suggestions are pretty much in line with my previous viewing, though.
My ISP has a list of every domain I connect to, my streaming providers know every video we watch, and the supermarkets and credit card companies know every item we buy at the shops, but still the brainwashing attempts continue for things we'd simply never buy.
No, the average person has no idea what "AI training" even is. Should the average person have an above-average IQ? Yes. Could they? No. Don't be average yourself.
> 39% in Gaza supported the attacks by Hamas into Israel in October 2023 that triggered the conflict, 32 percentage points lower than six months earlier[1].
If 71% of civilians support some group, then it is not a terrorist group but a government, and using "Gazans" isn't an overreach.
No, it is not included. However, there must be quite a lot of pictures on the internet for most cities. GeoGuessr's data is the same as Google's Street View data, and it probably contains billions of 360-degree photos.
I just saw a video on Reddit where a woman still managed to take a selfie while being literally face to face with a black bear. There’s definitely way too much video training data out there for everything.
> I just saw a video on Reddit where a woman still managed to take a selfie while being literally face to face with a black bear.
This is not uncommon. Bears aren't always tearing people apart, that's a movie trope with little connection to reality. Black bears in particular are smart and social enough to befriend their food sources.
But a hungry bear, or a bear with cubs, that's a different story. Even then bears may surprise you. Once in Alaska, a mama bear got me to babysit her cubs while she went fishing: https://arachnoid.com/alaska2018/bears.html
I'm pretty sure they are saying that GeoGuessr just pulls directly from Google Street View. There isn't a separate GeoGuessr dataset; it just pulls from Google's API (at least that's what Wikipedia says).
> Coding activities should be performed mostly with:
> * Gemini 2.5 PRO
> * Claude Opus 4
I think trying out all the LLMs for each task is highly underappreciated. There is no Pareto-optimal LLM for all skills. I give each prompt to 8 different LLMs using a Mac app. In my experience, while Gemini is consistently in the top 3 of 8, the gap between the best output and Gemini Pro's can be huge.
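For anyone who wants to try this without a dedicated app, the fan-out itself is trivial; a sketch in Python (query_model is a hypothetical stub here, wire it to whatever provider clients you actually use, and the model names are just examples from this thread):

    # Fan one prompt out to several models in parallel, then compare by eye.
    from concurrent.futures import ThreadPoolExecutor

    MODELS = ["gemini-2.5-pro", "claude-opus-4"]  # list all 8 of yours here

    def query_model(model: str, prompt: str) -> str:
        # hypothetical stub -- replace with the provider's real API call
        return f"[{model}] answer to: {prompt}"

    def fan_out(prompt: str) -> dict[str, str]:
        with ThreadPoolExecutor(max_workers=len(MODELS)) as pool:
            futures = {m: pool.submit(query_model, m, prompt) for m in MODELS}
            return {m: f.result() for m, f in futures.items()}

    for model, answer in fan_out("explain MFU in one line").items():
        print(model, "->", answer)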
I think that's a good decision. They know their market: it's intended for small projects and demos, mostly from non-tech people. And they did not build a half-baked editor that people would have further complaints about. AI-assisted coding is a whole different thing, and there are many players.