Hacker News

Evals aside, why are American labs not able to release open-source models at the same speed?




The Chinese labs can't compete on inference scale because they have been prevented from accessing the most efficient chips. But since training costs are a mere fraction of inference costs these days, they can at least hurt the American companies that are generating billions via inference services.

If you can’t beat ‘em, at least pour some sand into their moat, giving China some time to perfect its own nanometer-scale fabrication. It’s a society-wide effort.


They don't release such huge open weights models because people who run open weights don't have the capability to run them effectively. Instead they concentrate on models like Gemma 3, which ranges from 1B to 27B parameters and, when quantized, fits neatly into the VRAM you can get on a consumer GPU.
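To see why the 27B ceiling lines up with consumer hardware, here is a back-of-envelope VRAM estimate. The 27B figure comes from the comment above; the bits-per-weight choices and the ~15% overhead factor (KV cache, activations) are illustrative assumptions, not measured numbers.

```python
# Rough VRAM estimate for a dense model at different quantization levels.
# Overhead factor (~1.15) for KV cache and activations is an assumption.

def vram_gb(params_b: float, bits_per_weight: float, overhead: float = 1.15) -> float:
    """Approximate GB of VRAM: weight bytes times an overhead factor."""
    weight_bytes = params_b * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

for bits in (16, 8, 4):
    print(f"27B @ {bits}-bit: ~{vram_gb(27, bits):.1f} GB")
```

At 16-bit, 27B parameters need roughly 60 GB and are out of reach for a single consumer card, but at 4-bit the estimate drops to roughly 15 GB, which is why a quantized 27B model lands comfortably on a 24 GB GPU.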

> They don't release such huge open weights models because people who run open weights don't have the capability to run them effectively

This is a naive take. There are multiple firms that can host these models for you, or you can host them yourself by renting GPUs. Thousands of firms could also host open-source models independently. They don’t release them because they fear competition and losing their competitive advantage. If it weren’t for Chinese companies open-sourcing their models, we’d be limited to using closed-source, proprietary models from the U.S., especially considering the recent LLaMA fiasco.


Given that Google has its own interests at heart, the question isn't 'why doesn't Google release models that allow other companies to compete with them?' but 'what is the reasoning behind the models they do release?' And that reasoning is: for research, and for people to use personally on their own hardware.

We should be asking why Meta released the large Llama models and why the Chinese are releasing large models. I can't figure out a reason for it except prestige.


That shouldn't be the case here. Yes, it's memory-bandwidth-limited, but this is an MoE with 22B active parameters. As long as the whole thing fits in RAM, it should be tolerable. It's right at the limit, though.
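The "memory-bandwidth-limited" point can be sketched with simple arithmetic: each decoded token must stream roughly the active parameters through memory once, so bandwidth divided by active-weight bytes bounds tokens per second. The 22B active figure is from the comment above; the bandwidth numbers and 4-bit quantization are illustrative assumptions.

```python
# Rough decode-speed bound for a memory-bandwidth-bound MoE model:
# each token reads ~all active weights once, so
# tokens/s ≈ bandwidth / bytes-of-active-weights.
# Bandwidth figures below are assumed ballpark values, not measurements.

def tokens_per_sec(active_params_b: float, bits_per_weight: float, bandwidth_gbs: float) -> float:
    bytes_per_token = active_params_b * 1e9 * bits_per_weight / 8
    return bandwidth_gbs * 1e9 / bytes_per_token

for name, bw in (("dual-channel DDR5, ~100 GB/s", 100),
                 ("unified memory, ~400 GB/s", 400)):
    print(f"{name}: ~{tokens_per_sec(22, 4, bw):.1f} tok/s at 4-bit")
```

With 22B active parameters at 4-bit (~11 GB read per token), ordinary desktop RAM at ~100 GB/s gives on the order of 9 tok/s, which is why "fits in RAM" is the threshold for tolerable rather than fast.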

They could, they’re just greedy, self-serving, and short-sighted. China’s doing the AI equivalent of Belt and Road to reap tremendous strategic advantages, as well as encourage large-scale domestic innovation.




