
I’m most excited about Qwen3-30B-A3B. Seems like a good choice for offline/local-only coding assistants.

Until now, I've found that open-weight models were either not as good as their proprietary counterparts or too slow to run locally. This looks like a good balance.



It would be interesting to try, but on the Aider benchmark the dense 32B model scores 50.2, while no Aider score has been published for the 30B-A3B, so it may be poor.


Is that Qwen 2.5 or Qwen 3? I don't see Qwen 3 on the Aider leaderboard yet: https://aider.chat/docs/leaderboards/


As a human who asks AI to edit up to 50 SLOC at a time, is there value in models that score less than 50%? I'm using `gemini-2.0-flash-001`, though.


The Aider score mentioned in the GP was published by Alibaba themselves and is not yet on Aider's leaderboard. The Aider team will probably run their own tests and may come up with a different score.


Curious: why the 30B MoE over the 32B dense for local coding?

I don't know much about the benchmarks, but the two coding scores look similar.


The MoE version with 3B active parameters will run significantly faster (tokens/second) on the same hardware, by roughly an order of magnitude (e.g. ~4 t/s vs ~40 t/s).
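
Back-of-envelope, assuming decode is memory-bandwidth bound and ~1 byte per weight at 8-bit quantization (the 960 GB/s figure is just an example, roughly a 7900 XTX's rated bandwidth; real-world throughput lands well below these upper bounds due to overheads):

    tokens/s ≈ bandwidth / bytes of weights read per token
    dense 32B:       960 GB/s / 32 GB ≈  30 t/s
    MoE (3B active): 960 GB/s /  3 GB ≈ 320 t/s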


> The MoE version with 3B active parameters

~34 tok/s on a Radeon RX 7900 XTX under today's Debian 13.


And VRAM use?


~18.6 GiB, according to nvtop.

ollama 0.6.6 invoked with:

    # server: enable flash attention and quantize the KV cache
    # to q8_0 to reduce VRAM use
    OLLAMA_FLASH_ATTENTION=1 OLLAMA_KV_CACHE_TYPE=q8_0 ollama serve

    # client
    ollama run --verbose qwen3:30b-a3b
~19.8 GiB with:

    /set parameter num_ctx 32768
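
If you want the 32k context without typing /set each session, an ollama Modelfile works too (a minimal sketch; the name qwen3-30b-32k is just an example):

    # Modelfile
    FROM qwen3:30b-a3b
    PARAMETER num_ctx 32768

    # build the variant and run it
    ollama create qwen3-30b-32k -f Modelfile
    ollama run qwen3-30b-32k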


Very nice; it should fit on a 3090 as well.

TY for this.

Update: wow, it's quite fast: 70-80 t/s in LM Studio with a few other applications using the GPU.


Could this variant be run on a CPU?


Probably very well
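
Untested sketch: with only ~3B parameters active per token, a 4-bit GGUF should decode at usable speeds on a modern many-core CPU via llama.cpp (the filename below is hypothetical; substitute whichever quant you download):

    # CPU-only llama.cpp: use all cores, 8k context
    ./llama-cli -m Qwen3-30B-A3B-Q4_K_M.gguf \
        -t $(nproc) -c 8192 \
        -p "Write a binary search in Python."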



