
The larger model (235b) on chat produced a rather impressive answer on a small coding task I gave it. But Qwen-30B-A3B gave a worse result on the same task than Qwen 2.5 does.

"Write a Golang program that merges huge presorted text files, just like sort -m does". Quite often models need "use heap" as guidance, but this time the big model figured it out by itself.
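For anyone wondering what the "use heap" hint buys you, here is a minimal sketch of the kind of answer I'd hope for. It is not the model's output, just my own illustration. The names (item, lineHeap, mergeSorted, mergeStrings) are made up for this example, lines are compared byte-wise (roughly what sort -m does under LC_ALL=C, which is an assumption on my part), and bufio.Scanner's default line-length limit applies.

```go
package main

import (
	"bufio"
	"container/heap"
	"io"
	"strings"
)

// item holds the current line of one input and the index of that input.
// (Hypothetical name, used only for this sketch.)
type item struct {
	line string
	src  int
}

// lineHeap is a min-heap of items ordered by byte-wise line comparison.
type lineHeap []item

func (h lineHeap) Len() int { return len(h) }
func (h lineHeap) Less(i, j int) bool {
	if h[i].line != h[j].line {
		return h[i].line < h[j].line
	}
	return h[i].src < h[j].src // on ties, prefer the earlier input
}
func (h lineHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h *lineHeap) Push(x any)   { *h = append(*h, x.(item)) }
func (h *lineHeap) Pop() any {
	old := *h
	n := len(old)
	it := old[n-1]
	*h = old[:n-1]
	return it
}

// mergeSorted streams the merged lines of the presorted inputs to w.
// Only one line per input is held in memory, so input size does not matter.
func mergeSorted(w io.Writer, inputs ...io.Reader) error {
	scanners := make([]*bufio.Scanner, len(inputs))
	h := &lineHeap{}
	for i, r := range inputs {
		scanners[i] = bufio.NewScanner(r)
		if scanners[i].Scan() {
			heap.Push(h, item{scanners[i].Text(), i})
		} else if err := scanners[i].Err(); err != nil {
			return err
		}
	}
	out := bufio.NewWriter(w)
	for h.Len() > 0 {
		top := (*h)[0] // smallest pending line across all inputs
		if _, err := out.WriteString(top.line + "\n"); err != nil {
			return err
		}
		sc := scanners[top.src]
		if sc.Scan() {
			// Replace the root with the next line from the same input.
			(*h)[0] = item{sc.Text(), top.src}
			heap.Fix(h, 0)
		} else {
			if err := sc.Err(); err != nil {
				return err
			}
			heap.Pop(h) // this input is exhausted
		}
	}
	return out.Flush()
}

// mergeStrings is a small demo helper that merges in-memory inputs.
func mergeStrings(inputs ...string) (string, error) {
	readers := make([]io.Reader, len(inputs))
	for i, s := range inputs {
		readers[i] = strings.NewReader(s)
	}
	var sb strings.Builder
	err := mergeSorted(&sb, readers...)
	return sb.String(), err
}
```

With k inputs, each output line costs O(log k) heap work instead of an O(k) scan over all the current lines, which is the whole point of the hint. A real tool would open each command-line argument with os.Open, pass the files to mergeSorted along with os.Stdout, and probably raise the scanner's buffer limit with Scanner.Buffer for very long lines.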


