
> Chinchilla scaling was only good for academics

I don't know if it's only good for academics. The point of the paper (as it says itself) is a scaling law for the optimal loss given a fixed compute budget. By design it doesn't address inference costs and isn't a recipe for "how you should train an LLM for your use case".

If you're serving LLMs in a low-throughput, high-cost scenario, optimizing loss at the expense of inference cost may very well be your goal, or likewise if you can't pay up front for 25x the compute. A rough sketch of that trade-off is below.
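
Rough illustration (not the paper's exact fitted constants; this assumes the common C ≈ 6·N·D FLOPs approximation and the ~20 tokens-per-parameter Chinchilla heuristic, with hypothetical numbers):

    # Compute-optimal vs. overtrained allocation of a fixed training budget.
    # Uses C ~= 6*N*D (FLOPs) and the ~20 tokens-per-parameter heuristic.

    def chinchilla_optimal(train_flops):
        # D ~= 20*N, so 6*N*(20*N) = C  =>  N = sqrt(C / 120)
        n_params = (train_flops / 120) ** 0.5
        n_tokens = 20 * n_params
        return n_params, n_tokens

    def overtrained(train_flops, n_params):
        # Fix a smaller model and spend the same budget on more tokens instead.
        return train_flops / (6 * n_params)

    C = 1e24  # hypothetical training budget in FLOPs

    n_opt, d_opt = chinchilla_optimal(C)
    print(f"Chinchilla-optimal: {n_opt:.2e} params, {d_opt:.2e} tokens")

    n_small = n_opt / 3          # deliberately undersized model
    d_small = overtrained(C, n_small)
    print(f"Overtrained:        {n_small:.2e} params, {d_small:.2e} tokens")

    # The overtrained model reaches a worse loss for the same training compute,
    # but each forward pass costs ~2*N FLOPs, so serving it is ~3x cheaper.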



