
MoE models are harder to fine-tune, and they don't solve the biggest problem, which is GPU memory use: only a few experts run per token, but all of the experts' weights still have to be loaded.
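To put rough numbers on the memory point, here is a back-of-the-envelope sketch. The configuration is made up for illustration (8 experts, 2 active per token, 5B parameters per expert, 2B shared, 16-bit weights), the helper names are hypothetical, and it assumes every expert is kept on the GPU with no offloading. It only counts weights, not activations or KV cache.

```python
def moe_param_counts(n_experts, active_experts, expert_params, shared_params):
    """Return (total, active) parameter counts for a simplified MoE model.

    expert_params is the parameter count of one expert summed over all layers;
    shared_params covers attention, embeddings, and other non-expert weights.
    """
    total = shared_params + n_experts * expert_params
    active = shared_params + active_experts * expert_params
    return total, active


def weight_memory_gb(n_params, bytes_per_param=2):
    """Memory needed just to hold the weights, in GB (2 bytes = fp16/bf16)."""
    return n_params * bytes_per_param / 1e9


# Hypothetical configuration, not taken from any real model.
total, active = moe_param_counts(
    n_experts=8,
    active_experts=2,
    expert_params=5_000_000_000,
    shared_params=2_000_000_000,
)

print(f"weights that must be resident: {weight_memory_gb(total):.0f} GB")
print(f"weights actually used per token: {weight_memory_gb(active):.0f} GB")
```

Under these assumptions the per-token compute looks like a 12B model, but the GPU still has to hold all 42B parameters.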


