
How does such a distillation work in theory? They don’t have weights from OpenAI’s models, and can only call their APIs, right? So how can they actually build off of it?


Like RLHF, but the "HF" (human feedback) part is GPT-4 instead of humans: you sample the teacher's outputs through the API and train the student to reproduce them.
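A minimal sketch of the data-collection side, assuming a hypothetical `teacher_generate` stub standing in for the teacher model's API: sample prompts, record the teacher's completions, and serialize them as a JSONL fine-tuning set for the student.

```python
import json

def teacher_generate(prompt: str) -> str:
    # Stub standing in for an API call to the teacher model (hypothetical);
    # in practice this would be a chat-completion request.
    return f"Teacher answer to: {prompt}"

# Prompts you control; the teacher's weights stay hidden,
# only its input/output behaviour is sampled.
prompts = ["What is 2+2?", "Explain distillation in one sentence."]

dataset = [{"prompt": p, "completion": teacher_generate(p)} for p in prompts]

# JSONL is a common format for supervised fine-tuning data.
jsonl = "\n".join(json.dumps(ex) for ex in dataset)
print(len(dataset))
```

The student is then fine-tuned on these (prompt, completion) pairs with an ordinary supervised objective, so no access to the teacher's weights is needed.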


How do you ensure the student model learns robust generalizations rather than just surface-level mimicry?


No idea as I don't work on that, but my guess would be that the higher the number of teacher samples 'n', the more closely student model A approaches teacher model B.



