Hacker News
blackeyeblitzar · 6 months ago · on: The Illustrated DeepSeek-R1
How does such a distillation work in theory? They don’t have weights from OpenAI’s models, and can only call their APIs, right? So how can they actually build off of it?
moralestapia · 6 months ago
Like RLHF, but the HF part is GPT-4 instead.
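To make the idea concrete: one common black-box recipe is to query the teacher's API for completions and fine-tune the student on those (prompt, completion) pairs with ordinary supervised learning. This is a minimal sketch of the data-collection step only; `query_teacher` is a hypothetical stub standing in for a real API call, not any actual OpenAI or DeepSeek code.

```python
# Black-box distillation sketch: the student never sees the teacher's
# weights or logits, only its text outputs obtained via an API.

def query_teacher(prompt: str) -> str:
    # Hypothetical stub. In practice this would be an HTTPS call to the
    # teacher model's chat/completions endpoint.
    return f"<teacher answer to: {prompt}>"

def build_distillation_set(prompts):
    # Each (prompt, teacher_output) pair becomes one supervised
    # fine-tuning example for the student model.
    return [{"prompt": p, "completion": query_teacher(p)} for p in prompts]

prompts = ["Explain backprop.", "What is RLHF?"]
dataset = build_distillation_set(prompts)

# The student is then fine-tuned with ordinary next-token cross-entropy
# on these completions -- weight access to the teacher is never needed.
for example in dataset:
    print(example["prompt"], "->", example["completion"])
```

The fine-tuning step itself is standard SFT on the collected pairs, which is why the comment above frames it as RLHF with the human feedback replaced by a stronger model's outputs.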
KarraAI · 6 months ago
How do you ensure the student model learns robust generalizations rather than just surface-level mimicry?
moralestapia · 6 months ago
No idea, as I don't work on that, but my guess would be that the higher the 'n' (the number of distilled training examples), the more model A (the student) approaches model B (the teacher).
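The intuition in that guess can be illustrated with a toy model: a lookup-table "student" trained on n teacher-labelled inputs agrees with the teacher on a growing fraction of the input space as n grows, while still failing (pure mimicry, no generalization) on everything it hasn't seen. All names here are illustrative; this is not from any real distillation codebase.

```python
# Toy illustration of "higher n -> student approaches teacher".

def teacher(x: int) -> int:
    # Arbitrary deterministic stand-in for the teacher model.
    return (3 * x + 7) % 100

DOMAIN = list(range(100))

def train_student(n: int) -> dict:
    # A memorizing student: it stores the teacher's answers on the
    # first n inputs and knows nothing else.
    return {x: teacher(x) for x in DOMAIN[:n]}

def agreement(student: dict) -> float:
    # Fraction of the whole domain where the student matches the
    # teacher; unseen inputs get a default answer of -1 (always wrong).
    return sum(student.get(x, -1) == teacher(x) for x in DOMAIN) / len(DOMAIN)

print(agreement(train_student(10)))  # -> 0.1
print(agreement(train_student(90)))  # -> 0.9
```

A memorizing student only ever reaches the teacher in the limit of covering the whole input space, which is exactly the surface-level-mimicry worry in the question above; real distillation bets that a parametric student interpolates between the sampled points instead of defaulting to ignorance.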