
MUCH less training for SLIGHTLY worse results. It's a huge benefit to be able to make this trade-off.


Is the reverse also true? If you have the training data necessary for "good" results on GPT-2, is it generally correct to assume that GPT-2 would give better results on your task than GPT-3?


If you can answer this question without running both models over the data set, you've got a very good paper on your hands.



