
BERT has a 15% masking rate, which seems correlated. Also, 90% is what works well when you are trying to do label smoothing using entropy minimisation. What's going on?


