
No, they could be using any of the variants of pointwise scalar trig-style embedding; one imagines it's at least somewhat customized to their particular training setup.

It was just an example of a modern positional encoding. I regret implying inside knowledge at that level of detail. They're doing something clever with scalar pointwise positional encoding, but as for what exactly, who knows.
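
For anyone who wants something concrete to picture, here is a minimal sketch of the classic sinusoidal encoding, which is one well-known member of the "trig-style" family. It is only an illustration and not a claim about what they actually use; the function name `sinusoidal_encoding` and the `base` default of 10000 are my own assumptions.

```python
import math


def sinusoidal_encoding(position, dim, base=10000.0):
    """Encode one scalar position as `dim` sin/cos features.

    Illustrative only: `base` and the interleaved sin/cos layout are
    assumptions borrowed from the classic sinusoidal scheme.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    features = []
    for i in range(dim // 2):
        # Frequencies fall geometrically from 1 toward 1/base.
        freq = base ** (-2.0 * i / dim)
        angle = position * freq
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features


if __name__ == "__main__":
    for pos in range(3):
        print(pos, [round(x, 4) for x in sinusoidal_encoding(pos, 8)])
```

Each position is mapped independently ("pointwise") from a single scalar, and every sin/cos pair lies on the unit circle, so a shift in position corresponds to a rotation of each pair.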


