
This ignores batching (token generation is much more efficient in batches), and I strongly suspect it was itself written by AI, given the heavy use of bullets.


The “X—not Y” pattern is also a dead giveaway.


Is it common for adjacent tokens to use the same weights in a memory cache?



