Hacker News
Calculating the cost of a Google DeepMind paper (152334h.github.io)
303 points by 152334H 6 months ago | 150 comments
Knowing Enough About MoE to Explain Dropped Tokens in GPT-4 (152334h.github.io)
3 points by 152334H on Aug 8, 2023 | 1 comment
Non-determinism in GPT-4 is caused by Sparse MoE (152334h.github.io)
397 points by 152334H on Aug 4, 2023 | 181 comments
Why can't TorToiSe be fine-tuned? (152334h.github.io)
1 point by 152334H on Feb 11, 2023
