Hacker News
CuriouslyC
on April 19, 2024
on:
Llama 3 8B is almost as good as Wizard 2 8x22B
MoE models are harder to fine-tune, and they don't solve the biggest problem, which is GPU memory use.
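To put rough numbers on the memory point: an MoE model only runs a few experts per token, but every expert's weights still have to sit in GPU memory. Here is a back-of-the-envelope sketch, not a measurement. The parameter counts (about 8B for the dense model, about 141B total and about 39B active for an 8x22B MoE) and 16-bit weights are assumptions for illustration, and the function name is made up. It counts weights only and ignores KV cache and activations.

```python
def weight_memory_gib(params_billions: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed for model weights alone, in GiB.

    bytes_per_param=2 assumes 16-bit weights (an assumption, not from the comment).
    """
    return params_billions * 1e9 * bytes_per_param / 2**30


# Assumed round parameter counts, for illustration only.
dense_8b = weight_memory_gib(8)      # dense model: all weights used for every token
moe_total = weight_memory_gib(141)   # MoE: every expert must be resident in memory
moe_active = weight_memory_gib(39)   # MoE: weights actually used per token

print(f"dense 8B weights:        {dense_8b:.0f} GiB")
print(f"MoE weights to hold:     {moe_total:.0f} GiB")
print(f"MoE weights used/token:  {moe_active:.0f} GiB")
```

Under these assumptions the MoE computes like a much smaller model per token, yet the memory bill is set by the total parameter count, which is the gap the comment is pointing at.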