samus on April 5, 2024 | on: JetMoE: Reaching LLaMA2 performance with 0.1M doll...
You're correct only about Qwen's MoE. I presume Chinese model builders feel more pressure to use their GPU time efficiently because of sanctions.