
The paper shows that the speed is comparable to transformer models, and faster for smaller models at "long" sequence lengths like 8k.



