
More early impressions on performance: besides the endpoint erroring out at a higher rate than OpenAI's, time-to-first-token (TTFT) is also much slower :(

p50: 2.14s p95: 3.02s

And these aren't super long prompts either. For comparison, GPT-4 TTFT:

p50: 0.63s p95: 1.47s
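
For anyone who wants to reproduce this kind of comparison, here is a rough sketch of how TTFT percentiles could be collected. It is not the harness behind the numbers above. `time_to_first_token`, `percentile`, and `fake_stream` are hypothetical names, and `fake_stream` is only a stand-in for a real streaming API call, so swap in whatever client you actually use.

```python
import statistics
import time
from typing import Callable, Iterable, List


def time_to_first_token(start_stream: Callable[[], Iterable[str]]) -> float:
    """Seconds from issuing the request until the first chunk arrives."""
    start = time.perf_counter()
    for _chunk in start_stream():
        # Stop at the first chunk; the rest of the stream is irrelevant for TTFT.
        return time.perf_counter() - start
    raise RuntimeError("stream ended without producing a token")


def percentile(samples: List[float], pct: int) -> float:
    """pct-th percentile (1..99) using linear interpolation over the samples."""
    # quantiles(n=100) returns 99 cut points; index pct - 1 is the pct-th one.
    return statistics.quantiles(samples, n=100, method="inclusive")[pct - 1]


def fake_stream() -> Iterable[str]:
    """Hypothetical stand-in for a streaming completion request."""
    time.sleep(0.01)  # simulated delay before the first token
    yield "hello"
    yield " world"


samples = [time_to_first_token(fake_stream) for _ in range(20)]
print(f"p50: {percentile(samples, 50):.2f}s p95: {percentile(samples, 95):.2f}s")
```

With a real endpoint you would want the same prompts, region, and time window for both providers, and a lot more than 20 samples before trusting the p95.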



