You need LLM Ops. YC happens to have invested in Langfuse; if you're serious about tracking metrics, you'll appreciate the rest, too.
And before you ask: yes, you can accommodate both cached-content and batch-completion discounts. It just needs a bit of logic in your completion-layer code.
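For what that "bit of logic" might look like, here's a rough sketch in Python. Everything in it is an assumption for illustration: the `Pricing` fields, the `completion_cost` helper, and the rates are hypothetical placeholders, not any provider's real prices or any tool's API. It also assumes the two discounts stack multiplicatively, which you should check against your provider's billing rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    # Hypothetical placeholder rates, not any provider's real prices.
    input_per_mtok: float       # USD per 1M uncached input tokens
    output_per_mtok: float      # USD per 1M output tokens
    cached_input_factor: float  # multiplier for cached input tokens (e.g. 0.5)
    batch_factor: float         # multiplier for the whole request when batched


def completion_cost(pricing, input_tokens, cached_tokens, output_tokens, batched=False):
    """Return the USD cost of one completion, applying both discounts.

    Assumes cached tokens are a subset of input tokens and that the batch
    discount multiplies the already cache-discounted total.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")

    uncached_tokens = input_tokens - cached_tokens
    cost = (
        uncached_tokens * pricing.input_per_mtok
        + cached_tokens * pricing.input_per_mtok * pricing.cached_input_factor
        + output_tokens * pricing.output_per_mtok
    ) / 1_000_000

    if batched:
        cost *= pricing.batch_factor
    return cost


# Example with made-up rates.
pricing = Pricing(
    input_per_mtok=2.0,
    output_per_mtok=8.0,
    cached_input_factor=0.5,
    batch_factor=0.5,
)
cost = completion_cost(
    pricing,
    input_tokens=1_000_000,
    cached_tokens=500_000,
    output_tokens=1_000_000,
    batched=True,
)
```

You'd compute this once per call in the completion layer and attach the number to whatever trace or metric record you already emit.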