Hacker News

Cool work! Correct me if I'm wrong, but I believe that to use the new, more reliable OpenAI structured output, the response_format type should be "json_schema" instead of "json_object". It's been a lot more robust for me.



I may be reading the documentation wrong [0], but I think if you specify `json_schema`, you actually have to provide a schema. I get this error when I do `response_format={"type": "json_schema"}`:

     openai.BadRequestError: Error code: 400 - {'error': {'message': "Missing required parameter: 'response_format.json_schema'.", 'type': 'invalid_request_error', 'param': 'response_format.json_schema', 'code': 'missing_required_parameter'}}
I hadn't used OpenAI for data extraction before the announcement of Structured Outputs, so I'm not sure whether `type: json_object` did something different before. But supplying it alone as the response format seems to be the (low-effort) way to have the API infer the structure on its own.
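For reference, a minimal sketch of the shape that error is asking for, following the docs linked at [0] (the schema name and fields here are hypothetical):

```python
# With "type": "json_schema" you must also supply the schema itself
# under the "json_schema" key; "json_object" alone just asks for any JSON.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "person",  # hypothetical schema name
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"},
            },
            "required": ["name", "age"],
            "additionalProperties": False,
        },
    },
}

# Passed as-is to the call, e.g.:
# client.chat.completions.create(model=..., messages=..., response_format=response_format)
```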

[0] https://platform.openai.com/docs/guides/structured-outputs/s...


I've been using JSON schemas since forever with function calling. Does structured output just formalize things?


Function calling provides a "hint" in the form of a JSON schema for the LLM to follow; the models are trained to follow provided schemas. If you have really complicated or deeply nested schemas, they can become less reliable at generating schema-conformant JSON.
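For context, the schema "hint" rides along in the tool definition's "parameters" field, something like this sketch (tool name and fields are made up):

```python
# The model is trained (not forced) to emit arguments matching "parameters";
# deeply nested schemas here are where conformance tends to degrade.
tool = {
    "type": "function",
    "function": {
        "name": "extract_person",  # hypothetical tool name
        "description": "Extract a person from the text.",
        "parameters": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"},
            },
            "required": ["name", "age"],
        },
    },
}

# Supplied via e.g. client.chat.completions.create(..., tools=[tool])
```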

Structured outputs apply a context-free grammar to generation so that, at each token step, only tokens that keep the output conformant to the JSON schema are considered.

The benefit of doing this is predictability, but there's a trade-off in prediction quality: apparently constraining generation this way can take the model off the "happy path" of how it assumes text should be generated.

Happy to link you to some papers I've skimmed on it if you're interested!


Could you share some of those papers? I had a great discussion on this topic with Marc Fischer from the LMQL team [0] while at ICML earlier this year. Their work recommended decoding into natural-language templates with mad-lib-style constraints, to follow that "happy path" you refer to, instead of decoding directly into a (relatively more specific, latent) JSON schema [1]. Since you provide the template and know which tokens are targeted for generation, you can strip your structured content back out of the message. This technique also allows beam search, where you can optimize for the tokens that lead to your expected strings, avoiding some weird token-concatenation process. Really cool stuff!

[0] https://lmql.ai/

[1] https://arxiv.org/abs/2311.04954
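A toy sketch of that mad-lib idea (the template, the faked completion, and the regex extraction are all illustrative assumptions, not LMQL's actual machinery):

```python
# Decode into a natural-language template, then use the template itself
# to strip the structured fields back out of the message.
import re

template = "The person's name is {name} and they are {age} years old."
# The model would fill the holes; here we fake its completion:
completion = "The person's name is Alice and they are 30 years old."

# Turn the template into a regex with one capture group per hole.
pattern = (
    re.escape(template)
    .replace(r"\{name\}", r"(?P<name>.+?)")
    .replace(r"\{age\}", r"(?P<age>\d+)")
)
fields = re.match(pattern, completion).groupdict()
```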


Structured output uses "constrained decoding" under the hood. The JSON schema is converted to a context-free grammar so that when the model samples tokens, invalid tokens are masked to a probability of zero. It's much less likely to go off the rails.
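A toy illustration of that masking step (pure Python, nothing like the actual implementation; the allowed set is hard-coded here rather than derived from a real grammar compiled from the schema):

```python
# Logits for tokens the grammar disallows at this step are set to -inf
# before softmax, so their probability comes out exactly zero.
import math

def mask_logits(logits, allowed):
    # logits: dict token -> raw score; allowed: set of grammar-legal tokens
    return {t: (s if t in allowed else float("-inf")) for t, s in logits.items()}

def softmax(logits):
    m = max(v for v in logits.values() if v != float("-inf"))
    exps = {
        t: (math.exp(s - m) if s != float("-inf") else 0.0)
        for t, s in logits.items()
    }
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

logits = {"{": 2.0, '"': 1.5, "hello": 3.0}
# Suppose the grammar says a JSON object must open here: only "{" is legal.
probs = softmax(mask_logits(logits, allowed={"{"}))
# "hello" had the highest raw score, but it can never be sampled.
```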



