You can use structured generation instead of fiddling with the prompt, which is unreliable. https://github.com/outlines-dev/outlines
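
Roughly: instead of asking nicely in the prompt, you declare the shape of the output and the library enforces it while sampling. A minimal sketch, assuming the outlines 0.x API and a placeholder model name (exact function names may differ between versions):

    import outlines

    # Load a model through the Hugging Face transformers backend
    # (the model name here is just an example).
    model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")

    # Constrain generation to a date in YYYY-MM-DD form; the sampler can
    # only pick tokens that keep the output consistent with this regex.
    generator = outlines.generate.regex(model, r"[0-9]{4}-[0-9]{2}-[0-9]{2}")

    print(generator("When was the first moon landing? Answer with a date: "))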


Does this Python package control the LLM using something other than text? Or is the end result still that the package wraps your prompt in additional text with instructions that become part of the prompt itself?


Looks like it actually changes how you do token generation to conform to a given context-free grammar. It's a way to structure how you sample from the model rather than a tweak to the prompt, so it's more efficient and guarantees that the output matches the formal grammar.

There's a reference to the paper that describes the method at the bottom of the README: https://arxiv.org/pdf/2307.09702
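
To make that concrete: the paper's trick is to turn the grammar (or a regex) into a finite-state machine and precompute, for every FSM state, which vocabulary entries are legal, so constraining each decoding step is just a lookup. A toy illustration of that index (not the library's actual code), for a "grammar" that accepts one or more digits:

    # Toy vocabulary of multi-character tokens plus an end-of-sequence marker.
    vocab = {0: "a", 1: "7", 2: "42", 3: "x9", 4: "<eos>"}

    def fsm_step(state, token_text):
        """Advance the digits-only FSM; None means the token is illegal here."""
        if token_text == "<eos>":
            return 2 if state == 1 else None  # may only stop after a digit
        if token_text.isdigit():
            return 1                          # stay inside the run of digits
        return None

    # Precomputed index: FSM state -> token ids that are legal from that state.
    # At generation time, masking a step is a dict lookup, not a vocabulary scan.
    allowed = {
        state: [tid for tid, text in vocab.items() if fsm_step(state, text) is not None]
        for state in (0, 1)
    }

    print(allowed)  # {0: [1, 2], 1: [1, 2, 4]}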


The output of the LLM is not just one token, but a statistical distribution across all possible output tokens. The tool you use to generate output samples from this distribution with various techniques, and you can put constraints on that sampling, such as penalizing repetition. Some tools let you get very specific about the allowed output format, e.g. https://github.com/ggerganov/llama.cpp/blob/master/grammars/... So even if the LLM says that an invalid token is the most likely next token, the tool will never select it for output; it will only sample from valid tokens.
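
In sketch form, the constrained sampling step looks something like this (a generic illustration of the technique, not any particular tool's implementation; the set of valid token ids would come from the grammar):

    import math, random

    def sample_constrained(logits, valid_token_ids):
        """Sample the next token, but only from ids the grammar currently allows."""
        # Softmax restricted to the valid ids; every other token gets probability 0.
        valid_logits = {tid: logits[tid] for tid in valid_token_ids}
        top = max(valid_logits.values())
        exp = {tid: math.exp(l - top) for tid, l in valid_logits.items()}
        total = sum(exp.values())

        # Even if an invalid token had the highest raw logit, it can never be picked.
        r = random.random()
        cumulative = 0.0
        for tid, e in exp.items():
            cumulative += e / total
            if r <= cumulative:
                return tid
        return tid  # guard against floating-point rounding

    # Example: token 0 has the highest logit, but the grammar only allows 1 and 3.
    logits = [5.0, 2.0, 0.5, 1.5]
    print(sample_constrained(logits, valid_token_ids=[1, 3]))  # always 1 or 3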


No, it limits which tokens the LLM can output. The output is guaranteed to follow the schema.



