Yeah, I know, hence it's odd that I found it kind of dumb for personal use. More so with the smaller models, which lost to some Mistral finetunes on an objective benchmark I have.
And I don't think I was using it wrong. I know, for instance, that the Chinese-language models are funny about sampling, since I run Yi all the time.