You're not "interacting with a language model"; you're running a program (llama.cpp) whose sampling algorithm isn't tuned for maximum factual accuracy by default.
It's like how you have to set x264 to the animation tuning or the film tuning (--tune animation vs. --tune film) depending on what you're encoding.
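
To make the "sampling algorithm" part concrete, here's a rough sketch of temperature sampling in plain Python. This is not llama.cpp's code: the function name, the toy logits, and the 0.8 default are all made up for illustration. It just shows that the same model scores give different outputs depending on a knob you control, and that turning the temperature down to zero collapses to always picking the top-scoring token (which makes output deterministic, not necessarily correct).

```python
import math
import random

def sample_token(logits, temperature=0.8, rng=random):
    """Pick a token index from raw scores (logits) using temperature sampling.

    Hypothetical helper for illustration only; the default of 0.8 is an
    arbitrary choice for this sketch.
    """
    if temperature <= 0:
        # Greedy decoding: always take the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Lower temperature sharpens the distribution, higher flattens it.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - m) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

# Made-up scores for three candidate tokens.
logits = [2.0, 1.5, 0.1]
rng = random.Random(0)

greedy = [sample_token(logits, 0.0, rng) for _ in range(5)]
sampled = [sample_token(logits, 1.5, rng) for _ in range(1000)]

print("greedy picks:", greedy)
print("distinct tokens seen at temperature 1.5:", sorted(set(sampled)))
```

Same scores, different settings, different behavior, which is the whole point of the x264 comparison.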