Oh, the repetition issue is only on the non-dynamic quants :) If you do dynamic quantization and use the 1.58-bit dynamic quantized model, the repetition issue fully disappears!
Min_p = 0.05 was a way I found to counteract the 1.58-bit model generating singular incorrect tokens, which happen at a rate of around 1 in every 8000 tokens!
I think most model creators suggest temperatures as high as 0.6-0.7 in their usage examples simply because that's what a lot of the client apps use. IMO this is WAY too high unless you're doing creative writing.
Generally I set temp to 0-0.4 at absolute most.
min_p actually needs a little temperature to work effectively, so with min_p I almost always use temp=0.2.
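For anyone curious why min_p needs some temperature: min_p keeps only tokens whose probability is at least min_p times the top token's probability, so temperature has to be applied first for the threshold to mean anything. Here's a rough sketch of the idea (not any particular library's implementation, just the math):

```python
import numpy as np

def min_p_filter(logits, min_p=0.05, temperature=0.2):
    # Temperature is applied BEFORE the min_p cutoff
    scaled = logits / temperature
    probs = np.exp(scaled - np.max(scaled))  # subtract max for numerical stability
    probs /= probs.sum()
    # Keep only tokens with prob >= min_p * (probability of the top token)
    threshold = min_p * probs.max()
    filtered = np.where(probs >= threshold, probs, 0.0)
    return filtered / filtered.sum()

# Toy example: 4-token vocab, one clearly dominant token
logits = np.array([5.0, 4.0, 1.0, -2.0])
probs = min_p_filter(logits, min_p=0.05, temperature=0.2)
# At temp=0.2 the gap between tokens is sharpened, so the rare
# low-probability tokens fall below the cutoff and get zeroed out.
```

This is why a "rare random wrong token" gets suppressed: it sits far below the top token's probability, so the relative cutoff removes it entirely instead of just making it unlikely.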
Ye, lower temp is also good :) Tbh it's all trial and error - I found temp=1.5, min_p=0.1 to be very useful for pass@k-type workloads - i.e. calling the LLM multiple times and aggregating.
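The aggregation part can be as simple as a majority vote over k samples. A minimal sketch (the `sample_fn` here is a stand-in for whatever client call you actually use, with temperature=1.5 and min_p=0.1 set in its sampling params):

```python
from collections import Counter
from itertools import cycle

def aggregate_answers(sample_fn, prompt, k=8):
    # Call the model k times and majority-vote the answers.
    # sample_fn is a hypothetical stand-in for your real LLM call.
    answers = [sample_fn(prompt) for _ in range(k)]
    best, count = Counter(answers).most_common(1)[0]
    return best, count / k  # winning answer + agreement ratio

# Deterministic toy "sampler" for illustration: 3 out of 4 calls agree
_fake = cycle(["42", "42", "42", "41"])
def fake_sampler(prompt):
    return next(_fake)

answer, agreement = aggregate_answers(fake_sampler, "What is 6*7?", k=16)
# answer == "42", agreement == 0.75
```

The high temperature gives you diverse samples so the k calls aren't all identical, while min_p stops the diversity from degenerating into gibberish tokens.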
temp=0 is also good for singular outputs. For classification tasks, it's better to actually inspect the logits.
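For the classification case, "inspecting the logits" just means skipping sampling entirely: restrict attention to the label tokens and pick the highest-probability one. A rough sketch of what I mean (token IDs here are made up for illustration):

```python
import numpy as np

def classify_from_logits(logits, label_token_ids):
    # No sampling: compare the model's next-token logits restricted
    # to the candidate label tokens, and take a softmax over just those.
    label_logits = logits[label_token_ids]
    probs = np.exp(label_logits - label_logits.max())
    probs /= probs.sum()
    return int(label_token_ids[np.argmax(probs)]), probs

# Toy 10-token vocab; pretend "yes" is token id 3 and "no" is id 7
np.random.seed(0)
logits = np.random.randn(10)
logits[3] = 5.0  # force "yes" to have the highest logit
label_ids = np.array([3, 7])
pred, label_probs = classify_from_logits(logits, label_ids)
# pred == 3, and label_probs gives you a calibrated-ish confidence
# instead of a single sampled token
```

This also gives you a confidence score for free, which sampling a single token at temp=0 never does.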
But my go-to is always a min_p of at least 0.01 or 0.05! It vastly suppresses incorrect rare random tokens from being generated, and it helps massively!