One thing I do suspect people are running into is sampling issues. Gemma probably doesn't play well with sampler defaults tuned for Llama, given its 256K vocab.
Many Chinese LLMs have a similar default-sampling issue.
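For anyone hitting this, the usual fix is to set the sampler explicitly instead of inheriting whatever the frontend ships. A minimal sketch with Hugging Face transformers; the model id and the sampling values here are illustrative placeholders, not an official recommendation:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint -- substitute whichever Gemma build you're running.
model_id = "google/gemma-2-9b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("Explain KV caching in one paragraph.", return_tensors="pt")

# Override the sampler explicitly rather than inheriting defaults that were
# tuned for a different model family / vocab size. Values are illustrative.
output = model.generate(
    **inputs,
    do_sample=True,
    temperature=1.0,
    top_k=64,
    top_p=0.95,
    max_new_tokens=256,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```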
But our testing was done at zero temperature with constrained single-token responses, so that shouldn't be an issue.
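To make that concrete, here's roughly what that kind of eval looks like (a sketch of the general technique, not our actual harness; the A/B/C/D answer set is an assumed example). Reusing `model` and `tokenizer` from the sketch above, you score only the candidate answer tokens and take the argmax, so sampler settings never enter the picture:

```python
import torch

# Reuses `model` and `tokenizer` from the sketch above.
prompt = "Question: ...\nAnswer with a single letter (A, B, C, or D): "
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    next_logits = model(**inputs).logits[0, -1]  # logits for the next token only

# Restrict to the four answer tokens and take the argmax. This is
# temperature 0 by construction: no sampler setting can change the result.
choice_ids = [tokenizer.encode(c, add_special_tokens=False)[0] for c in "ABCD"]
answer = "ABCD"[int(torch.argmax(next_logits[choice_ids]))]
print(answer)
```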