gdiamos on Feb 8, 2024 | on: How we got fine-tuning Mistral-7B to not suck

Not literally infinite, but Llama2-scale models can handle about 10 trillion tokens.