mrtksn on March 11, 2023 \| on: Llama.cpp: Port of Facebook's LLaMA model in C/C++...

You can. I was able to run 13B on my 16GB 8c8g M1 Air. Performance was 2-3 tokens/second, which felt on par with ChatGPT on a busy day.